pmf and case-sensitive matching #998

fgsch · 2018-01-22T16:13:44Z

After c1a7652, all @pmf occurrences have the lowercase transformation.

IMO this was not the right way to fix it since it implies @pmf is no longer case insensitive as the documentation states.
Furthermore, although disabled there is code in the Aho-Cora implementation to support UTF8.
If this ever gets enabled and rules start depending on it, having lowercase alone would not be enough.

We should either revert this or update the documentation (and code) to treat the input as case-sensitive.

The text was updated successfully, but these errors were encountered:

dune73 · 2018-01-25T16:50:45Z

I am not quite following you; too many negations for my brain.

If I remember by correctly, we made this adjustment to make sure we can work with case sensitive and case insensitive. All data files are supposed to be lowercase, all payloads transformed to lowercase.

Do you understand how UTF-8 plays into this? Could you explain.

Nothing wrong with updating the documentation and making this all transparent, though.

fgsch · 2018-02-19T21:46:21Z

Apologies for the delay on this.

There are a number of issues IMO with this change:

We're doing the conversion twice (one in Aho-Corasick and the other in the transformation).
The A-C code will lowercase one letter at the time; the lowercase transformation will do the whole string. This has performance implications, in particular for large inputs.
There is a mismatch between the rules and the documentation; the documentation says pmf is case insensitive but the rules add the lowercase transformation, implying otherwise.
Because of number 3, other implementations following the description of pmf will end up doing the conversion twice.
The A-C code supports UTF-8, albeit disabled by default. The lowercase transformation does not support UTF-8. An implementation depending on the lowercase transformation will not behave the same iff UTF-8 is enabled.

Hope that is better but let me know otherwise.

github-actions · 2019-11-20T00:03:48Z

This issue has been open 120 days with no activity. Remove the stale label or comment, or this will be closed in 14 days

fgsch · 2019-11-20T00:06:29Z

@dune73 ping.

dune73 · 2020-01-08T16:03:45Z

OK. I got you now.

We are based on an engine that defines @pmf as case-insensitive, yet we act as if it was sensitive in order to comfort implementations that do not follow the definition and implement a case sensitive A-C algorithm. Consequently users of compliant implementations suffer the performance penalty of an unnecessary t:lowercase transformation. And on top of it, we're likely to face UTF-8 problems with out transformation.

Correct?

If yes, it would indeed make sense to roll back the change.

It would be nice, if @p0pr0ck5 could chime in, before we perform the roll back, though.

dune73 · 2020-03-04T08:03:35Z

This was on the agenda for the community chat on March 2, 2020. But we postponed the discussion.

#1683 (comment)

github-actions bot added the Stale issue This issue has been open 120 days with no activity. label Nov 20, 2019

fgsch removed the Stale issue This issue has been open 120 days with no activity. label Nov 20, 2019

fgsch added the Meeting Agenda label Dec 4, 2019

dune73 mentioned this issue Jan 6, 2020

Monthly Chat Agenda January (2020-01-06) #1654

Closed

dune73 mentioned this issue Mar 2, 2020

Monthly Chat Agenda March (2020-03-02) #1683

Closed

This was referenced May 13, 2020

pmf and case-sensitive matching coreruleset/coreruleset#998

Closed

Monthly Chat Agenda January (2020-01-06) coreruleset/coreruleset#1654

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

pmf and case-sensitive matching #998

pmf and case-sensitive matching #998

fgsch commented Jan 22, 2018

dune73 commented Jan 25, 2018

fgsch commented Feb 19, 2018

github-actions bot commented Nov 20, 2019

fgsch commented Nov 20, 2019

dune73 commented Jan 8, 2020 •

edited

Loading

dune73 commented Mar 4, 2020

pmf and case-sensitive matching #998

pmf and case-sensitive matching #998

Comments

fgsch commented Jan 22, 2018

dune73 commented Jan 25, 2018

fgsch commented Feb 19, 2018

github-actions bot commented Nov 20, 2019

fgsch commented Nov 20, 2019

dune73 commented Jan 8, 2020 • edited Loading

dune73 commented Mar 4, 2020

dune73 commented Jan 8, 2020 •

edited

Loading