Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-122288: Improve performances of fnmatch.translate #122289

Open
wants to merge 17 commits into
base: main
Choose a base branch
from

Conversation

picnixz
Copy link
Contributor

@picnixz picnixz commented Jul 25, 2024

This is a smaller PR compared to the one for the C implementation and is probably easier to review. Below are the benchmarks reported on the issue:

+------------------------------------------+-----------------------+-----------------------+
| Benchmark                                | fnmatch-translate-ref | fnmatch-translate-py  |
+==========================================+=======================+=======================+
| abc/[!]b-ac-z9-1]/def/\*?/*/**c/?*[][!]  | 6.09 us               | 3.99 us: 1.53x faster |
+------------------------------------------+-----------------------+-----------------------+
| !abc/[!]b-ac-z9-1]/def/\*?/*/**c/?*[][!] | 6.39 us               | 4.07 us: 1.57x faster |
+------------------------------------------+-----------------------+-----------------------+
| a**?**cd**?**??k***                      | 2.24 us               | 1.51 us: 1.49x faster |
+------------------------------------------+-----------------------+-----------------------+
| a/**/b/**/c                              | 1.97 us               | 1.12 us: 1.76x faster |
+------------------------------------------+-----------------------+-----------------------+
| man/man1/bash.1                          | 3.00 us               | 1.21 us: 2.48x faster |
+------------------------------------------+-----------------------+-----------------------+
| a*b*c*d*e*f*g*h*i*j*k*l*m*n*o**          | 5.40 us               | 3.33 us: 1.62x faster |
+------------------------------------------+-----------------------+-----------------------+
| Geometric mean                           | (ref)                 | 1.71x faster          |
+------------------------------------------+-----------------------+-----------------------+

@picnixz picnixz added performance Performance or resource usage stdlib Python modules in the Lib dir labels Jul 25, 2024
@picnixz
Copy link
Contributor Author

picnixz commented Aug 14, 2024

@barneygale Friendly ping in case you forgot you were interested in this PR.

Lib/fnmatch.py Outdated Show resolved Hide resolved
@barneygale
Copy link
Contributor

Could you add some timings to the PR description please?

Copy link
Contributor

@barneygale barneygale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice improvement :)

Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/test/test_fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Show resolved Hide resolved
@dg-pb
Copy link
Contributor

dg-pb commented Aug 23, 2024

Small optimization possibility - 1% for 1 pattern. Not sure about the others. Maybe slightly more readable too.

REP1 = repeat('\\')
REP2 = repeat(r'\\')
REP3 = repeat('-')
REP4 = repeat(r'\-')
_replace = str.replace

...

stuff = map(_replace, chunks, REP1, REP2)
stuff = map(_replace, stuff, REP3, REP4)
stuff = '-'.join(stuff)

And cache size. Not sure what it should be, are any rules for such?

Apart from this, LGTM.

@picnixz
Copy link
Contributor Author

picnixz commented Aug 23, 2024

And cache size. Not sure what it should be, are any rules for such?

I kept the same cache size. For re.escape we could have a smaller cache though if we are assuming US glyphs. We could have a cache of 512 to handle latin-1 + special characters from foreign languages (next power after 256). What do you think? I put 32k but it's probably an overkill and we could probably be safe under 4096 different glyphs, even in foreign languages such as Chinese characters.

@dg-pb
Copy link
Contributor

dg-pb commented Aug 23, 2024

My thinking is as follows.

With 32K size, the max size of this application (with assumption that average string size is 20 characters) is:

In [39]: sys.getsizeof('a' * 20) * 2 * 32000 / 1024**2
Out[39]: 3.7   # MB

It is not much for standard machine, but is Python used on some micro platforms where that might be big? I have little experience with such and 4MB might not be an issue at all, but I would say it is worth looking into it.

I think the easiest would be to find other use cases of lru_cache in standard library and see what is maximum size limit for those. If those exist, then this could be answered easily without much effort.

Otherwise, if you can't find anything maybe someone else has any insights?

As a last resort, could make a conservative choice. My python starts with 5MB initial memory consumption. Take 5% of it, then, 2048 size would be that.

@picnixz
Copy link
Contributor Author

picnixz commented Aug 23, 2024

I think the easiest would be to find other use cases of lru_cache in standard library and see what is maximum size limit for those. If those exist, then this could be answered easily without much effort.

fnmatch._compile_pattern takes 32k as a cache size. It is also documented as such. For re.escape, the cache could be much smaller (see my comment on the glyphs) so I think taking 2048/4096 would be enough.

but is Python used on some micro platforms where that might be big?

For microcontrollers, there is MicroPython and only a subset of Python is actually available. 4MB can be quite big but I don't think they would implement LRU cache that size.

@dg-pb
Copy link
Contributor

dg-pb commented Aug 23, 2024

fnmatch._compile_pattern takes 32k as a cache size.

In this case, 32K seems fine to me. Just follow the standard and if there are any issues in the future, these can be handled as a pair.

@picnixz
Copy link
Contributor Author

picnixz commented Aug 27, 2024

@barneygale This one is the (only) remaining fnmatch PR that I decided to keep. I'd appreciate your opinion on the size of the cache for re.escape (I think we could live with maxsize=4096 instead of 32k).

Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
Lib/fnmatch.py Outdated Show resolved Hide resolved
The rationale for this change is as follows:

re.escape() is only used to cache single Unicode characters
in shell patterns; we may heuristically assume that they are
ISO-8859-1 encodable, thereby requiring a cache of size 256.
To allow non-traditional glyphs (or alphabets with a small
number of common glyphs), we double the cache size.
@picnixz
Copy link
Contributor Author

picnixz commented Oct 14, 2024

@barneygale friendly ping in case you forgot about this PR!

Copy link
Contributor

@barneygale barneygale left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, nice work :)

Lib/fnmatch.py Outdated Show resolved Hide resolved
Co-authored-by: Barney Gale <[email protected]>
@picnixz
Copy link
Contributor Author

picnixz commented Oct 18, 2024

I'm committing from my phone so hopefully it'll be fine. Otherwise I'll have a look at my return next week!

Lib/test/test_fnmatch.py Outdated Show resolved Hide resolved
@picnixz
Copy link
Contributor Author

picnixz commented Oct 22, 2024

@barneygale I've also updated the tests just to remember what the indices were so it should be ready to merge now (you can take a last look since I've stared waaaay to much at this PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting merge performance Performance or resource usage stdlib Python modules in the Lib dir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants