No Constraint on Version Names Can Cause Issues #12483

Open
RobertRosca opened this issue Nov 4, 2022 · 4 comments
Labels
feature request needs discussion a product management/policy issue maintainers and users should discuss

Comments

@RobertRosca

RobertRosca commented Nov 4, 2022

Describe the bug

There is no limit (or only a very high one) on the length of the version string for a package. For example, the package https://pypi.org/project/uselesscapitalquiz/ has a version that is 218 characters long.

Depending on the OS and file system, such versions can hit file name length limits, causing issues with mirroring PyPI or with installation. See pypa/bandersnatch#1200 and pypa/bandersnatch#1228.
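To illustrate the failure mode, here is a minimal sketch of a length check. It assumes a limit of 255 bytes per file name, which is the limit on many Linux file systems but is not universal; the helper names and the `example` project are hypothetical.

```python
# Assumption: 255 bytes per file name, as on many Linux file systems.
# Other systems (or encrypted/overlay file systems) may allow less.
ASSUMED_NAME_MAX_BYTES = 255


def sdist_filename(project: str, version: str) -> str:
    """Build an sdist-style file name from a project name and version."""
    return f"{project}-{version}.tar.gz"


def exceeds_name_limit(filename: str, limit: int = ASSUMED_NAME_MAX_BYTES) -> bool:
    """Return True if the UTF-8 encoded file name is longer than the limit."""
    return len(filename.encode("utf-8")) > limit


if __name__ == "__main__":
    long_version = "1." + "0" * 300  # hypothetical, deliberately oversized
    print(exceeds_name_limit(sdist_filename("example", "1.0")))
    print(exceeds_name_limit(sdist_filename("example", long_version)))
```

The check counts bytes rather than characters because file system limits are usually expressed in bytes.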

Expected behavior

There should be a limit on the length of the version name to prevent this from happening, whether by accident, as seems to be the case with uselesscapitalquiz, or on purpose to cause issues on users' systems.

To Reproduce

N/A

My Platform

N/A

Additional context

I'm happy to work on a PR limiting the length of the version name, if that's an approved solution.

@RobertRosca RobertRosca added bug 🐛 requires triaging maintainers need to do initial inspection of issue labels Nov 4, 2022
@ewdurbin
Member

ewdurbin commented Nov 4, 2022

PEP 440 does not specify a length constraint for version identifiers, which I think would need to happen before PyPI enforced such a limit. A practical limit is a good idea though.

@ewdurbin ewdurbin added needs discussion a product management/policy issue maintainers and users should discuss feature request and removed requires triaging maintainers need to do initial inspection of issue bug 🐛 labels Nov 4, 2022
@RobertRosca
Author

Yeah, I thought that might be the case. The section "Updating the version specification" says:

The versioning specification may be updated with clarifications without requiring a new PEP or a change to the metadata version.

Any technical changes that impact the version identification and comparison syntax and semantics would require an updated versioning scheme to be defined in a new PEP.

IMO this kind of limit wouldn't have any practical impact on existing projects, and there's no reasonable use case for having a version number in the hundreds of characters which would fit into the existing PEP 440 specification. So it's tempting to try and say that a character limit is a clarification rather than a change worthy of a whole new PEP.

I'll wait until next week to see if anybody else chimes in on the issue here and, if there are no objections, make an issue to 'clarify' PEP 440 and add in a length limit to the version number.

Out of curiosity I dug into this a bit with Google BigQuery. For all packages in the-psf.pypi.distribution_metadata, the summary of version string lengths (in characters) is:

count    7.902727e+06
mean     6.598360e+00
std      3.552846e+00
min      1.000000e+00
25%      5.000000e+00
50%      5.000000e+00
75%      7.000000e+00
99%      2.200000e+01
max      2.350000e+02

Out of 7,902,727 published package versions there are:

  • 337,624 over 16 characters
  • 697 over 32 characters
  • 407 over 64 characters

It's kind of surprising to me that hundreds of releases have such long versions 😕 Either way, overall 99.991% of versions are 32 characters or fewer.
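As a rough sketch of how counts like these can be derived from a list of version strings, and how the 99.991% figure follows from the numbers above (the function names and the sample versions are made up for illustration; this is not the query that was actually run):

```python
def count_longer_than(versions, thresholds=(16, 32, 64)):
    """Count how many version strings are strictly longer than each threshold."""
    lengths = [len(v) for v in versions]
    return {t: sum(1 for n in lengths if n > t) for t in thresholds}


def percent_at_or_below(total: int, over: int) -> float:
    """Percentage of items that are NOT over a threshold."""
    return 100 * (1 - over / total)


if __name__ == "__main__":
    # Hypothetical sample, not real PyPI data.
    sample = ["1.0", "2.3.1", "0.0.1.dev" + "0" * 20, "1" * 40]
    print(count_longer_than(sample))

    # Figures quoted above: 697 of 7,902,727 versions are over 32 characters.
    print(round(percent_at_or_below(7_902_727, 697), 3))
```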

There's actually a discussion about this for semver in semver/semver#304, but I don't think any limit was set in the specification. In practice there is a limit, since major/minor/patch get parsed as integers and JavaScript's max safe integer is 9007199254740991 (16 digits). Three 16-digit numbers plus two dots means the max string length is 50 characters for Node.js.
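The arithmetic behind the 50-character figure can be checked with a short sketch. It assumes a plain `major.minor.patch` version with no pre-release or build suffix, and the names are illustrative only:

```python
# JavaScript's Number.MAX_SAFE_INTEGER is 2**53 - 1.
MAX_SAFE_INTEGER = 2**53 - 1


def max_core_semver_length(max_component: int = MAX_SAFE_INTEGER) -> int:
    """Longest 'major.minor.patch' string when each part is at most max_component."""
    digits = len(str(max_component))
    return 3 * digits + 2  # three numeric components plus two separating dots


if __name__ == "__main__":
    print(MAX_SAFE_INTEGER)
    print(max_core_semver_length())
```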

@dstufft
Member

dstufft commented May 23, 2023

I think it would be reasonable for us to provide limits outside of what PEP 440 has, and in theory we technically already do since I believe you couldn't use a version number that expanded to be 100G worth of characters. It would probably be worth at least a discussion on discuss.python.org though, and digging into the releases that have longer version numbers and seeing what exactly their version numbers are and whether there is some use case we're missing.

I would also note that the same thing can happen with the project name, and also with compressed tags in a wheel filename that lead to very long filenames.

@andyhasit

andyhasit commented Oct 10, 2023

Long paths cause problems with other parts of the ecosystem, such as poetry, which caches wheels in directories with already long names like /home/andrew/.cache/pypoetry/artifacts/07/ef/d7/f4e72ab224633e85fd96dd6c096d8c35b025ecaa3c6d7728b6d271f83b/

Resulting in errors like this:

 [Errno 36] File name too long: '/home/andrew/.cache/pypoetry/artifacts/07/ef/d7/f4e72ab224633e85fd96dd6c096d8c35b025ecaa3c6d7728b6d271f83b/SQLAlchemy-1.4.49-cp310-cp310-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl'

The Poetry maintainers feel this is not exactly up to them to fix. Of course they could shorten part of the cache path, but then there are wheels with names over 200 chars long, like:

rgf_python-3.6.0-py2.py3-none-macosx_10_6_x86_64.macosx_10_7_x86_64.macosx_10_8_x86_64.macosx_10_9_x86_64.macosx_10_10_x86_64.macosx_10_11_x86_64.macosx_10_12_x86_64.macosx_10_13_x86_64.macosx_10_14_x86_64.whl

Which will break things regardless.

See python-poetry/poetry#8529
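To show where the length comes from, here is a small sketch that splits the compressed platform tag set out of a wheel file name. It relies only on the wheel naming convention, in which the platform tag is the last hyphen-separated field and compressed tags are joined with dots; the function name is hypothetical and no validation beyond the `.whl` suffix is attempted.

```python
def platform_tags(wheel_filename: str) -> list[str]:
    """Return the individual platform tags from a wheel file name."""
    suffix = ".whl"
    if not wheel_filename.endswith(suffix):
        raise ValueError(f"not a wheel file name: {wheel_filename!r}")
    stem = wheel_filename[: -len(suffix)]
    # The platform tag set is the final hyphen-separated field.
    platform = stem.rsplit("-", 1)[-1]
    # Compressed tag sets are joined with dots.
    return platform.split(".")


if __name__ == "__main__":
    # Rebuild the rgf_python file name quoted above from its nine macOS tags.
    tags = [f"macosx_10_{n}_x86_64" for n in range(6, 15)]
    name = "rgf_python-3.6.0-py2.py3-none-" + ".".join(tags) + ".whl"
    print(len(name))  # over 200 characters
    print(len(platform_tags(name)))
```

Each additional compressed tag adds roughly 20 characters, so the platform field alone accounts for most of the file name here.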
