Question: use case for PyPI XMLRPC API #214

Open
ewdurbin opened this issue Apr 13, 2023 · 8 comments

Comments

@ewdurbin

ewdurbin commented Apr 13, 2023

It was noticed that this project uses a PyPI API (XML-RPC) that has been in an indefinite state of "to be deprecated soon". The call is in:

class PyPIFilterFactory(IgnoreWordsFilterFactory):
    """Build an IgnoreWordsFilter for all of the names of packages on PyPI.
    """
    def __init__(self):
        client = xmlrpc_client.ServerProxy('https://pypi.python.org/pypi')
        super().__init__(client.list_packages())

Honestly, there's a good chance that this deprecation is drawing nearer as PyPI begins to explore a modern API for other use cases. Once we're over that hump, it is only a matter of time before XMLRPC is turned off.

So the question: "What is the current use case? Can I, as a PyPI administrator and PSF Infrastructure maintainer, help the maintainers of this project find a better option?"

@dhellmann
Member

Hi @ewdurbin, thanks for reaching out.

We could at least disable the filter by default. I'll see if I can find some time to get to that.

@coderanger

@dhellmann It is fortunately already off by default :)

@ewdurbin
Author

If you wanted to re-enable it....

Without digging too deeply it looks like you just need a list of strings?

curl -H"Accept: application/vnd.pypi.simple.v1+json" https://pypi.org/simple/ is the supported way to get a JSON representation of all known project names on PyPI.
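
For anyone who does pick up a rewrite, the same request can be sketched in Python with only the standard library. This is a rough sketch, not the project's code: the helper names `parse_project_names` and `fetch_project_names` are made up for illustration, and it assumes the response follows the Simple API JSON layout, with a top-level "projects" list whose entries each carry a "name" key.

```python
import json
import urllib.request

SIMPLE_INDEX_URL = "https://pypi.org/simple/"
JSON_ACCEPT = "application/vnd.pypi.simple.v1+json"


def parse_project_names(payload):
    """Extract project names from a Simple API JSON index document.

    Assumes the layout {"projects": [{"name": ...}, ...]}.
    """
    data = json.loads(payload)
    return [project["name"] for project in data["projects"]]


def fetch_project_names(url=SIMPLE_INDEX_URL, timeout=30):
    """Download the index once and return every project name (hypothetical helper)."""
    request = urllib.request.Request(url, headers={"Accept": JSON_ACCEPT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_project_names(response.read().decode("utf-8"))
```

The resulting list of strings could be passed wherever `client.list_packages()` is used today. Caching the result between builds would probably be wise, since the index is large.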

@dhellmann
Member

> @dhellmann It is fortunately already off by default :)

Clearly it has been a while since I've looked at this code base. :-)

@dhellmann
Member

> If you wanted to re-enable it....
>
> Without digging too deeply it looks like you just need a list of strings?
>
> curl -H"Accept: application/vnd.pypi.simple.v1+json" https://pypi.org/simple/ is the supported way to get a JSON representation of all known project names on PyPI.

Excellent, thank you for the tip, that'll help someone who wants to rewrite it (future me, or someone else who needs it).

@freakboy3742

As I indicated in #213, even if this can be rewritten in terms of a curl call, I'd suggest it's a really bad idea to do so. There are 440k+ packages on PyPI, and every single one of them becomes a legal spelling word when this feature is turned on.

"Cute intentional misspellings of dictionary words" is a very common pattern of package naming - e.g., dropping the final vowel of flicker for flickr.

It won't pick up words that might be a violation of a project's style guide: namespace, phablet, or passthrough. This includes unilaterally accepting the US or UK spelling of any word that has a package by that name.

It won't pick up off-by-one typos: pypu instead of pypa or pypi.

I'm sure an audit of the 440k package names on PyPI would reveal plenty of other "interesting" spellings.

And none of this takes into account the load that is put on the PyPI servers by downloading a 440k word list every time it rebuilds the spelling environment.

I'd strongly advocate for this entire feature being deprecated and removed.

If it is going to be retained, it should be reduced to the packages in the currently installed virtual environment, rather than the whole of PyPI (although I'd argue explicit inclusion in a dictionary is a much safer option).
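
To make the narrower option concrete, here is a minimal sketch of collecting only the names installed in the current environment, using the standard library's `importlib.metadata`. The function name `installed_package_names` is hypothetical, and nothing here is wired into the project's filter classes.

```python
from importlib.metadata import distributions


def installed_package_names():
    """Return the sorted, de-duplicated names of installed distributions."""
    names = set()
    for dist in distributions():
        meta = dist.metadata
        # Skip distributions whose metadata is missing or has no Name field.
        name = meta.get("Name") if meta is not None else None
        if name:
            names.add(name)
    return sorted(names)
```

That yields a word list of tens or hundreds of names instead of 440k, with no network request at all.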

@dhellmann
Member

@freakboy3742 Thank you for the advice. I don't intend to do anything at all with this for now, as I tried to make clear in an earlier comment here.
