Discussion: what to do with the "import name -> package name" mapping from conda-forge #92
Comments
@ocefpaf too!
Good questions. The bot itself also has code that produces a "ranked hub authorities file" that is related to this. I don't understand that relationship. It might be good to flesh that out a bit maybe?
ok so how does depfinder use these files? A. given Does this help @beckermr ?
Helps a bit but files in libcfgraph are not made by the bot. So I think depfinder uses two services.
oh. weird. ok. i guess cf-scripts only writes to cf-graph-countyfair? in that case, it seems like the pypi_name_mapping github action produces import_name_priority_mapping.json, which I could use instead. the above file is a data structure that looks like this:
so what writes to libcfgraph then? oh there's a circleci action that updates libcfgraph i guess? what does libcfgraph do?
Right there is a circleci action that writes to libcfgraph. libcfgraph collects info about every package into a single repo. It is used by a bunch of conda-forge stuff including the mamba solver for run exports and our scanning service to try and detect harmful files in packages.
IDK if the import name priority mapping is complete or only covers nodes that are ambiguous. Also note that grayskull uses some of this data too. :/
As usual, the answer is to fix libcfgraph and just keep the status quo. We don't have the resources to pay down debt, but we can service it.
@ericdill New import to pkg maps are appearing here: https://github.com/regro/libcfgraph/tree/master/import_to_pkg_maps These only have the package name and not the full artifact. They should be a lot smaller.
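To illustrate why maps keyed to package names come out smaller than maps keyed to full artifacts, here is a minimal sketch. The artifact path layout (`<package>/<channel>/<subdir>/<file>.json`) and the `foo` / `foo-base` names are assumptions made up for illustration, not the actual libcfgraph schema.

```python
from collections import defaultdict


def collapse_to_package_names(artifact_map):
    """Reduce {import_name: [artifact_path, ...]} to {import_name: [package, ...]}."""
    collapsed = defaultdict(set)
    for import_name, artifacts in artifact_map.items():
        for artifact in artifacts:
            # Assumed layout: "<package>/<channel>/<subdir>/<file>.json",
            # so the package name is the first path component.
            collapsed[import_name].add(artifact.split("/", 1)[0])
    # Sort for stable, diff-friendly output when the map is written to git.
    return {name: sorted(pkgs) for name, pkgs in collapsed.items()}


# Hypothetical artifact-level map: one entry per build of each package.
artifact_map = {
    "foo": [
        "foo/conda-forge/linux-64/foo-1.0.0-py38_0.json",
        "foo/conda-forge/osx-64/foo-1.0.0-py38_0.json",
        "foo-base/conda-forge/linux-64/foo-base-1.0.0-py38_0.json",
    ],
}

package_map = collapse_to_package_names(artifact_map)
print(package_map)
```

Every build of a package collapses to a single name, so the per-import list grows with the number of packages rather than the number of artifacts.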
@ericdill I'm a bit late for this discussion, but my opinion is the same as before we added this to depfinder. It is a nice feature to have, but I'd rather have it as a plugin/optional/separate module, etc. than inside depfinder itself, in order to reduce the maintenance burden here.
Agreed. We should ship a package of simple apis for pulling this metadata.
This has the nice side effect that if the data is moved to another device we can easily move everything over.
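A rough sketch of what such a thin API could look like, and why it makes relocating the data cheap. Everything here is hypothetical: the `ImportMapClient` class, the `packages_for_import` method, the one-JSON-file-per-import layout, and the example URLs are assumptions for illustration, not an existing package.

```python
import json
from typing import Callable, List
from urllib.request import urlopen

# Hypothetical default location; the real data layout is not given in this thread.
DEFAULT_BASE_URL = "https://example.invalid/import_to_pkg_maps"


def _fetch_over_http(url: str) -> str:
    """Default fetcher: download the document at `url` as text."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


class ImportMapClient:
    """Callers depend on this API only, never on where the data lives."""

    def __init__(
        self,
        base_url: str = DEFAULT_BASE_URL,
        fetch: Callable[[str], str] = _fetch_over_http,
    ) -> None:
        self.base_url = base_url.rstrip("/")
        self._fetch = fetch

    def packages_for_import(self, import_name: str) -> List[str]:
        # Assumed layout: one JSON list of package names per import name.
        url = f"{self.base_url}/{import_name}.json"
        return sorted(json.loads(self._fetch(url)))


# Moving the data means changing one argument; an in-memory fetcher stands in
# for the network here so the example runs offline.
fake_store = {"https://mirror.example/maps/foo.json": '["foo-base", "foo"]'}
client = ImportMapClient("https://mirror.example/maps", fetch=fake_store.__getitem__)
print(client.packages_for_import("foo"))
```

Because the location and the fetch function are both injected, downstream tools such as depfinder would not need code changes if the data moved.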
oh that's a nice idea. would we make that new package part of the regro org?
thanks @ocefpaf . i had forgotten the previous discussion. glad you recall!
Sure. That's the best spot. Something like
So grayskull doesn't pull from the bot data for these maps anymore. It maintains its own list of differences. They may have come from the bot at one time, but now it is separated.
The data used by depfinder is now wrapped into this package: https://github.com/regro/conda-forge-metadata. Here is how to use it
The reduced-size mapping is now erroring out too: regro/libcfgraph#14 :D
Hi Team,
depfinder has some code in reports.py that does a pretty good job of mapping from "importable module" to "most likely package that has that module". It turns out that the code that enables this behavior in depfinder relies on a part of the bot that has been disabled for a little over a year. That part of the bot generates the files in libcfgraph/import_maps. The import map generation was disabled because it was generating JSON files that were over 100MB in size. So that brings us to my question: what should we do about this?
Bringing this functionality back into the bot is not something I'm particularly keen to solve right now. It seems like a problem that's very well suited to "use a database for this", but since CF doesn't have access to databases, we're left having to do this with files and git.
If we don't have anyone interested in bringing this functionality back into the bot then my vote would be to disable this feature. We can reconsider bringing it back once the conda-forge bot is providing updated information.
What do you think @beckermr @CJ-Wright @mariusvniekerk?
What are the downsides to disabling this in depfinder?
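For context on the feature in question, the mapping from "importable module" to "most likely package that has that module" can be sketched roughly as below. This is not depfinder's actual code from reports.py: the function name, the tie-breaking rule, and the sample data are assumptions, and the ranked list merely stands in for something like the bot's ranked hub authorities file.

```python
from typing import Dict, List, Optional, Set


def most_likely_package(
    import_name: str,
    import_to_pkgs: Dict[str, Set[str]],
    ranked_packages: List[str],
) -> Optional[str]:
    """Pick the best-ranked package that provides `import_name`, if any."""
    candidates = import_to_pkgs.get(import_name)
    if not candidates:
        return None
    # Lower index means higher rank; packages missing from the ranking sort last.
    position = {pkg: i for i, pkg in enumerate(ranked_packages)}
    unranked = len(position)
    # Assumed tie-break: alphabetical order among equally ranked candidates.
    return min(candidates, key=lambda pkg: (position.get(pkg, unranked), pkg))


# Hypothetical data: three packages ship an importable module named "foo".
import_to_pkgs = {"foo": {"foo-nightly", "foo", "foo-base"}}
ranked_packages = ["foo", "foo-base"]

print(most_likely_package("foo", import_to_pkgs, ranked_packages))
```

Disabling the feature would mean losing this disambiguation step, so an import provided by several packages could no longer be resolved to a single likely candidate.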