Replace GitPython with pygit2 #2120
Replace the use of GitPython package with pygit2. The latter seems to have better git support, in particular it supports the newer index versions 3 and 4. Since it is backed by the libgit2 library that is also used by Cargo, it seems to have the best chances of being updated for compatibility with new git versions.

Admittedly, the API feels very low-level. In particular, it is necessary to explicitly request writing changes back to the index, and to explicitly reread the index when it's modified externally (e.g. via another `pygit2.Repository` instance, as in tests). On the plus side, it does not invoke `git` at all -- everything is done by the library.

Fixes conda-forge#2116
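The explicit write and reread described above could look roughly like the following. This is only a minimal sketch, assuming pygit2 is installed; the function name `stage_and_reread` and the `README.md` file are illustrative and are not taken from this PR.

```python
import os


def stage_and_reread(workdir):
    """Stage a file via one Repository object and observe it via another.

    Illustrative only: the function name and the README.md file are assumptions.
    """
    import pygit2  # third-party; imported lazily so the sketch loads without it

    pygit2.init_repository(workdir)
    writer = pygit2.Repository(workdir)
    reader = pygit2.Repository(workdir)

    writer_index = writer.index
    reader_index = reader.index  # loaded before the change below happens

    with open(os.path.join(workdir, "README.md"), "w") as fh:
        fh.write("hello\n")

    writer_index.add("README.md")  # staged in memory only
    writer_index.write()           # explicitly persist the change to .git/index

    reader_index.read()            # explicitly reread after the external change
    return [entry.path for entry in reader_index]
```

Calling `stage_and_reread()` on an empty temporary directory should return the staged paths as seen by the second `Repository` object after the reread.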
Remove the `search_parent_directories` kwarg that's never been used, and instead always enable searching parent directories for better cross-version pygit2 compatibility.
This is an API change and is not allowed under semantic versioning. Please put back this functionality.
This reverts commit a21d135.
Restored, and instead added backwards compatibility for |
@beckermr, another use is in
In [10]: list(conda_smithy.feedstocks.feedstocks_repos(None, "/home/mgorny/git/conda"))
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
Cell In[10], line 1
----> 1 list(conda_smithy.feedstocks.feedstocks_repos(None, "/home/mgorny/git/conda"))
File ~/git/conda-smithy/conda_smithy/feedstocks.py:197, in feedstocks_repos(organization, feedstocks_directory, pull_up_to_date, randomise, regexp)
195 for feedstock in feedstocks:
196 repo = git.Repo(feedstock.directory)
--> 197 upstream = repo.remotes.upstream
199 if pull_up_to_date:
200 print("Fetching ", feedstock.package)
File ~/miniforge3/envs/conda-smithy/lib/python3.12/site-packages/git/util.py:1198, in IterableList.__getattr__(self, attr)
1196 return item
1197 # END for each item
-> 1198 return list.__getattribute__(self, attr)
AttributeError: 'IterableList' object has no attribute 'upstream'
FWICS |
We need to migrate and fit. |
Ah, sorry — I was wrong, it wasn't broken. I've just realized it expected a remote called |
Another question: do we need SSH support for
Of course, another option is to call |
Anything that is currently supported in the code needs to be supported in this PR. We cannot have API changes like this. |
Is it okay to call |
Subprocesses are fine, but we should be mindful of how complex this might get. The point of this PR is to support new git index versions. IDK anything about those. Are these in use? How have we not hit bugs for this before now? |
Subprocesses will probably be less complex than doing everything via libgit2. Another option is to stick to GitPython for some of the code, at least until it actually breaks for someone. From what I understand, to hit this issue you need to use a new enough git to clone the repository. It is also possible that the repository itself must have some characteristics that actually trigger the use of the new index format. Maybe I was just unlucky that the first feedstock I cloned triggered this, or it is possible that more people will hit this as they upgrade their git to new versions and clone new feedstocks. |
I'd rather have one tool used and defer to subprocesses. Let's use a subprocess. |
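For illustration, deferring to a subprocess could be as small as the sketch below. Which git command this thread refers to is not preserved here, so `fetch` is only an assumed example; the helper names `build_git_command` and `run_git` are hypothetical and not part of conda-smithy.

```python
import subprocess


def build_git_command(*args, repo_dir=None):
    """Build the argv list for a git invocation (hypothetical helper)."""
    cmd = ["git"]
    if repo_dir is not None:
        # `git -C <path>` runs the command as if started in <path>.
        cmd += ["-C", str(repo_dir)]
    cmd += list(args)
    return cmd


def run_git(*args, repo_dir=None):
    """Run git and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        build_git_command(*args, repo_dir=repo_dir),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With this shape, something like `run_git("fetch", "upstream", repo_dir=feedstock_dir)` would leave SSH and credential handling to the installed `git` rather than to the library.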
I'm really sorry about that. Fixed now. Tests pass for me now with |
Needs a news item.
Added. |
Co-authored-by: Matthew R. Becker <[email protected]>
So this PR changes some pretty low-level functionality. The test suite uses mocks which is great, but we don't have tests of functioning code for all of the changes. We need to test these changes live for both staged-recipes and feedstock token handling before we can merge them.
I am happy to do the testing of these changes live, but it will be a bit before I can get to it. |
Not sure if you're asking me to do anything here. If you're asking whether I've tested them locally, I did test calling every function with some args suitable for local/semi-remote testing. |
Yep, nothing for you to do here. Sorry for blocking this one. Testing much of the code in smithy is hard since it runs against external services. :/ |
Yeah, I know. If I only had more time, I'd have added some more tests. |
Anything I can do here to help @beckermr? |
Right. We need to patch staged recipes to test out this branch in making a few new feedstocks. If those are working ok, then it seems ok to merge. |
I guess the patch goes here, right? Usual |
Yep. I'd put the |
Looks like it's working: |
This reverts commit 7de2dbb.
Thanks a lot! |
So far focused on `feedstock_io.py` and its tests. I need to figure out how to test the changes to other files properly, given that the tests mock the entire `git.Repo.clone_from` call.
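To illustrate why a fully mocked clone call makes such changes hard to verify, here is a small sketch using only `unittest.mock`. The `Repo` class and `clone_feedstock` function are hypothetical stand-ins, not the real GitPython class or conda-smithy code.

```python
from unittest import mock


class Repo:
    """Hypothetical stand-in for a git backend class."""

    @classmethod
    def clone_from(cls, url, to_path):
        raise RuntimeError("a real clone would touch the network")


def clone_feedstock(url, to_path):
    """Hypothetical caller that delegates to the backend."""
    return Repo.clone_from(url, to_path)


# The whole clone call is replaced, so the backend code never runs.
with mock.patch.object(Repo, "clone_from", return_value="fake-repo") as fake_clone:
    result = clone_feedstock("https://example.invalid/foo-feedstock.git", "/tmp/foo")

fake_clone.assert_called_once_with(
    "https://example.invalid/foo-feedstock.git", "/tmp/foo"
)
```

The assertion only checks the arguments passed to the mock, so a test like this would pass whatever the backend does with them, which is why swapping the backend needs some other form of testing.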