Git Restore Fails with "did not match any file(s) known to git" #1812

Open
hannahchiodo-msft opened this issue Apr 25, 2024 · 20 comments

@hannahchiodo-msft

Running git restore for a particular file appears not to be compatible with GVFS. Repro below:

$commit = git log -1 --pretty=%H
git ls-tree -r --name-only $commit some/file.cpp
some/file.cpp
git restore --source=$commit some/file.cpp
error: pathspec 'some/file.cpp' did not match any file(s) known to git
@derrickstolee
Contributor

Could you confirm if this is only an issue with git restore by checking if the same thing happens with this command?

git checkout $commit -- some/file.cpp

@hannahchiodo-msft
Author

hannahchiodo-msft commented Apr 26, 2024

Yes, the same error occurs when using git checkout $commit -- some/file.cpp - so it looks like the issue is not specifically with the restore command.

@derrickstolee
Contributor

There was a fix in https://github.com/microsoft/git/releases/tag/v2.45.2.vfs.0.0 that may be helpful to this case. Please check to see if your scenario is fixed when installing that version of Git.

@tiagomacarios
Member

2.45.2.vfs.0.0 does not help. Also, it looks like the issue only happens when the file has not been hydrated yet.

Z:\Office\src>git ls-tree -r --name-only 5a3397e220a26803de068db325e4ace38b63432a word/stringutils/number.cpp
word/stringutils/number.cpp
 
Z:\Office\src>git restore --source=5a3397e220a26803de068db325e4ace38b63432a word/stringutils/number.cpp
error: pathspec 'word/stringutils/number.cpp' did not match any file(s) known to git
 
Z:\Office\src>git --version
git version 2.45.2.vfs.0.0
 
Z:\Office\src>del word\stringutils\number.cpp
 
Z:\Office\src>git restore --source=5a3397e220a26803de068db325e4ace38b63432a word/stringutils/number.cpp
 
Z:\Office\src>

@jeffhostetler

I was not able to duplicate the problem on my Mac with 2.45.2.vfs.0.0. I currently don't have a GVFS-enabled machine, so I can't really test the hydration theory. I do wonder if there is some skip-worktree-bit or sparse-index interaction.

@jeffhostetler

I was able to duplicate the pathspec error using a sparse-checkout when the pathname was in the non-populated (sparse) portion of the worktree.

The skip-worktree-bit test in the following sets up the caller to throw the error message when the containing loop is finished:

* thread #1, queue = 'com.apple.main-thread', stop reason = step over
    frame #0: 0x00000001000294b4 git`mark_ce_for_checkout_no_overlay(ce=0x0000000158004108, ps_matched="", opts=0x000000016fdfea18) at checkout.c:370:3
   367 	{
   368 		ce->ce_flags &= ~CE_MATCHED;
   369 		if (!opts->ignore_skipworktree && ce_skip_worktree(ce))
-> 370 			return;
   371 		if (ce_path_match(&the_index, ce, &opts->pathspec, ps_matched)) {
   372 			ce->ce_flags |= CE_MATCHED;
   373 			if (opts->source_tree && !(ce->ce_flags & CE_UPDATE))
Target 0: (git) stopped.
(lldb) bt all
* thread #1, queue = 'com.apple.main-thread', stop reason = step over
  * frame #0: 0x00000001000294b4 git`mark_ce_for_checkout_no_overlay(ce=0x0000000158004108, ps_matched="", opts=0x000000016fdfea18) at checkout.c:370:3
    frame #1: 0x0000000100028630 git`checkout_paths(opts=0x000000016fdfea18, new_branch_info=0x000000016fdfe978) at checkout.c:591:4
    frame #2: 0x0000000100026e9c git`checkout_main(argc=1, argv=0x000000016fdff548, prefix=0x0000000000000000, opts=0x000000016fdfea18, options=0x000000014f009200, usagestr=0x0000000100462418) at checkout.c:1938:9
    frame #3: 0x00000001000273f4 git`cmd_restore(argc=3, argv=0x000000016fdff548, prefix=0x0000000000000000) at checkout.c:2062:9
    frame #4: 0x00000001000034b0 git`run_builtin(p=0x00000001004789e0, argc=3, argv=0x000000016fdff548) at git.c:541:23
    frame #5: 0x0000000100001998 git`handle_builtin(argc=3, argv=0x000000016fdff548) at git.c:799:3
    frame #6: 0x0000000100002dbc git`run_argv(argcp=0x000000016fdff28c, argv=0x000000016fdff280) at git.c:868:4
    frame #7: 0x00000001000016dc git`cmd_main(argc=3, argv=0x000000016fdff548) at git.c:1008:19
    frame #8: 0x000000010012e530 git`main(argc=4, argv=0x000000016fdff540) at common-main.c:62:11
    frame #9: 0x000000019bf1a0e0 dyld`start + 2360
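
The guard at checkout.c:369 in the listing above can be modeled in a few lines of shell. This is only a toy sketch of the decision, not Git's code: `entry_is_considered` is a hypothetical name, its two arguments stand in for `ce->ce_flags` and `opts->ignore_skipworktree`, and the skip-worktree bit is assumed to be 0x40000000.

```shell
#!/bin/sh
# Toy model of the early return in mark_ce_for_checkout_no_overlay().
# Assumption: the skip-worktree bit in ce_flags is 0x40000000.
CE_SKIP_WORKTREE=$((0x40000000))

# entry_is_considered FLAGS_HEX IGNORE_SKIPWORKTREE
#   FLAGS_HEX            index entry flags as hex digits, without a 0x prefix
#   IGNORE_SKIPWORKTREE  1 models opts->ignore_skipworktree being set, 0 otherwise
# Returns 0 if the entry would go on to pathspec matching, 1 if the guard
# returns early, in which case the pathspec is never tested against this entry.
entry_is_considered() {
    flags=$((0x$1))
    if [ "$2" -eq 0 ] && [ $((flags & CE_SKIP_WORKTREE)) -ne 0 ]; then
        return 1
    fi
    return 0
}
```

Under these assumptions, `entry_is_considered 40004000 0` fails and `entry_is_considered 40004000 1` succeeds. When the only entry a pathspec could match is skipped this way, nothing is recorded as matched, which is consistent with the "did not match any file(s)" error.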

You should extend the sparse-checkout to include the pathname before trying to checkout/restore the content to the version from a specific commit:

% ls -l                                          
total 16
-rw-r--r--  1 jeff  staff  8 Jun 12 10:28 a
-rw-r--r--  1 jeff  staff  8 Jun 12 10:28 c

% ../../git status                               
On branch main
You are in a sparse checkout with 67% of tracked files present.

nothing to commit, working tree clean

% ../../git restore --source=HEAD b              
error: pathspec 'b' did not match any file(s) known to git

% ../../git sparse-checkout add /b               

% ls -la
total 24
drwxr-xr-x     6 jeff  staff    192 Jun 12 10:28 .
drwxr-xr-x  1154 jeff  staff  36928 Jun 12 10:28 ..
drwxr-xr-x    10 jeff  staff    320 Jun 12 10:28 .git
-rw-r--r--     1 jeff  staff      8 Jun 12 10:28 a
-rw-r--r--     1 jeff  staff      8 Jun 12 10:28 b
-rw-r--r--     1 jeff  staff      8 Jun 12 10:28 c

% ../../git restore --source=HEAD b

%

Please let us know if this does not address the problem that you are seeing. I only tested a simple non-cone example here and used a pathspec naming a single file. Your sparse-checkout usage may be more complex if you are using cone-mode or other tooling to control the populated region of the worktree, so you may need to adapt the suggestion here accordingly.

@derrickstolee
Contributor

@jeffhostetler: this is a VFS for Git repo, so the user has no control over the sparse-checkout definition or the skip-worktree bits.

@jeffhostetler

In Office?? I thought they were using Scalar. Anyway, doesn't GVFS set the skip-worktree bit for non-hydrated files? And won't that same piece of code cause the same error? Does the mount daemon clear the skip bit (and delete the CE for it) if it detects a delete on a non-hydrated file?

@jeffhostetler

When you have it in a state where the pathspec error appears, could you run git ls-files -v --debug <filename> and see what it prints?
For example, do you see flags: 40004000?

% ../../git ls-files -v --debug b
S b
  ctime: 0:0
  mtime: 0:0
  dev: 0	ino: 0
  uid: 0	gid: 0
  size: 0	flags: 40004000

Then try the del <filename> trick again (give it a second or two) and see if the ls-files output magically changes.

Then try the git restore and see if the ls-files changes again.

Thanks!
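
To compare the before and after states without reading the whole dump, the flags word can be pulled out of the output with a small filter. A minimal sketch, assuming the output keeps the layout shown in the example above; `extract_ce_flags` is a hypothetical helper name, not a Git command.

```shell
#!/bin/sh
# Read `git ls-files --debug` output on stdin and print only the hex value
# that follows "flags:". Lines without a "flags:" field are ignored.
extract_ce_flags() {
    sed -n 's/.*flags:[[:space:]]*\([0-9a-fA-F]\{1,\}\).*/\1/p'
}
```

For example, `git ls-files -v --debug -- b | extract_ce_flags` would print just the flags word for `b`, which makes it easy to log the value before the delete, after the delete, and after the restore.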

@derrickstolee
Contributor

> In Office?? I thought they were using Scalar.

Some Azure Pipelines workflows use GVFS clones to do Git operations in an automated scenario. I intend to learn more about what kinds of workflows are doing this so we can see if they can be converted to Scalar clones instead. But that's a much bigger dig than this bug.

@tiagomacarios
Member

Z:\Office\src>git ls-files -v --debug b
S b
  ctime: 0:0
  mtime: 0:0
  dev: 0        ino: 0
  uid: 0        gid: 0
  size: 0       flags: 40104000

Z:\Office\src>del b

Z:\Office\src>git ls-files -v --debug b
H b
  ctime: 0:0
  mtime: 0:0
  dev: 0        ino: 0
  uid: 0        gid: 0
  size: 0       flags: 104000

Z:\Office\src>git restore b

Z:\Office\src>git ls-files -v --debug b
H b
  ctime: 1718214178:108675400
  mtime: 1718214178:194855200
  dev: 0        ino: 0
  uid: 0        gid: 0
  size: 775     flags: 100000

@jeffhostetler

Nice! So the mount daemon magically detects the delete of the non-hydrated file and changes the bit flags, then the now-working restore updates the flags again. In both of those later states the skip-worktree bit (0x40000000) is not set, so the pathspec matching in the function I printed above is allowed to proceed.
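
The flag words in the ls-files output above can be decoded bit by bit. A minimal sketch: only the skip-worktree bit (0x40000000) is confirmed in this thread; the names attached to 0x100000 (CE_HASHED) and 0x4000 (CE_EXTENDED) are assumptions based on Git's index-entry flag definitions and should be checked against the Git source in use. `decode_ce_flags` is a hypothetical helper.

```shell
#!/bin/sh
# Decode the hex "flags:" word printed by `git ls-files --debug`.
# Assumed bit values; only CE_SKIP_WORKTREE is confirmed by the discussion.
CE_EXTENDED=$((0x4000))
CE_HASHED=$((0x100000))
CE_SKIP_WORKTREE=$((0x40000000))

# decode_ce_flags FLAGS_HEX
# Prints the names of the known bits that are set, separated by spaces.
# Bits outside the three listed above are silently ignored.
decode_ce_flags() {
    flags=$((0x$1))
    names=""
    if [ $((flags & CE_SKIP_WORKTREE)) -ne 0 ]; then names="$names skip-worktree"; fi
    if [ $((flags & CE_HASHED)) -ne 0 ]; then names="$names hashed"; fi
    if [ $((flags & CE_EXTENDED)) -ne 0 ]; then names="$names extended"; fi
    echo "${names# }"
}
```

Under these assumptions, 40104000 decodes to "skip-worktree hashed extended" and 104000 to "hashed extended", so the only difference the delete made to the flags is that the skip-worktree bit was cleared.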

So I think that the problem (or at least, what is causing this error) is that the skip-worktree shortcut is too aggressive. I don't know enough about that section of code yet to say whether it is wrong or is properly guarding the rest of the function (er, that is, "works by design" or "the design is wrong").

Thanks.

@jeffhostetler

So as a workaround, you might consider (but don't tell anyone that I suggested this):

% git update-index --no-skip-worktree b
% git restore b

That is, you're going to overwrite the file (whether it exists or not), so clear the skip-worktree bit (which will make Git think that it has been deleted) and then restore it using the SHA from the desired commit.

@tiagomacarios
Member

Z:\Office\src>git update-index --no-skip-worktree b
fatal: modifying the skip worktree bit is not supported on a GVFS repo

@jeffhostetler

Ack! I knew that was too easy....

I don't have a GVFS Windows machine, so I can't suggest anything further at this point.

Could you use a workaround of "del b ; git restore b" in your scripts for now?
That is, until we have a chance to think about a real fix.

@tiagomacarios
Member

Yes. I checked in that workaround last week (after I discovered it), but now we are seeing other failures. It looks like if I try to check out the file (post deletion) in a folder that contains only that file, it fails with:

warning: unable to unlink '<folder>': Directory not empty
fatal: cannot create directory at '<folder>': Directory not empty

Something like:

git rm folder/b
git checkout <ref> -- folder/b
warning: unable to unlink '<folder>': Directory not empty
fatal: cannot create directory at '<folder>': Directory not empty

@jeffhostetler

Humph.

I don't have a GVFS Windows machine, so I can't test the GVFS mount daemon magic, but (conceptually) it is maintaining the .git/index in sync with the virtual projection (and magically turning off the skip-worktree bit when the file is hydrated).
So you could try to force a hydration by simply opening the file for writing -- the kernel will pause your process and let
the mount daemon replace the magic reparse point with a real file (and most importantly) turn off the skip bit for the file
before it allows your process to have the file descriptor.

So, if deleting is too heavy a hammer, just try something like:

% cat < /dev/null >b
% git restore b

Again, the goal is just to force the hydration -- trick the daemon into thinking that you want to write to the file (like your
editor would).

FWIW, a simple read-only descriptor might not be sufficient, because there were plans to let the daemon project read-only
files directly from the object cache rather than actually writing a copy (kinda like a hard-link-with-copy-on-write on steroids).

Anyway, try running the debug ls-files before and after the cat </dev/null >b and see if the skip bits change.

@jeffhostetler

You might try >> rather than > to get the append semantics rather than truncate to force the daemon to not cut any corners...
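
For scripts, the append variant could be wrapped as below. This is a sketch of the suggestion only: nobody in this thread reports trying it, so whether the open actually makes the mount daemon clear the skip-worktree bit is an open assumption. `hydrate_then_restore` is a hypothetical name, and the function assumes a POSIX shell (for example Git Bash on Windows).

```shell
#!/bin/sh
# hydrate_then_restore PATH
# Opens PATH for append without writing anything (the intent is to make the
# GVFS mount hydrate the file), then asks Git to restore it.
# Assumption: the append-mode open is enough to trigger hydration.
hydrate_then_restore() {
    path=$1
    # ">>" opens for append and does not truncate; /dev/null supplies no data.
    cat </dev/null >>"$path" || return 1
    git restore -- "$path"
}
```

Note that on a path that does not exist at all, the append-mode open creates an empty file before the restore runs, so this is only meant for paths that the projection already shows.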

derrickstolee added a commit to derrickstolee/git that referenced this issue Jun 19, 2024
As documented in microsoft/VFSForGit#1812, attempting to restore a file
fails when using either of these commands in a VFS for Git repo:

  git restore --source=<commit> <path>
  git checkout <commit> -- <path>

To discover the issue, I debugged such a call and found that since
opt->ignore_skipworktree is not set, that the restore will not update
the index when the file is not hydrated.

I verified that this works as expected, including that the file on-disk
is projecting the new index version.

Signed-off-by: Derrick Stolee <[email protected]>
@derrickstolee
Contributor

This should be fixed by microsoft/git#658 when it is merged and released.

derrickstolee added a commit to microsoft/git that referenced this issue Jun 19, 2024
As documented in microsoft/VFSForGit#1812, attempting to restore a file
fails when using either of these commands in a VFS for Git repo:

```
  git restore --source=<commit> <path>
  git checkout <commit> -- <path>
```

To discover the issue, I debugged such a call and found that since
`opt->ignore_skipworktree` is not set, that the restore will not update
the index when the file is not hydrated.

I verified that this works as expected, including that the file on-disk
is projecting the new index version.

----

* [X] This change only applies to the virtualization hook and VFS for
Git.
@derrickstolee
Contributor

Hello, all! It turns out the change in microsoft/git#658 was problematic for users who specify a broad pathspec, such as in "git restore ." (Yes, users could use git reset --hard instead.)

But this also presents a workaround for this case where only one file (or a small subset of files) is given by the pathspec: use git restore --ignore-skip-worktree --source=<commit> -- <path>. Upon reflection, this is a particularly tricky area so it is probably not worth attempting something more involved than this workaround.
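
In an automated workflow, the workaround could be wrapped in a small function. A minimal sketch, with `restore_paths_from` as a hypothetical name. It spells the option out in full as `--ignore-skip-worktree-bits`, the name documented for git restore; the shorter spelling above relies on Git accepting unambiguous prefixes of long options.

```shell
#!/bin/sh
# restore_paths_from COMMIT PATH...
# Restores the given paths from COMMIT even when their index entries have the
# skip-worktree bit set. Requires at least one path so that a broad restore
# is never run by accident.
restore_paths_from() {
    if [ $# -lt 2 ]; then
        echo "usage: restore_paths_from <commit> <path>..." >&2
        return 2
    fi
    commit=$1
    shift
    git restore --ignore-skip-worktree-bits --source="$commit" -- "$@"
}
```

A call such as `restore_paths_from "$commit" some/file.cpp` keeps the pathspec narrow, which matters here because a broad pathspec is exactly the case that caused overhydration.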

dscho added a commit to microsoft/git that referenced this issue Jul 12, 2024
…store` (#674)

This reverts #658 and essentially reopens microsoft/VFSForGit#1812.
Resolves #673.

The issue is that by overriding the `ignore_skip_worktree` value
(normally set by `--ignore-skip-worktree`) we then force Git to hydrate
the entire working directory in a `git restore .`. This leads to some
interesting trade-offs:

1. Users on v2.45.2.vfs.0.2 (where `git restore .` overhydrates) can use
`git reset --hard` as a better representation of this behavior.
2. Users on any Git version can use `git restore --ignore-skip-worktree
-- <pathspec>` when using a smaller-scale pathspec that corresponds to
files they care about and probably have hydrated.

I will include these workarounds in the VFS for Git issue.

To get to a real solution here, we would somehow need to turn the
`ignore_skip_worktree` bit into two bits:

a. A bit to say "update index entries that have `SKIP_WORKTREE` set".
b. A bit to say "update the working directory even if `SKIP_WORKTREE` is
set".

And perhaps this "working directory" version is what was intended when
adding the `--ignore-skip-worktree` option, but it's more involved and
has upstream implications to make changes here.
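
The two-bit split described above can be illustrated with a toy model. Everything here is hypothetical: neither bit exists in Git, and the function name, argument order, and output format are invented only to show how the two decisions would become independent.

```shell
#!/bin/sh
# plan_entry SKIP UPDATE_SKIPPED_INDEX UPDATE_SKIPPED_WORKTREE   (each 0 or 1)
#   SKIP                      1 if the index entry has SKIP_WORKTREE set
#   UPDATE_SKIPPED_INDEX      hypothetical bit (a): update such index entries
#   UPDATE_SKIPPED_WORKTREE   hypothetical bit (b): update the working directory
#                             even if SKIP_WORKTREE is set
# Prints which of the index and the working directory would be updated.
plan_entry() {
    skip=$1
    update_skipped_index=$2
    update_skipped_worktree=$3
    index=no
    worktree=no
    # Entries without SKIP_WORKTREE are updated in both places, as today.
    if [ "$skip" -eq 0 ] || [ "$update_skipped_index" -eq 1 ]; then
        index=yes
    fi
    if [ "$skip" -eq 0 ] || [ "$update_skipped_worktree" -eq 1 ]; then
        worktree=yes
    fi
    echo "index=$index worktree=$worktree"
}
```

In this model `plan_entry 1 1 0` prints "index=yes worktree=no": the index entry of a non-hydrated file changes without the file being written, which is the combination a single `ignore_skip_worktree` value cannot express.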