git-annex: there is no available git remote named "amazon" #18

Closed
jcohenadad opened this issue Aug 27, 2020 · 20 comments

Comments

@jcohenadad
Member

jcohenadad commented Aug 27, 2020

@kousu i'm trying to push my changes with git annex, following the recommendations here, but I am getting the following error:

Full details
julien-macbook:~/code/spine-generic/data-multi-subject $ git checkout -b jca/17-defacing
Switched to a new branch 'jca/17-defacing'
julien-macbook:~/code/spine-generic/data-multi-subject $ git remote
amazon
origin
julien-macbook:~/code/spine-generic/data-multi-subject $ 
julien-macbook:~/code/spine-generic/data-multi-subject $ gs
 M sub-cmrra02/anat/sub-cmrra02_T2w.nii.gz
 M sub-cmrra04/anat/sub-cmrra04_T2w.nii.gz
 M sub-cmrra05/anat/sub-cmrra05_T2w.nii.gz
 M sub-oxfordFmrib07/anat/sub-oxfordFmrib07_T1w.nii.gz
 M sub-stanford06/anat/sub-stanford06_T1w.nii.gz
julien-macbook:~/code/spine-generic/data-multi-subject $ gc
(recording state in git...)
[jca/17-defacing 4ea269a7] Fixed defacing in a few subjects
 5 files changed, 5 insertions(+), 5 deletions(-)
julien-macbook:~/code/spine-generic/data-multi-subject $ git annex sync --content amazon
git-annex: there is no available git remote named "amazon"
julien-macbook:~/code/spine-generic/data-multi-subject $ git remote
amazon
origin
julien-macbook:~/code/spine-generic/data-multi-subject $ 
@jcohenadad
Member Author

However, when running the command without "--content amazon", it seems to work:

julien-macbook:~/code/spine-generic/data-multi-subject $ git annex sync
commit 
On branch jca/17-defacing
nothing to commit, working tree clean
ok
pull origin 
ok
push origin 
Enumerating objects: 59, done.
Counting objects: 100% (59/59), done.
Delta compression using up to 16 threads
Compressing objects: 100% (37/37), done.
Writing objects: 100% (42/42), 4.15 KiB | 2.08 MiB/s, done.
Total 42 (delta 11), reused 0 (delta 0)
remote: Resolving deltas: 100% (11/11), completed with 7 local objects.
To https://github.com/spine-generic/data-multi-subject.git
   bd59527b..7c13697d  git-annex -> synced/git-annex
 * [new branch]        jca/17-defacing -> synced/jca/17-defacing
ok

@jcohenadad
Member Author

Note (to be added to the doc), the bottom one needs to be selected:

image

@Drulex

Drulex commented Aug 31, 2020

However, when running the command without "--content amazon", it seems to work:

This does not upload the files to the remote. You need to run git annex sync --content to upload the actual files. sync only synchronizes the git repositories, not the data. (ref https://git-annex.branchable.com/sync/)
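
A minimal sketch of the difference, assuming a POSIX shell and the git-annex 8.x behaviour discussed in this thread. `sync_with_content` is a hypothetical wrapper name, not something git-annex ships:

```shell
# Hypothetical wrapper (not part of git-annex): sync the git branches AND
# transfer annexed file content. Arguments such as a remote name are passed
# through unchanged.
sync_with_content() {
    git annex sync --content "$@"
}

# Usage, inside the repository:
#   git annex sync             # git branches only; file content is not transferred
#   sync_with_content          # also transfers file content
#   sync_with_content amazon   # same, but only with the remote "amazon"
```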

@Drulex

Drulex commented Aug 31, 2020

Try without specifying the remote amazon maybe? I'm guessing it has to do with the version of git annex. What version are you running?

@jcohenadad
Member Author

8.2

i did try without specifying amazon, but it didn’t upload either

@Drulex

Drulex commented Aug 31, 2020

output of git annex list?

$  git annex list
here
|origin
||amazon
|||web
||||bittorrent
|||||
__X__ derivatives/labels/sub-amu02/anat/sub-amu02_T1w_RPI_r_labels-manual.nii.gz
__X__ derivatives/labels/sub-amu04/anat/sub-amu04_T1w_RPI_r_labels-manual.nii.gz
__X__ derivatives/labels/sub-amu05/anat/sub-amu05_T1w_RPI_r_labels-manual.nii.gz
__X__ derivatives/labels/sub-balgrist01/anat/sub-balgrist01_acq-T1w_MTS_seg-manual.nii.gz
__X__ derivatives/labels/sub-balgrist02/anat/sub-balgrist02_acq-T1w_MTS_seg-manual.nii.gz
__X__ derivatives/labels/sub-barcelona05/anat/sub-barcelona05_T2w_csfseg-manual.nii.gz
__X__ derivatives/labels/sub-beijingGE03/anat/sub-beijingGE03_T1w_RPI_r_seg-manual.nii.gz
__X__ derivatives/labels/sub-beijingGE04/anat/sub-beijingGE04_T1w_RPI_r_seg-manual.nii.gz
__X__ derivatives/labels/sub-beijingPrisma01/anat/sub-beijingPrisma01_acq-T1w_MTS_seg-manual.nii.gz
__X__ derivatives/labels/sub-beijingVerio02/anat/sub-beijingVerio02_T2w_RPI_r_seg-manual.nii.gz
__X__ derivatives/labels/sub-beijingVerio02/dwi/sub-beijingVerio02_dwi_concat_moco_dwi_mean_seg-manual.nii.gz
[...]

@Drulex

Drulex commented Aug 31, 2020

Maybe also try to enable it (it should be enabled by default, but who knows).

$  git annex enableremote amazon
enableremote amazon ok
(recording state in git...)

@jcohenadad
Member Author

Maybe also try to enable it (it should be enabled by default, but who knows).

$  git annex enableremote amazon
enableremote amazon ok
(recording state in git...)

that's been my nightmare command for the past 7 days (#33)

git annex enableremote amazon
enableremote amazon 
git-annex: Unknown remote type S3 (pick from: git gcrypt p2p bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs hook external)
failed
git-annex: enableremote: 1 failed

@jcohenadad
Member Author

and git-annex list:

git-annex list
here
|origin
||web
|||bittorrent
||||
X___ derivatives/labels/sub-amu01/anat/sub-amu01_T1w_labels-disc-manual.nii.gz
____ derivatives/labels/sub-amu01/anat/sub-amu01_T2w_labels-disc-manual.nii.gz
X___ derivatives/labels/sub-amu02/anat/sub-amu02_T1w_RPI_r_labels-manual.nii.gz
____ derivatives/labels/sub-amu02/anat/sub-amu02_T1w_labels-disc-manual.nii.gz
____ derivatives/labels/sub-amu02/anat/sub-amu02_T2w_labels-disc-manual.nii.gz
____ derivatives/labels/sub-amu03/anat/sub-amu03_T1w_labels-disc-manual.nii.gz
____ derivatives/labels/sub-amu03/anat/sub-amu03_T2w_labels-disc-manual.nii.gz
X___ derivatives/labels/sub-amu04/anat/sub-amu04_T1w_RPI_r_labels-manual.nii.gz
____ derivatives/labels/sub-amu04/anat/sub-amu04_T1w_labels-disc-manual.nii.gz
____ derivatives/labels/sub-amu04/anat/sub-amu04_T2w_labels-disc-manual.nii.gz
X___ derivatives/labels/sub-amu05/anat/sub-amu05_T1w_RPI_r_labels-manual.nii.gz
____ derivatives/labels/sub-amu05/anat/sub-amu05_T1w_labels-disc-manual.nii.gz
…
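
Unlike the listing posted earlier in the thread, the header of this one has no `amazon` column. A rough way to check for that automatically is sketched below. It assumes the header layout seen in this thread (one name per line, prefixed by `|` characters, ended by a line made only of `|`); `annex_list_remotes` is a hypothetical helper name:

```shell
# Hypothetical helper: read `git annex list` output on stdin and print the
# repository names from its header, one per line. Stops at the first line
# that consists only of "|" characters, which ends the header.
annex_list_remotes() {
    awk '/^[|]+$/ { exit } { sub(/^[|]+/, ""); print }'
}

# Usage, inside the repository:
#   git annex list | annex_list_remotes | grep -qx amazon \
#       || echo 'git annex list does not show a remote named "amazon"'
```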

@jcohenadad
Member Author

i'm beginning to think that there might be something wrong with my installed version of git-annex:

git-annex version
git-annex version: 8.20200810
build flags: Assistant Webapp Pairing WebDAV FsEvents TorrentParser MagicMime Feeds Testsuite
dependency versions: bloomfilter-2.0.1.0 cryptonite-0.27 DAV-1.3.4 feed-1.3.0.1 ghc-8.6.5 http-client-0.7.1 persistent-sqlite-2.10.6.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.1.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs hook external
operating system: darwin x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
local repository version: 8

@Drulex what does your output say at the line "remote types"? does yours include "amazon"?

@Drulex

Drulex commented Aug 31, 2020

Sounds like a git annex/build problem. I stumbled upon this https://git-annex.branchable.com/forum/Unknown_remote_type_S3/ (although very old, but maybe still relevant?).

$  git annex version
git-annex version: 8.20200810-g3400b0188
build flags: Assistant Webapp Pairing S3 WebDAV Inotify DBus DesktopNotify TorrentParser MagicMime Feeds Testsuite
dependency versions: aws-0.22 bloomfilter-2.0.1.0 cryptonite-0.27 DAV-1.3.4 feed-1.3.0.1 ghc-8.10.1 http-client-0.7.1 persistent-sqlite-2.10.6.2 torrent-10000.1.1 uuid-1.3.13 yesod-1.6.1.0
key/value backends: SHA256E SHA256 SHA512E SHA512 SHA224E SHA224 SHA384E SHA384 SHA3_256E SHA3_256 SHA3_512E SHA3_512 SHA3_224E SHA3_224 SHA3_384E SHA3_384 SKEIN256E SKEIN256 SKEIN512E SKEIN512 BLAKE2B256E BLAKE2B256 BLAKE2B512E BLAKE2B512 BLAKE2B160E BLAKE2B160 BLAKE2B224E BLAKE2B224 BLAKE2B384E BLAKE2B384 BLAKE2BP512E BLAKE2BP512 BLAKE2S256E BLAKE2S256 BLAKE2S160E BLAKE2S160 BLAKE2S224E BLAKE2S224 BLAKE2SP256E BLAKE2SP256 BLAKE2SP224E BLAKE2SP224 SHA1E SHA1 MD5E MD5 WORM URL X*
remote types: git gcrypt p2p S3 bup directory rsync web bittorrent webdav adb tahoe glacier ddar git-lfs hook external
operating system: linux x86_64
supported repository versions: 8
upgrade supported from repository versions: 0 1 2 3 4 5 6 7
local repository version: 8

@Drulex what does your output say at the line "remote types"? does yours include "amazon"?

It does.

@jcohenadad
Member Author

wow, this is crazy. Does it mean that the "brew" version of git-annex 8 for macOS does not include the same features as the linux version you are using? did you install it via conda? If so, and if this is indeed the problem, that's annoying because conda version of git-annex 8 does not cover macOS (AFAIK)

@Drulex

Drulex commented Aug 31, 2020

I'm starting to like the built-in LFS feature of Github a lot more..

@Drulex

Drulex commented Aug 31, 2020

Does it mean that the "brew" version of git-annex 8 for macOS does not include the same features as the linux version you are using

The program can be built with different options, S3 remote support being one of them. If you use the pre-built binary from your package manager, you need to check which options it was built with (evidently this one is missing S3).

If you are brave enough you can build from source https://git-annex.branchable.com/install/fromsource/, but I wouldn't consider this acceptable for a user who just wants to download the files.

I guess I am lucky that my package manager (Arch based Linux) included the build with S3 support.
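
One way to check a build for S3 support is to look at the "remote types:" line of `git annex version`, as in the two outputs pasted above. A minimal sketch, assuming a POSIX shell with awk; `annex_has_remote_type` is a hypothetical helper name:

```shell
# Hypothetical helper: read `git annex version` output on stdin and succeed
# only if the "remote types:" line lists the given type as a whole word.
annex_has_remote_type() {
    awk -v want="$1" '
        /^remote types:/ {
            # Fields 1 and 2 are "remote" and "types:"; the types start at 3.
            for (i = 3; i <= NF; i++) if ($i == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    '
}

# Usage:
#   git annex version | annex_has_remote_type S3 \
#       || echo "this git-annex build has no S3 support"
```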

@jcohenadad
Member Author

but I wouldn't consider this acceptable for a user who just wants to download the files.

100% agreed

@kousu
Contributor

kousu commented Sep 1, 2020

😢

wow, this is crazy. Does it mean that the "brew" version of git-annex 8 for macOS does not include the same features as the linux version you are using? did you install it via conda? If so, and if this is indeed the problem, that's annoying because conda version of git-annex 8 does not cover macOS (AFAIK)

git annex claims to be future-proofed (https://git-annex.branchable.com/future_proofing/) but that only seems to be true if you are a single person using multiple systems that are all running the same OS. It's not designed with pull requests, branches, or collaboration in mind.

datalad is attempting to corral all this, adding helpful auto-enable flags and settings, but IMO they are just adding another layer to the problem. Instead of vendoring git-annex they just say it's up to the user to get git-annex installed, so all of these incompatibility issues will transfer there, and you need to worry about using compatible versions of datalad, too.

I'm starting to like the built-in LFS feature of Github a lot more..

Yeah, this is kind of what I've been saying from the start. The design of Git-LFS is intrinsically simpler and more stable.

But I think we should keep on with git-annex for now because it is what other neuroimaging teams are using. And just so we can get some real world experience trying to share these datasets.

I think it would be useful to document all the problems we run into with it. I have some notes started but they're mostly just lots of question marks and confused log samples; a wiki where we had the issues organized would be helpful to raise with the larger neuroimaging community.


Another solution we could reconsider is running our own git server. I think git-annex is probably a lot more stable using ssh remotes rather than any of the "special" remotes since that's its core use-case. The downside is that it's more expensive (~5x when we priced it out) and there's no way to put a CDN in front of it (so we can't reduce the cost even more). Or, we could run our own git server with our own LFS server aside.


By the way, I found https://github.com/meltingice/git-lfs-s3, which lets you use S3 via the LFS API, sidestepping the annex API. So then we'd sidestep Github's expensive data transfer pricing but still be able to use Github. I think. We'd have to set it up. But it itself might be unmaintained and hard to install reliably, so maybe it's no better, and that won't be compatible with datalad. 🤷.

@Drulex

Drulex commented Sep 1, 2020

Another solution we could reconsider is running our own git server. I think git-annex is probably a lot more stable using ssh remotes rather than any of the "special" remotes since that's its core use-case.

SSH remote is great for contributors, but what would be the preferred configuration to have public reads and private writes for a filesystem remote on a VPS? I can't find any info online.

Can we serve a read-only repo via git or https protocol and read/write via ssh?

(looked back at the spreadsheet and we can get a vps at OVH with unlimited traffic and 660GB storage for 75CAD/mo)

@kousu
Contributor

kousu commented Sep 1, 2020

No. We can't.

Another solution we could reconsider is running our own git server. I think git-annex is probably a lot more stable using ssh remotes rather than any of the "special" remotes since that's its core use-case.

SSH remote is great for contributors, but what would be the preferred configuration to have public reads and private writes for a filesystem remote on a VPS? I can't find any info online.

Can we serve a read-only repo via git or https protocol and read/write via ssh?

Oh duh, of course. Yes, that's a major stumbling block.

I thought a little bit about this. The problem is that git-annex's ssh remote is implemented as an ssh command (much like rsync), so it only works over git+ssh://. It doesn't have a git+https:// implementation. I don't know enough about git to know how hard that would be to write.

Another way would be to set up a public user, the way anonymous ftp:// or anonymous xmpp work. gitolite has an ALL group that covers anyone not otherwise specified; we would have to fiddle with /etc/sudoers and maybe /etc/ssh/sshd_config to get this to work but I think it's doable; I don't know if an equivalent is possible with gitea. It would be unusual but it worked for many, many years for ftp:// and can work again :)

@kousu
Contributor

kousu commented Sep 1, 2020

I have a long-term blue-sky solution: fork git itself and add the hook in there. I am thinking it should be implemented as git config core.lazy; the way it would work is, when enabled:

git fetch

downloads all the trees, branches and maybe tags but skips objects. Then

git checkout

would download the objects, but only those objects needed for that checkout.

Since git clone ~= git fetch && git checkout, git clone effectively becomes git clone --depth 1 by default. But unlike using the --depth flag, you can at any time switch to another branch and have everything work properly.

Add a git fetch --complete or --all-objects for systems where you do want a full copy.

i.e. my proposal is to take the algorithm git-annex and git-lfs both glue awkwardly onto the side of git and push it down into git itself. Throw away the headache of extra servers and the unnecessary features like git-annex assistant and git-annex webapp and git-annex unlock and encryption.

We can resurrect the other features, like cost-effective global distribution, by porting the git annex special remote code to become git special remotes. It's perfectly possible to do git clone s3://data-multi-subject.neuropoly.ca-central-1.s3.amazonaws.com, or any other protocol:// we care about. I have a bookmark somewhere showing how we could do that.
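
For comparison, stock git already has something close to this idea: a partial clone made with `--filter=blob:none` fetches commits and trees up front and downloads file contents (blobs) on demand, for example at checkout. A minimal sketch, assuming a git client and a server that both support partial clone; `lazy_clone` is a hypothetical wrapper name and the URL is a placeholder:

```shell
# Hypothetical wrapper: make a "blobless" partial clone. Commits and trees
# are fetched immediately; blobs are fetched later, when git needs them.
lazy_clone() {
    git clone --filter=blob:none "$@"
}

# Usage (placeholder URL):
#   lazy_clone https://example.org/some/repository.git
#   cd repository && git checkout some-branch   # missing blobs are fetched here
```

This does not cover the special-remote part of the proposal; it only shows the lazy object download.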

@kousu
Contributor

kousu commented Sep 5, 2020

This is Homebrew/homebrew-core#60505. I've patched it. Just waiting for homebrew to accept the patch.

@kousu kousu closed this as completed Sep 5, 2020