Propose users to provide their own data availability resources to be registered within DataLad dandisets #274

Open
yarikoptic opened this issue Oct 3, 2022 · 0 comments

In principle this could be done via ContentUrls at the level of dandischema, and thus centralized at the archive level.
But the idea first came up in the context of datalad, and it might be easier to "implement" via datalad, which has hooks into various other data portals. It was inspired by questions about data longevity we received today from ncsu.edu.

We could allow people to report availability of data in the form of

  • a dictionary/list of sha256 checksum: URL pairs
  • a URL to any "git remote" compatible git/git-annex remote (e.g. on gin, or on any git-annex special remote via git datalad remote)

Then we could easily register those within the datalad dandisets we have. In principle, some of that information could even be reflected in metadata records for assets (for direct URLs) or dandisets (for full remotes, where individual URLs might be tricky/impossible).
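
To make the first reporting form more concrete, here is a rough sketch of how a submitted checksum-to-URL mapping might be sanity-checked before anything gets registered. The report layout (a plain dict of sha256 hex digest to URL) and the helper names `validate_report` and `content_matches` are assumptions for illustration only; nothing like this exists in dandischema or datalad as far as this issue states.

```python
import hashlib
import re
from urllib.parse import urlparse

# A sha256 digest is 64 lowercase hex characters.
SHA256_RE = re.compile(r"^[0-9a-f]{64}$")


def validate_report(report):
    """Split a hypothetical {sha256: url} report into valid entries and errors."""
    valid, errors = {}, []
    for digest, url in report.items():
        digest = digest.lower()
        if not SHA256_RE.match(digest):
            errors.append(f"not a sha256 hex digest: {digest!r}")
            continue
        parsed = urlparse(url)
        # Assumption: only direct http(s) URLs are accepted in this form.
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            errors.append(f"unsupported URL for {digest}: {url!r}")
            continue
        valid[digest] = url
    return valid, errors


def content_matches(digest, data):
    """Check that downloaded bytes actually hash to the reported checksum."""
    return hashlib.sha256(data).hexdigest() == digest.lower()
```

Entries that pass both checks would be candidates for registration as additional URLs for the matching files. How that registration is done (and how the checksum maps to the annex key) is left open here.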

  • not quite "IPFS" but not intended to be ;)
  • might be trickier/too cumbersome for collections of zarr dandisets but still doable

WDYT @satra ?
