Associate model with data version #11

jcohenadad · 2021-04-23T16:28:00Z

context

once #9 is merged, the created model should refer to the git-annex data and mention the version.

how to do it?

git commit? tag? any other idea how we can do this @kousu?

once we have a plan, @alexfoias can you pls implement it, thx

kousu · 2021-04-25T07:19:39Z

I think you can just paste https://github.com/spine-generic/data-multi-subject/tree/r20201130 in a comment or README somewhere.

Unfortunately I don't know a good universal URL scheme for git repos with versions pinned. git accepts git clone https://whatever.com/repo.git, git clone git+ssh://whatever.com/repo, git clone git@whatever.com:repo.git, git clone git+https://whatever.com/repo, but to specify a version you need to use -b and you can only give branches or tags.

To pin a more specific version, you have to use submodules, which is what datalad recommends. But my 3am flippant summary is that submodules are like everything confusing about Git's UI multiplied by 7. And anyway they're kind of an awkward fit here.

Python extended the URL formats to git+https://whatever.com/repo.git@version, where version can be a branch, tag, or arbitrary commit ID (and it looks like Unity copied them too), but that only works under pip.
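For illustration only, here is a rough sketch of how a pip-style pinned URL could be honoured outside pip: split off the part after the last @ in the path, then clone and check out that revision. The helper names split_pinned_url and clone_commands are made up for this example and are not part of git or pip; the function only builds the commands and does not run them.

```python
from urllib.parse import urlsplit, urlunsplit


def split_pinned_url(url):
    """Split 'git+https://host/repo.git@rev' into ('https://host/repo.git', 'rev').

    Hypothetical helper: mimics the pip-style convention, not an official API.
    """
    if url.startswith("git+"):
        url = url[len("git+"):]
    parts = urlsplit(url)
    # Only look for '@' in the path, so 'user@host' in the netloc is left alone.
    path, sep, rev = parts.path.rpartition("@")
    if not sep:
        path, rev = parts.path, None  # no pinned revision in the URL
    repo = urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))
    return repo, rev


def clone_commands(url, dest):
    """Return the git commands that would clone `url` into `dest` at its pinned revision."""
    repo, rev = split_pinned_url(url)
    commands = [["git", "clone", repo, dest]]
    if rev is not None:
        # A plain checkout accepts a branch, a tag, or a commit ID.
        commands.append(["git", "-C", dest, "checkout", rev])
    return commands
```

Each command in the returned list could be passed to subprocess.run(cmd, check=True) in order.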

What about this: write a training script whose first step is git clone -b $PINNED_VERSION git+ssh://data.neuro.polymtl.ca/datasets/model_seg_exvivo_gm-wm_t2_unet2d-multichannel-softseg, and include that script as part of the model. If/when you update the model, first update the training script, and commit that change, before running it, and committing its results. It won't be reproducible by anyone outside the lab but at least it will be, you know, written down. I'd also suggest writing the training script to call script or tee to keep a log of the most recent model training, and committing that file along with it. This is a pretty similar workflow to what I did over in https://github.com/neuropoly/spinalcordtoolbox/blob/5c117ac349eef90528ee7be0edf42c21e31645f2/dev/docs/testimonials2rst come to think of it, except that script isn't smart enough to get its own source material.
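A minimal sketch of what such a training script might look like, written in Python rather than shell. The PINNED_VERSION value, the data directory, and the train.log filename are placeholders assumed for this example, and the actual training step is left out; run_logged stands in for tee by echoing output while appending it to the log.

```python
import subprocess
import sys

# URL taken from the comment above; the tag below is a placeholder, not a real release.
DATA_URL = "git+ssh://data.neuro.polymtl.ca/datasets/model_seg_exvivo_gm-wm_t2_unet2d-multichannel-softseg"
PINNED_VERSION = "r00000000"  # hypothetical: update and commit this before retraining
LOG_PATH = "train.log"        # hypothetical log file, committed along with the model


def clone_command(url, version, dest):
    """Build the pinned clone command; -b accepts a branch or a tag."""
    return ["git", "clone", "-b", version, url, dest]


def run_logged(cmd, log_path):
    """Run cmd, echoing its combined stdout/stderr and appending it to log_path."""
    with open(log_path, "a") as log:
        log.write("$ " + " ".join(cmd) + "\n")
        proc = subprocess.Popen(
            cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
        )
        for line in proc.stdout:
            sys.stdout.write(line)
            log.write(line)
        return proc.wait()


def main():
    """Fetch the pinned data, record the exact commit, then train (not shown)."""
    status = run_logged(clone_command(DATA_URL, PINNED_VERSION, "data"), LOG_PATH)
    if status != 0:
        sys.exit(status)
    # Write the resolved commit ID into the log so the data version is on record.
    run_logged(["git", "-C", "data", "rev-parse", "HEAD"], LOG_PATH)
    # The training step itself would go here.
```

Calling main() at the end of the script would run the whole thing; the resulting train.log can then be committed next to the model.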

jcohenadad mentioned this issue Jan 8, 2024

Identify where the dataset is #12

Open
