Check compatibility of license of each entry with the original dataset license #9

tvercaut · 2018-10-14T14:08:46Z

As per #1, CC-BY is chosen as the default licence for the model zoo entries. However, this might not be compatible with the licence of the training dataset that was used to compute the weights.

OASIS for example has a permissive CC-BY licence (https://www.oasis-brains.org/#access) but has additional citation requirements which are currently not quite met in https://github.com/NifTK/NiftyNetModelZoo/tree/5-reorganising-with-lfs/OASIS

We need to check each entry individually.

What does the BRATS license say?
The VISCERAL paper mentions a "license agreement that assured the use of the data in its given environment and for its research purpose". We currently do not mention a non-commercial restriction.
And so on for the other entries.

wyli · 2018-10-16T09:11:13Z

For OASIS, there's an additional license file included in the .tar.gz.
For BRATS, it's a few volumes extracted from the original set. I have contacted Spyros, and he agreed that we host these volumes with a citation to the original papers.
I'll double check the other downloadables...

tvercaut · 2018-10-16T10:33:02Z

Thanks. Note that it's not only about the data but also about the pre-trained weights, as these might be considered a derived work. I'm not 100% sure about it, but it would be worth looking into.

Re OASIS, for clarity, we could copy (or point to) the OASIS licence in a README file (in line with the discussion in #6).

fepegar · 2018-10-16T10:38:49Z

@tvercaut, do you have any reference that explains what licenses are needed for machine learning models?

tvercaut · 2018-10-16T17:13:48Z

That is a complex question, and in many cases the answer might depend on the licences under which the training data was released. You will need someone with an actual law background to help navigate these questions, I am afraid.

Even when the training data consists of photographs from, say, ImageNet, Flickr, etc., there are copyright questions. Whether pre-trained weights from there fall under "fair use" (not convinced, but see e.g. https://fairuse.stanford.edu/overview/fair-use/what-is-fair-use/), or whether they fall under "databases/fact compilations" (never really looked into these), or whether I am just fantasising (very plausible, but I don't think this has been tested in court yet) is a great question. You will find many reddit and similar discussions on the topic.

In short, we won't have a clear-cut answer unless the licence in the original dataset helps us out...

fepegar · 2018-10-16T17:15:55Z

Thanks, Tom! I'll take a look.
