Right now, if you upload a duplicate file, the file is modified on S3 (e.g. its "last modified" timestamp changes). We need to ask some important data-management questions:
- Are we simply "touching" the file, or are we rewriting it?
- What counts as a duplicate on S3? Presumably just the filename, or are we protected by the hash?
- Can we use some temporary data store in S3 that gets cleaned regularly for protection?
- Should we save datasets according to their hash, then rename on download?
We should make sure that a dataset cannot be overwritten if someone uploads a different dataset with the same name. A sketch of one hash-based approach follows below.
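For discussion, here is a minimal sketch of the "store by hash, rename on download" idea with boto3. The bucket name, key prefix, and metadata field are hypothetical placeholders, not anything we have today; the point is just that identical content maps to the same key (so re-uploads are no-ops) and a different dataset with the same filename can never clobber an existing object.

```python
# Sketch of content-addressed dataset storage on S3.
# Assumptions: boto3 is configured; "my-dataset-bucket", the "datasets/" prefix,
# and the "original-filename" metadata field are placeholders for this issue.
import hashlib

import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")
BUCKET = "my-dataset-bucket"  # hypothetical bucket name


def sha256_of(path: str) -> str:
    """Hash the file contents so identical uploads map to the same key."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_dataset(path: str, filename: str) -> str:
    """Upload under a hash-derived key; skip the write if the object already exists."""
    key = f"datasets/{sha256_of(path)}"
    try:
        s3.head_object(Bucket=BUCKET, Key=key)
        return key  # identical content already stored; nothing is rewritten or touched
    except ClientError as err:
        if err.response["Error"]["Code"] != "404":
            raise
    # Keep the user-facing name as metadata so we can rename on download.
    s3.upload_file(
        path, BUCKET, key,
        ExtraArgs={"Metadata": {"original-filename": filename}},
    )
    return key


def download_dataset(key: str, dest_dir: str = ".") -> str:
    """Download a dataset and restore its original filename from metadata."""
    head = s3.head_object(Bucket=BUCKET, Key=key)
    filename = head["Metadata"].get("original-filename", key.rsplit("/", 1)[-1])
    local_path = f"{dest_dir}/{filename}"
    s3.download_file(BUCKET, key, local_path)
    return local_path
```

With this layout, two different datasets uploaded under the same filename get different keys (their hashes differ), so neither can overwrite the other; the filename only matters at download time.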
We now have a new "versioning" scheme, wherein every dataset has its own unique version that you can download in order to go back in time and see earlier states. This is definitely more advanced usage and is related to this issue, but further thought is going to be required. As such, I'm moving this issue back into the backlog.
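For reference, if that scheme ends up building on S3's native object versioning rather than something custom (an assumption on my part, not a description of what we have), going back in time could look roughly like this:

```python
# Minimal sketch of reading back earlier versions of a dataset object,
# assuming the bucket has S3 versioning enabled. Bucket/key names are hypothetical.
import boto3

s3 = boto3.client("s3")
BUCKET = "my-dataset-bucket"  # hypothetical bucket name


def list_versions(key: str):
    """Return (version_id, last_modified) pairs for one object, newest first."""
    resp = s3.list_object_versions(Bucket=BUCKET, Prefix=key)
    return [
        (v["VersionId"], v["LastModified"])
        for v in resp.get("Versions", [])
        if v["Key"] == key
    ]


def download_version(key: str, version_id: str, dest: str) -> None:
    """Fetch one specific historical version of the object."""
    s3.download_file(BUCKET, key, dest, ExtraArgs={"VersionId": version_id})
```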