Dataset Overwrite/Versioning System #7

Open
bbengfort opened this issue Oct 27, 2015 · 2 comments

Comments

@bbengfort
Member

Right now, if you upload a duplicate file, the file is modified on S3 (e.g. its "last modified" timestamp changes). We need to answer some important questions for data management:

  1. Are we simply "touching" the file or are we rewriting it?
  2. What counts as a duplicate on S3? Presumably just the filename, or are we protected by the hash?
  3. Can we use some temporary data store in S3 that gets cleaned regularly for protection?
  4. Should we save datasets according to their hash, then rename on download?

We should make sure that a dataset cannot be overwritten if someone uploads a different dataset with the same name.
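
To make question 4 concrete, here is a minimal sketch of hash-based storage, assuming SHA-256 as the content hash. The `DatasetStore` class, the `datasets/` key prefix, and the in-memory dictionaries are hypothetical stand-ins for a bucket, not the project's actual code, and nothing here calls S3.

```python
import hashlib


def content_key(data: bytes) -> str:
    """Return a storage key derived from the content, not the filename."""
    return "datasets/" + hashlib.sha256(data).hexdigest()


class DatasetStore:
    """In-memory stand-in for a bucket (hypothetical; makes no S3 calls)."""

    def __init__(self):
        self.blobs = {}  # content key -> bytes
        self.names = {}  # dataset name -> list of content keys, oldest first

    def upload(self, name: str, data: bytes) -> str:
        key = content_key(data)
        # Identical content maps to the same key, so a re-upload changes nothing.
        self.blobs.setdefault(key, data)
        history = self.names.setdefault(name, [])
        # Record a new entry only when the content under this name changed.
        if not history or history[-1] != key:
            history.append(key)
        return key

    def download(self, name: str, version: int = -1) -> bytes:
        # The caller asks by name; the hash-based key stays an internal detail.
        return self.blobs[self.names[name][version]]
```

Under this scheme, a different dataset uploaded under an existing name gets a new key instead of replacing the old object, and the name is only reattached on download.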

@bbengfort
Member Author

We now have a new "versioning" scheme, wherein every dataset has its own unique version that you can download, so you can go back in time to see earlier versions. This is a more advanced usage and is related to this issue, but it is going to require further thought. As such, I'm moving this issue back into the backlog.

@bbengfort bbengfort removed this from the Version 0.2 milestone Jan 27, 2016
@bbengfort bbengfort removed the ready label Jan 27, 2016
@bbengfort bbengfort added this to the Version 0.3 milestone Jul 9, 2016
@bbengfort bbengfort changed the title from "Dataset Overwrite" to "Dataset Overwrite/Versioning System" Jul 9, 2016
@bbengfort bbengfort modified the milestones: Version 0.4, Version 0.3 Jul 9, 2016
@rebeccabilbro
Member

This will be resolved by #59
