How does f11d deal with data version control? #618
-
Background
When dealing with small data, using git for CSV files is just fine. However, if the dataset begins to grow, or is updated too frequently, the overhead quickly starts to add up. There have been many open source solutions for data version control, such as:
Question
Should the f11d tool set integrate with any, or all, of those? Is it possible to have a data workflow that does not only continuous data integration but also version control? Should we care?
-
@augusto-herrmann a great question 👏 😄
I've spent quite a bit of time recently thinking about data "versioning" and version control and am actively working on it, see e.g. http://tech.datopian.com/versioning/
To start with, could you share a bit more about the underlying user/job stories for this? E.g. are you concerned about efficient storage, efficient syncing, diffing, patching, etc.? You may also want to look at the features list I've already got at http://tech.datopian.com/versioning/#features