This repository has been archived by the owner on Oct 16, 2024. It is now read-only.
GTO: Document SemVer practices for ML models #231
Labels: A: docs, p2-nice-to-have, type: discussion
Semantic versioning is the accepted way to version code. How should artifacts be versioned?
A data scientist asked me this some time ago. Given that everyone is free to do whatever they want, perhaps giving a hint would not hurt...?
I formulated a reasonable convention for models. Not sure if it could be of any use:
Patch
The model, seen as a black box, behaves as before; it only outputs different numbers.
Typical scenario 1: model has been retrained on more recent data
Typical scenario 2: hyper-parameters have changed
Minor
May want to take advantage of additional outputs or additional functionalities
Typical scenario 1: model now has `predict_proba()` in addition to `predict()`
Typical scenario 2: model now outputs a JSON with an additional field `confidence_interval`, in addition to `predicted_values`
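To illustrate why an added output field can count as a minor change, here is a minimal sketch. The payloads and the `read_predictions` helper are made up for illustration; only the field names `predicted_values` and `confidence_interval` come from the scenario above. A caller written against the older output keeps working unchanged:

```python
import json


def read_predictions(payload: str) -> list:
    """Caller written against the older output: reads only `predicted_values`."""
    return json.loads(payload)["predicted_values"]


# Hypothetical model outputs before and after the minor bump.
old_output = json.dumps({"predicted_values": [0.2, 0.7]})
new_output = json.dumps(
    {
        "predicted_values": [0.2, 0.7],
        "confidence_interval": [[0.1, 0.3], [0.6, 0.8]],
    }
)

# The old caller ignores the extra field, so nothing breaks.
same_result = read_predictions(old_output) == read_predictions(new_output)
```

This only holds if callers tolerate unknown fields; a caller that validates the output against a strict schema would experience the same change as breaking.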
Major
Need to re-visit the code that calls the model to serve it (breaking change)
Typical scenario 1: model APIs have changed
Typical scenario 2: model expects different input data format
Typical scenario 3: model relies on different libraries, need to re-build the venv (or even the OS-level libraries)
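The convention above could be sketched as a small helper that picks the next version from the kind of change. This is only an illustration: the change labels and the `bump` function are hypothetical names, not part of GTO or any other tool, and it assumes plain `MAJOR.MINOR.PATCH` strings without pre-release suffixes.

```python
# Hypothetical change labels mapped to the SemVer part they bump,
# following the Patch / Minor / Major scenarios listed above.
BUMP_FOR_CHANGE = {
    "retrained_on_newer_data": "patch",
    "changed_hyperparameters": "patch",
    "added_output_or_method": "minor",
    "changed_api": "major",
    "changed_input_format": "major",
    "changed_dependencies": "major",
}


def bump(version: str, change: str) -> str:
    """Return the next version for a plain MAJOR.MINOR.PATCH string."""
    major, minor, patch = (int(part) for part in version.split("."))
    part_to_bump = BUMP_FOR_CHANGE[change]
    if part_to_bump == "major":
        return f"{major + 1}.0.0"
    if part_to_bump == "minor":
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"
```

For example, under these assumptions `bump("1.4.2", "changed_dependencies")` gives `"2.0.0"`, since rebuilding the environment requires revisiting the serving code.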
Originally posted by @francesco086 in #199 (comment)
🧵 See the thread for more opinions on this