Code for the post: Testing Data and Machine Learning Pipelines (or the additive vs. retroactive impact of new data or logic on tests).
The post is my attempt to clarify my thinking on testing data and machine learning pipelines. It covers:
- Overview of testing scopes: unit, integration, functional, etc.
- An example pipeline: behavioral logs -> batch inference output
- Writing tests for our pipeline: unit, schema, integration
- Adding new data (visible impressions) to our pipeline
- The additive and retroactive impact of new data/logic on tests
- A suggested lean approach to testing data and ML pipelines
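
To give a flavor of the unit and schema tests listed above, here is a minimal sketch. It is illustrative only: `count_events`, `check_schema`, the log layout, and the field names are hypothetical and may not match the code in this repo.

```python
from collections import Counter

# Hypothetical behavioral log rows: (user_id, item_id, event_type).
LOGS = [
    ("u1", "i1", "click"),
    ("u1", "i1", "click"),
    ("u1", "i2", "purchase"),
    ("u2", "i1", "click"),
]

# Hypothetical output schema for the aggregation step.
EXPECTED_SCHEMA = {"user_id": str, "item_id": str, "count": int}


def count_events(logs, event_type):
    """Count events of one type per (user_id, item_id) pair."""
    counts = Counter((u, i) for u, i, e in logs if e == event_type)
    return [
        {"user_id": u, "item_id": i, "count": c}
        for (u, i), c in sorted(counts.items())
    ]


def check_schema(rows, schema):
    """Return True if every row has exactly the expected fields and types."""
    return all(
        set(row) == set(schema)
        and all(isinstance(row[key], typ) for key, typ in schema.items())
        for row in rows
    )


def test_count_events():
    # Unit test: checks the logic on a tiny, hand-written input.
    assert count_events(LOGS, "click") == [
        {"user_id": "u1", "item_id": "i1", "count": 2},
        {"user_id": "u2", "item_id": "i1", "count": 1},
    ]


def test_output_schema():
    # Schema test: checks the shape of the output, not its values.
    assert check_schema(count_events(LOGS, "click"), EXPECTED_SCHEMA)
```

The unit test pins exact values, so it is the kind of test that breaks when the logic changes. The schema test only pins field names and types, so it survives value changes but fails when a field is added or removed.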