Module 2 updates #8

Merged
merged 44 commits into from
Oct 16, 2023
44 commits
b316fc3
Delete docs/images/2023109_course_module1_fin_images.005.png
dmaliugina Oct 15, 2023
486d9ce
Delete docs/images/2023109_course_module1_fin_images.008.png
dmaliugina Oct 15, 2023
36841ae
Delete docs/images/2023109_course_module1_fin_images.011.png
dmaliugina Oct 15, 2023
59c6385
Delete docs/images/2023109_course_module1_fin_images.012.png
dmaliugina Oct 15, 2023
e758fe1
Delete docs/images/2023109_course_module1_fin_images.015.png
dmaliugina Oct 15, 2023
d991327
Delete docs/images/2023109_course_module1_fin_images.016.png
dmaliugina Oct 15, 2023
4efd60c
Delete docs/images/2023109_course_module1_fin_images.024.png
dmaliugina Oct 15, 2023
a4ea5ca
Delete docs/images/2023109_course_module1_fin_images.030.png
dmaliugina Oct 15, 2023
bd84768
Delete docs/images/2023109_course_module1_fin_images.031.png
dmaliugina Oct 15, 2023
0819ebe
Delete docs/images/2023109_course_module1_fin_images.034.png
dmaliugina Oct 15, 2023
f24be34
Delete docs/images/2023109_course_module1_fin_images.050.png
dmaliugina Oct 15, 2023
9c0fe9a
Delete docs/images/2023109_course_module1_fin_images.052.png
dmaliugina Oct 15, 2023
bb5efd7
Delete docs/images/2023109_course_module1_fin_images.061.png
dmaliugina Oct 15, 2023
70b39d9
Delete docs/images/2023109_course_module1_fin_images.065.png
dmaliugina Oct 15, 2023
395d8cd
Delete docs/images/2023109_course_module1_fin_images.066.png
dmaliugina Oct 15, 2023
8e49a9a
Delete docs/images/2023109_course_module2.005.png
dmaliugina Oct 15, 2023
a8cf336
Delete docs/images/2023109_course_module2.006.png
dmaliugina Oct 15, 2023
dddde9c
Delete docs/images/2023109_course_module2.007.png
dmaliugina Oct 15, 2023
c4564d8
Delete docs/images/2023109_course_module2.009.png
dmaliugina Oct 15, 2023
857d067
Delete docs/images/2023109_course_module2.012.png
dmaliugina Oct 15, 2023
b680154
Delete docs/images/2023109_course_module2.016.png
dmaliugina Oct 15, 2023
cb130c9
Delete docs/images/2023109_course_module2.020.png
dmaliugina Oct 15, 2023
2b6c1c4
Delete docs/images/2023109_course_module2.025.png
dmaliugina Oct 15, 2023
f6a1dc4
Delete docs/images/2023109_course_module2.028.png
dmaliugina Oct 15, 2023
5bf986a
Delete docs/images/2023109_course_module2.041.png
dmaliugina Oct 15, 2023
22f1db3
Delete docs/images/2023109_course_module2.047.png
dmaliugina Oct 15, 2023
73aa5d7
Delete docs/images/2023109_course_module2.058.png
dmaliugina Oct 15, 2023
c0d22dc
Delete docs/images/2023109_course_module2.060.png
dmaliugina Oct 15, 2023
dbbaa55
Delete docs/images/2023109_course_module2.065.png
dmaliugina Oct 15, 2023
5191093
Delete docs/images/2023109_course_module2.070.png
dmaliugina Oct 15, 2023
0b03f4e
Delete docs/images/2023109_course_module2.071.png
dmaliugina Oct 15, 2023
3b6e1ea
Added compressed images for Module 1 and 2
dmaliugina Oct 15, 2023
2128eca
Update ml-lifecycle.md
dmaliugina Oct 15, 2023
95064df
Update ml-monitoring-architectures.md
dmaliugina Oct 15, 2023
c22e45e
Update ml-monitoring-metrics.md
dmaliugina Oct 15, 2023
2a4b31a
Update ml-monitoring-observability.md
dmaliugina Oct 15, 2023
6d1db65
Update ml-monitoring-setup.md
dmaliugina Oct 15, 2023
4e8a9a9
Update data-prediction-drift-in-ml.md
dmaliugina Oct 15, 2023
c0eb4be
Update data-quality-in-ml.md
dmaliugina Oct 15, 2023
50531d3
Update evaluate-ml-model-quality.md
dmaliugina Oct 15, 2023
c936c1f
Update ml-quality-metrics-classification-regression-ranking.md
dmaliugina Oct 15, 2023
100d39c
Create data-drift-deep-dive.md
dmaliugina Oct 15, 2023
1e7cc8a
Update SUMMARY.md
dmaliugina Oct 15, 2023
ff37725
Update README.md
dmaliugina Oct 15, 2023
14 changes: 9 additions & 5 deletions docs/book/README.md
Original file line number Diff line number Diff line change
@@ -12,15 +12,19 @@ Welcome to the Open-source ML observability course!
The course starts on **October 16, 2023**. \
[Sign up](https://www.evidentlyai.com/ml-observability-course) to save your seat and receive weekly course updates.

# How to participate?
* **Join the course**. [Sign up](https://www.evidentlyai.com/ml-observability-course) to receive weekly updates with course materials and information about office hours.
* **Course platform [OPTIONAL]**. If you want to receive a course certificate, you should **also** [register](https://evidentlyai.thinkific.com/courses/ml-observability-course) on the platform and complete all the assignments before **December 1, 2023**.

The course starts on **October 16, 2023**. The videos and course notes for the new modules will be released during the course cohort.

# Links
* **Newsletter**. [Sign up](https://www.evidentlyai.com/ml-observability-course) to receive weekly updates with the course materials.

* **Discord community**. Join the [community](https://discord.gg/PyAJuUD5mB) to ask questions and chat with others.
* **Course platform**. [Register](https://evidentlyai.thinkific.com/courses/ml-observability-course) if you want to submit assignments and receive the certificate. This is optional.
* **Code examples**. Will be published in this GitHub [repository](https://github.com/evidentlyai/ml_observability_course) throughout the course.
* **Enjoying the course?** [Star](https://github.com/evidentlyai/evidently) Evidently on GitHub to contribute back! This helps us create free, open-source tools and content for the community.

* **YouTube playlist**. [Subscribe](https://www.youtube.com/playlist?list=PL9omX6impEuOpTezeRF-M04BW3VfnPBRF) to the course YouTube playlist to keep tabs on video updates.

The course starts on **October 16, 2023**. The videos and course notes for the new modules will be released during the course cohort.
**Enjoying the course?** [Star](https://github.com/evidentlyai/evidently) Evidently on GitHub to contribute back! This helps us create free, open-source tools and content for the community.

# What the course is about
This course is a deep dive into ML model observability and monitoring.
1 change: 1 addition & 0 deletions docs/book/SUMMARY.md
Original file line number Diff line number Diff line change
@@ -17,6 +17,7 @@
* [2.4. Data quality in machine learning](ml-observability-course/module-2-ml-monitoring-metrics/data-quality-in-ml.md)
* [2.5. Data quality in ML [CODE PRACTICE]](ml-observability-course/module-2-ml-monitoring-metrics/data-quality-code-practice.md)
* [2.6. Data and prediction drift in ML](ml-observability-course/module-2-ml-monitoring-metrics/data-prediction-drift-in-ml.md)
* [2.7. Deep dive into data drift detection [OPTIONAL]](ml-observability-course/module-2-ml-monitoring-metrics/data-drift-deep-dive.md)
* [2.8. Data and prediction drift in ML [CODE PRACTICE]](ml-observability-course/module-2-ml-monitoring-metrics/data-prediction-drift-code-practice.md)
* [Module 3: ML monitoring for unstructured data](ml-observability-course/module-3-ml-monitoring-for-unstructured-data.md)
* [Module 4: Designing effective ML monitoring](ml-observability-course/module-4-designing-effective-ml-monitoring.md)
Original file line number Diff line number Diff line change
@@ -17,11 +17,11 @@ You can perform different types of evaluations at each of these stages. For exam
* During data preparation, exploratory data analysis (EDA) helps to understand the dataset and validate the problem statement.
* At the experiment stage, performing cross-validation and holdout testing helps validate and test if ML models are useful.

![](<../../../images/2023109\_course\_module1\_fin\_images.005.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.005-min.png>)

However, the work does not stop here! Once the best model is deployed to production and starts bringing business value, every erroneous prediction has its costs. It is crucial to ensure that this model functions stably and reliably. To do that, one must continuously monitor the production ML model and data.

![](<../../../images/2023109\_course\_module1\_fin\_images.008.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.008-min.png>)

## What can go wrong in production?

@@ -34,21 +34,21 @@ Many things can go wrong once you deploy an ML model to the real world. Here are
* Data schema changes in the upstream system, third-party APIs, or catalogs.
* Data loss at source when dealing with broken sensors, logging errors, database outages, etc.

![](<../../../images/2023109\_course\_module1\_fin\_images.011.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.011-min.png>)

**Broken upstream model**. Often, not one model but a chain of ML models operates in production. If one model gives wrong outputs, it can affect downstream models.

![](<../../../images/2023109\_course\_module1\_fin\_images.012.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.012-min.png>)

**Concept drift**. Gradual concept drift occurs when the target function continuously changes over time, leading to model degradation. If the change is sudden – like the recent pandemic – you’re dealing with sudden concept drift.

**Data drift**. Distribution changes in the input features may signal data drift and potentially cause ML model performance degradation. For example, a significant number of users coming from a new acquisition channel can negatively affect the model trained on user data. Chances are that users from different channels behave differently. To get back on track, the model needs to learn new patterns.
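
The distribution comparison described in the data drift paragraph can be sketched in a few lines. The example below is a minimal illustration, not the method used in the course: it computes the Population Stability Index (PSI), one common drift score, over equal-width bins. The feature values, the bin count, and the 0.2 alert threshold are all assumptions chosen for the sketch.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty bins from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Hypothetical feature values: the current batch is shifted to the right.
reference = [i / 100 for i in range(100)]
current = [0.5 + i / 200 for i in range(100)]

score = psi(reference, current)
drift_detected = score > 0.2  # rule-of-thumb threshold, an assumption
print(f"PSI={score:.2f}, drift detected: {drift_detected}")
```

A score near zero means the two samples fill the bins in similar proportions; a sensible threshold depends on the feature and the use case.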

![](<../../../images/2023109\_course\_module1\_fin\_images.015.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.015-min.png>)

**Underperforming segments**. A model might perform differently on diverse data segments. It is crucial to monitor performance across all segments.
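
To illustrate why segment-level monitoring matters, the sketch below computes accuracy per segment from logged predictions. The records, the `channel` field, and the helper name are hypothetical and serve only as an example.

```python
from collections import defaultdict


def accuracy_by_segment(records, segment_key):
    """Return accuracy for each value of `segment_key` found in the records."""
    totals, correct = defaultdict(int), defaultdict(int)
    for r in records:
        seg = r[segment_key]
        totals[seg] += 1
        correct[seg] += r["prediction"] == r["target"]  # True counts as 1
    return {seg: correct[seg] / totals[seg] for seg in totals}


# Hypothetical logs: the model does worse for users from the "ads" channel.
logs = [
    {"channel": "organic", "prediction": 1, "target": 1},
    {"channel": "organic", "prediction": 0, "target": 0},
    {"channel": "organic", "prediction": 1, "target": 1},
    {"channel": "ads", "prediction": 1, "target": 0},
    {"channel": "ads", "prediction": 0, "target": 0},
]

overall = sum(r["prediction"] == r["target"] for r in logs) / len(logs)
per_segment = accuracy_by_segment(logs, "channel")
print(overall, per_segment)
```

Here the overall accuracy hides the gap between the two channels, which only shows up once the metric is split by segment.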

![](<../../../images/2023109\_course\_module1\_fin\_images.016.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.016-min.png>)

**Adversarial adaptation**. In the era of neural networks, models might face adversarial attacks. Monitoring helps detect these issues on time.

Original file line number Diff line number Diff line change
@@ -14,7 +14,7 @@ It is essential to start monitoring ML models as soon as you deploy them to prod

**Ad hoc reporting** is a great alternative when your resources are limited. You can use Python scripts to calculate and analyze metrics in your notebook. This is a good first step in logging model performance and data quality.
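
As an illustration of the ad hoc approach, a script like the one below could compute a few metrics from logged predictions in a notebook. This is a minimal sketch with hypothetical log records and field names (`age`, `prediction`, `target`); it is not taken from the course code.

```python
def ad_hoc_report(records, feature_names):
    """Compute accuracy and the share of missing values per feature.

    Assumes `records` is a non-empty list of dicts.
    """
    # Only rows with known ground truth can be scored.
    labeled = [r for r in records if r.get("target") is not None]
    accuracy = (
        sum(r["prediction"] == r["target"] for r in labeled) / len(labeled)
        if labeled
        else None  # ground truth may not have arrived yet
    )
    missing = {
        name: sum(r.get(name) is None for r in records) / len(records)
        for name in feature_names
    }
    return {"rows": len(records), "accuracy": accuracy, "missing_share": missing}


# Hypothetical logged predictions; a real log would come from a file or database.
logs = [
    {"age": 34, "prediction": 1, "target": 1},
    {"age": None, "prediction": 0, "target": 1},
    {"age": 51, "prediction": 0, "target": 0},
    {"age": 29, "prediction": 1, "target": None},
]

report = ad_hoc_report(logs, ["age"])
print(report)
```

The report covers both model quality (accuracy on labeled rows) and data quality (missing values), matching the two kinds of checks the paragraph mentions.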

![](<../../../images/2023109\_course\_module1\_fin\_images.061.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.061-min.png>)

## Monitoring frontend

@@ -24,13 +24,13 @@ When it comes to visualizing the results of monitoring, you also have options.

**One-off reports**. You can also generate reports as needed and create visualizations or specific one-off analyses based on the model logs. You can create your own reports in Python/R or use different BI visualization tools.

![](<../../../images/2023109\_course\_module1\_fin\_images.065.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.065-min.png>)

**BI Systems**. If you want to create a dashboard to track ML monitoring metrics over time, you can also reuse existing business intelligence or software monitoring systems. In this scenario, you must connect existing tools to the ML metric database and add panels or plots to the dashboard.

**Dedicated ML monitoring**. As a more sophisticated approach, you can set up a separate visualization system that gives you an overview of all your ML models and datasets and provides an ongoing, updated view of metrics.

![](<../../../images/2023109\_course\_module1\_fin\_images.066.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.066-min.png>)

## Summing up

Original file line number Diff line number Diff line change
@@ -25,6 +25,6 @@ ML model performance metrics help to ensure that ML models work as expected:

The ultimate measure of the model quality is its impact on the business. Depending on business needs, you may want to monitor clicks, purchases, loan approval rates, cost savings, etc. This is typically custom to the use case and might involve collaborating with product managers or business teams to determine the right business KPIs.

![](<../../../images/2023109\_course\_module1\_fin\_images.034.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.034-min.png>)

For a deeper dive into **ML model quality and relevance** and **data quality and integrity** metrics, head to [Module 2](../module-2-ml-monitoring-metrics/readme.md).
Original file line number Diff line number Diff line change
@@ -31,7 +31,7 @@ Accordingly, you also need to track data quality and model performance metrics.

**Ground truth is not available immediately** to calculate ML model performance metrics. In this case, you can use proxy metrics like data quality to monitor for early warning signs.

![](<../../../images/2023109\_course\_module1\_fin\_images.024.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.024-min.png>)

## ML monitoring vs ML observability

@@ -59,7 +59,7 @@ ML monitoring and observability help:
* **Trigger actions**. Based on the calculated data and model health metrics, you can trigger fallback, model switching, or automatic retraining.
* **Document ML model performance** to provide information to the stakeholders.

![](<../../../images/2023109\_course\_module1\_fin\_images.030.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.030-min.png>)

## Who should care about ML monitoring and observability?

@@ -72,7 +72,7 @@ The short answer: everyone who cares about the model's impact on business. At th

Other stakeholders include model users, business stakeholders, support, and compliance teams.

![](<../../../images/2023109\_course\_module1\_fin\_images.031.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.031-min.png>)

## Summing up

Original file line number Diff line number Diff line change
@@ -17,7 +17,7 @@ While setting up an ML monitoring system, it makes sense to align the complexity
* **Feedback loop and environmental stability**. Both influence the cadence of metrics calculations and the choice of specific metrics.
* **Service criticality**. What is the business cost of model quality drops? What risks should we monitor for? More critical models might require a more complex monitoring setup.

![](<../../../images/2023109\_course\_module1\_fin\_images.050.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.050-min.png>)

## Model retraining cadence

@@ -26,7 +26,7 @@ ML monitoring and retraining are closely connected. Some retraining factors to k
* How you implement the retraining: whether you want to monitor the metrics and retrain on a trigger or set up a predefined retraining schedule (for example, weekly).
* Issues that prevent updating the model too often, e.g., complex approval processes, regulations, need for manual testing.
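
The two implementation options mentioned in the list above, retraining on a metric trigger or on a predefined schedule, can be contrasted in a short sketch. The function name, the 0.8 quality threshold, and the weekly cadence are hypothetical values chosen for illustration.

```python
from datetime import date, timedelta


def should_retrain(today, last_trained, quality, min_quality=0.8,
                   schedule=timedelta(days=7)):
    """Return the reason to retrain ("trigger" or "schedule"), or None."""
    if quality is not None and quality < min_quality:
        return "trigger"   # monitored metric dropped below the threshold
    if today - last_trained >= schedule:
        return "schedule"  # predefined cadence, for example weekly
    return None


today = date(2023, 10, 16)
print(should_retrain(today, date(2023, 10, 12), quality=0.75))
print(should_retrain(today, date(2023, 10, 9), quality=0.90))
print(should_retrain(today, date(2023, 10, 12), quality=0.90))
```

Passing `quality=None` models the case where ground truth is delayed, so only the schedule can fire. Constraints such as approval processes or manual testing would sit outside this check.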

![](<../../../images/2023109\_course\_module1\_fin\_images.052.png>)
![](<../../../images/2023109\_course\_module1\_fin\_images.052-min.png>)

## Reference dataset
