Added new project for explainability #126

Merged
merged 10 commits into from
Aug 20, 2024
Binary file added explainability-shap/.assets/model.gif
4 changes: 4 additions & 0 deletions explainability-shap/.dockerignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
!/materializers/**
!/pipelines/**
!/steps/**
!/utils/**
15 changes: 15 additions & 0 deletions explainability-shap/LICENSE
@@ -0,0 +1,15 @@
Apache Software License 2.0

Copyright (c) ZenML GmbH 2024. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
81 changes: 81 additions & 0 deletions explainability-shap/README.md
@@ -0,0 +1,81 @@
# 🌸 Iris Classification MLOps Pipeline with ZenML

Welcome to the Iris Classification MLOps project! This project demonstrates how to build a production-ready machine learning pipeline using ZenML. It showcases various MLOps practices including data preparation, model training, evaluation, explainability, and data drift detection.

## 🌟 Features

- Data loading and splitting using scikit-learn's iris dataset
- SVM model training with hyperparameter configuration
- Model evaluation with accuracy metrics
- Model explainability using SHAP (SHapley Additive exPlanations)
- Data drift detection between training and test sets
- Artifact and metadata logging for enhanced traceability

<div align="center">
<br/>
<img alt="Iris Classification Pipeline" src=".assets/model.gif" width="70%">
<br/>
</div>

## 🏃 How to Run

Before running the pipeline, set up your environment as follows:

```bash
# Set up a Python virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install requirements
pip install -r requirements.txt
```

To run the Iris Classification pipeline:

```bash
python run.py
```

## 🧩 Pipeline Steps

1. **Load Data**: Loads the iris dataset and splits it into train and test sets.
2. **Train Model**: Trains an SVM classifier on the training data.
3. **Evaluate Model**: Evaluates the model on the test set and generates predictions.
4. **Explain Model**: Generates SHAP values for model explainability.
5. **Detect Data Drift**: Detects potential data drift between training and test sets.
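
The steps above (except SHAP) can be sketched in plain scikit-learn and SciPy. This is a minimal sketch, not the project's actual `run.py`: the function names mirror the step names in this README, but the ZenML `@step` and `@pipeline` decorators are left out, and the split ratio, random seed, and the use of a per-feature Kolmogorov-Smirnov test for drift are assumptions made for illustration.

```python
from scipy.stats import ks_2samp
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def load_data(test_size=0.2, random_state=42):
    """Load the iris dataset and split it into train and test sets."""
    iris = load_iris()
    X_train, X_test, y_train, y_test = train_test_split(
        iris.data,
        iris.target,
        test_size=test_size,  # assumed ratio, not taken from the project
        random_state=random_state,
        stratify=iris.target,
    )
    return X_train, X_test, y_train, y_test, list(iris.feature_names)


def train_model(X_train, y_train):
    """Train an SVM classifier on the training data."""
    model = SVC(gamma="scale", random_state=42)
    model.fit(X_train, y_train)
    return model


def evaluate_model(model, X_test, y_test):
    """Return test accuracy together with the predictions."""
    predictions = model.predict(X_test)
    return accuracy_score(y_test, predictions), predictions


def detect_data_drift(X_train, X_test, feature_names, alpha=0.05):
    """Flag features whose train and test distributions differ.

    Assumption: a two-sample Kolmogorov-Smirnov test per feature;
    the project may use a different drift check.
    """
    report = {}
    for i, name in enumerate(feature_names):
        statistic, p_value = ks_2samp(X_train[:, i], X_test[:, i])
        report[name] = {
            "statistic": float(statistic),
            "p_value": float(p_value),
            "drift": bool(p_value < alpha),
        }
    return report


# Run the steps in order, as a pipeline would.
X_train, X_test, y_train, y_test, feature_names = load_data()
model = train_model(X_train, y_train)
accuracy, predictions = evaluate_model(model, X_test, y_test)
drift_report = detect_data_drift(X_train, X_test, feature_names)

print(f"Test accuracy: {accuracy:.3f}")
for name, result in drift_report.items():
    print(f"{name}: p={result['p_value']:.3f} drift={result['drift']}")
```

Because the train and test sets come from the same stratified split, the drift check would normally be expected to report little or no drift here; it becomes more informative when the test data is replaced with data collected later.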

## 📊 Visualizations

The pipeline generates a SHAP summary plot to explain feature importance:

<div align="center">
<br/>
<img alt="SHAP Summary Plot" src=".assets/shap_visualization.png" width="70%">
<br/>
</div>

## 🛠️ Customization

You can customize various aspects of the pipeline:

- Adjust the `SVC` hyperparameters in the `train_model` step
- Modify the train-test split ratio in the `load_data` step
- Add or remove features from the iris dataset
- Implement additional evaluation metrics in the `evaluate_model` step
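
The customizations listed above might look like the following in plain scikit-learn. This is a sketch under assumptions: the specific hyperparameters, the 70/30 split, the choice to keep only the petal features, and the extra metrics are illustrative values, not the project's defaults.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report, f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

iris = load_iris()

# Remove features: keep only petal length and petal width (columns 2 and 3).
X = iris.data[:, 2:]
y = iris.target

# Modify the train-test split ratio (30% test instead of an assumed 20%).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Adjust the SVC hyperparameters.
model = SVC(kernel="linear", C=0.5).fit(X_train, y_train)
predictions = model.predict(X_test)

# Additional evaluation metrics beyond accuracy.
macro_f1 = f1_score(y_test, predictions, average="macro")
print(f"Macro F1: {macro_f1:.3f}")
print(classification_report(y_test, predictions, target_names=iris.target_names))
```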

## 📜 Project Structure

```
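.
├── .assets/ # Images used in this README
├── .dockerignore # Docker build context rules
├── LICENSE # Apache License 2.0
├── run.py # Main pipeline file
├── requirements.txt # Python dependencies
└── README.md # This file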
```

## 🤝 Contributing

Contributions to improve the pipeline are welcome! Please feel free to submit a Pull Request.

## 📄 License

This project is licensed under the Apache License 2.0. See the LICENSE file for details.
7 changes: 7 additions & 0 deletions explainability-shap/requirements.txt
@@ -0,0 +1,7 @@
scikit-learn
shap
matplotlib
scipy
zenml
pyarrow
fastparquet