The Machine Learning in Production Course is a comprehensive curriculum designed to equip learners with the knowledge and practical skills needed to build, deploy, and manage machine learning systems at scale. The course combines theoretical insights with hands-on assignments to prepare participants for real-world challenges in MLOps (Machine Learning Operations). Below is an overview of the key topics covered in this course:
- MLOps Introduction
  - Fundamentals of MLOps and its importance in modern machine learning workflows.
- Infrastructure Setup
  - Setting up infrastructure for machine learning projects.
  - Focus on tools, cloud platforms, and deployment environments.
- Data Storage and Processing
  - Best practices for managing data at scale.
  - Storage strategies, data preprocessing, and pipelines.
- Versioning and Labeling
  - Version control for datasets and models.
  - Effective labeling and validation strategies.
- Training and Experimentation
  - Designing robust training pipelines and running experiments.
  - Tools for tracking metrics and improving model performance.
- Testing and CI/CD
  - Implementing testing strategies for machine learning systems.
  - Continuous Integration and Continuous Deployment for ML projects.
- Orchestration with Kubeflow and Airflow
  - Automating workflows using orchestration tools like Kubeflow and Airflow.
- Orchestration with Dagster
  - Advanced orchestration techniques with Dagster.
- Serving Basics
  - Fundamentals of serving machine learning models via APIs.
- Inference Servers
  - Understanding inference servers and optimizing their performance.
- Advanced Serving Features and Benchmarking
  - Advanced serving techniques and benchmarking model performance.
- Scaling Infrastructure and Models
  - Techniques for scaling machine learning models and infrastructure to handle production workloads.
- Monitoring and Observability
  - Tools and techniques for monitoring ML systems in production.
  - Implementing observability to track model health and data quality.
- Tools, LLMs, and Data Moats
  - Exploring state-of-the-art tools and methodologies.
  - Leveraging large language models (LLMs) and building competitive data strategies.
- ML Platforms
  - Overview of ML platforms and their role in scaling machine learning operations.

To set up a local Python environment for the course:

- Create a virtual environment in the root folder:
    cd /path/to/your/root/folder
    python -m venv env
- Activate the virtual environment (Linux/macOS shell):
    source env/bin/activate
- Upgrade pip:
    python -m pip install --upgrade pip
- Install the requirements:
    pip install -r main_requirements.txt
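
After the setup steps, it can help to confirm that the interpreter in use really is the one inside `env` before installing or running anything. The following is a minimal sketch, not part of the course materials: the helper name `in_virtualenv` is hypothetical, and it assumes the environment was created with `python -m venv` as shown above.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside an environment created by venv, sys.prefix points at the
    # environment directory, while sys.base_prefix still points at the
    # base Python installation. Outside one, the two are equal.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment detected; run the activation step first.")
```

Saved under any file name and run with `python` right after the activation step, the script reports the path of the active environment, which should end in `env` if the steps above were followed.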