danilyef/machine_learning_in_production

Machine Learning in Production

The Machine Learning in Production Course is a comprehensive curriculum designed to equip learners with the knowledge and practical skills needed to build, deploy, and manage machine learning systems at scale. The course combines theoretical insights with hands-on assignments to prepare participants for real-world challenges in MLOps (Machine Learning Operations). Below is an overview of the key topics covered in this course:

Course Modules

  1. MLOps Introduction

    • Fundamentals of MLOps and its importance in modern machine learning workflows.
  2. Infrastructure Setup

    • Setting up infrastructure for machine learning projects.
    • Focus on tools, cloud platforms, and deployment environments.
  3. Data Storage and Processing

    • Best practices for managing data at scale.
    • Storage strategies, data preprocessing, and pipelines.
  4. Versioning and Labeling

    • Version control for datasets and models.
    • Effective labeling and validation strategies.
  5. Training and Experimentation

    • Designing robust training pipelines and running experiments.
    • Tools for tracking metrics and improving model performance.
  6. Testing and CI/CD

    • Implementing testing strategies for machine learning systems.
    • Continuous Integration and Continuous Deployment for ML projects.
  7. Orchestration with Kubeflow and Airflow

    • Automating workflows using orchestration tools like Kubeflow and Airflow.
  8. Orchestration with Dagster

    • Advanced orchestration techniques with Dagster.
  9. Serving Basics

    • Fundamentals of serving machine learning models via APIs.
  10. Inference Servers

    • Understanding inference servers and optimizing their performance.
  11. Advanced Serving Features and Benchmarking

    • Advanced serving techniques and benchmarking model performance.
  12. Scaling Infrastructure and Models

    • Techniques for scaling machine learning models and infrastructure to handle production workloads.
  13. Monitoring and Observability

    • Tools and techniques for monitoring ML systems in production.
    • Implementing observability to track model health and data quality.
  14. Tools, LLMs, and Data Moats

    • Exploring state-of-the-art tools and methodologies.
    • Leveraging large language models (LLMs) and building competitive data strategies.
  15. ML Platforms

    • Overview of ML platforms and their role in scaling machine learning operations.

How to start:

  1. Create a virtual environment in the root folder:
cd /path/to/your/root/folder
python -m venv env
  2. Activate the virtual environment (on Linux or macOS):
source env/bin/activate
  3. Upgrade pip:
python -m pip install --upgrade pip
  4. Install the requirements:
pip install -r main_requirements.txt
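
The same setup can also be scripted. The sketch below is one possible way to do it in Python and is not part of the repository: the helper names `venv_python`, `setup_commands`, and `run_setup` are hypothetical. It assumes the environment folder is called `env` and the requirements file is `main_requirements.txt`, as in the steps above. Instead of activating the environment, it calls the environment's own interpreter directly, which should have the same effect for installing packages.

```python
import os
import subprocess
import sys
from pathlib import Path
from typing import List


def venv_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment.

    Windows places it under Scripts/, other platforms under bin/.
    """
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def setup_commands(
    env_dir: Path, requirements: str = "main_requirements.txt"
) -> List[List[str]]:
    """Build the commands for the setup steps (hypothetical helper)."""
    py = str(venv_python(env_dir))
    return [
        # Step 1: create the virtual environment.
        [sys.executable, "-m", "venv", str(env_dir)],
        # Step 3: upgrade pip using the environment's interpreter.
        [py, "-m", "pip", "install", "--upgrade", "pip"],
        # Step 4: install the requirements into the environment.
        [py, "-m", "pip", "install", "-r", requirements],
    ]


def run_setup(env_dir: Path) -> None:
    """Run each command in order and stop at the first failure."""
    for cmd in setup_commands(env_dir):
        subprocess.run(cmd, check=True)
```

Calling `run_setup(Path("env"))` from the repository root would run the three commands in sequence. Step 2 (activation) has no counterpart here, so the environment still needs to be activated in the shell before working in it interactively.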

About

Prjctr: Machine Learning in Production
