Welcome to the repository for the Machine Learning Lecture at HTW Berlin. This repository contains Jupyter notebooks used throughout the course.
notebooks/: Contains all the Jupyter notebooks used in the lecture. Each notebook covers different machine learning topics, exercises, and examples.
The notebooks cover a wide range of machine learning topics, including but not limited to:
- Data Preprocessing
- Supervised Learning Algorithms
- Model Evaluation Metrics
- Feature Engineering
- Hyperparameter Tuning
- Advanced Topics like Ensemble Methods and Neural Networks
To get started with these notebooks, you'll need to set up the conda environment specified in environment.yml. You can also use another package manager for virtual environments if you like. Ensure you have Anaconda or Miniconda installed on your system.
- Clone the Repository

      git clone https://github.com/HTW-Berlin-KI-Werkstatt/ml-lecture-exercise.git
      cd ml-lecture-exercise
- Option 1: Create the Environment with Conda

  Create the environment (Python 3.9 is compatible with torch):

      conda create -n ml-exercise-env python=3.9

  Activate the newly created environment:

      conda activate ml-exercise-env
- Option 2: Create the Environment with venv

  Create the environment with venv as follows (please use Python 3.9):

      python -m venv venv

  Then activate the environment:

      source venv/bin/activate  # on Windows: venv\Scripts\activate
- Install packages

  Install all packages with pip:

      pip install -r requirements.txt
- Launch Jupyter Notebook

  Start the Jupyter Notebook server:

      jupyter notebook

  The server opens in your browser, where you can navigate to and run the notebooks in the repository. Alternatively, you can open the Jupyter notebooks in VS Code.
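The setup steps above ask for Python 3.9. As a quick sanity check after activating the environment, something like the following sketch could be run in a notebook cell or a Python shell. The helper name `matches_required` and the constant `REQUIRED` are made up for this example and are not part of the repository.

```python
import sys

# Version recommended in the setup steps (assumption: only major.minor matters).
REQUIRED = (3, 9)


def matches_required(version_info=sys.version_info, required=REQUIRED) -> bool:
    """Return True if the major.minor part of version_info equals `required`."""
    return tuple(version_info[:2]) == tuple(required)


if matches_required():
    print("Python version matches the course setup.")
else:
    current = f"{sys.version_info.major}.{sys.version_info.minor}"
    print(f"Warning: running Python {current}, but the course setup recommends 3.9.")
```

A mismatch only produces a warning here, since other versions may still work with the course packages.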
Since the notebooks are designed for solving tasks and experimentation, it is a good idea to copy the respective notebooks before working on them and leave the copies untracked in the repository. Solutions should not be part of the repository :)
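One possible way to create such a working copy is sketched below. The helper `make_working_copy`, the `_work` suffix, and the file name `example.ipynb` are hypothetical choices for illustration, not conventions of this repository.

```python
import shutil
import tempfile
from pathlib import Path


def make_working_copy(notebook: Path, suffix: str = "_work") -> Path:
    """Copy a notebook next to the original, e.g. a.ipynb -> a_work.ipynb."""
    target = notebook.with_name(f"{notebook.stem}{suffix}{notebook.suffix}")
    if target.exists():
        # Refuse to overwrite an existing working copy that may hold solutions.
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(notebook, target)
    return target


# Demonstration in a temporary directory with a placeholder notebook file.
with tempfile.TemporaryDirectory() as tmp:
    original = Path(tmp) / "example.ipynb"
    original.write_text('{"cells": []}', encoding="utf-8")
    working_copy = make_working_copy(original)
    print(working_copy.name)  # example_work.ipynb
```

To keep such copies out of version control, a matching pattern (here `*_work.ipynb`) could be added to `.git/info/exclude`, which applies only to your local clone and is never committed.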
Using Git with Jupyter notebooks takes some care. Notebooks are JSON files that store cell outputs and execution counts alongside the source, so merging or viewing differences between versions in plain text can be challenging.
To maintain clean versions of Jupyter notebooks in your Git repository, you can use nbstripout. This tool strips output from the notebook files before committing them, minimizing merge conflicts and keeping the repository size down.
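To illustrate what stripping output means at the JSON level, here is a minimal sketch using only the standard library. It is not nbstripout's actual implementation, and the function `strip_outputs` and the tiny hand-written notebook are assumptions made for this example. The real tool handles more cases, such as metadata.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed."""
    stripped = json.loads(json.dumps(notebook))  # deep copy via JSON round-trip
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop printed text, plots, etc.
            cell["execution_count"] = None  # drop the In[n] counter
    return stripped


# Tiny hand-written notebook in nbformat 4 layout, for illustration only.
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [
                {"output_type": "stream", "name": "stdout", "text": ["42\n"]}
            ],
            "source": ["print(42)"],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

cleaned = strip_outputs(example)
print(json.dumps(cleaned["cells"][1], indent=1))
```

After stripping, two runs of the same notebook produce identical files as long as the source cells are unchanged, which is why diffs and merges become easier to read.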
- Install nbstripout

  You can install nbstripout using pip:

      pip install nbstripout
- Configure nbstripout with Git

  To automatically strip outputs from your notebooks when committing to a specific repository, enable nbstripout as a Git filter:

      nbstripout --install
Contributions are welcome! If you find any issues or have suggestions for improvements, feel free to open an issue or submit a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.
For questions or further information, please contact Erik Rodner.