K-Means Clustering with Python and Olympian Data

This repository contains a Jupyter notebook for a workshop on k-means clustering. It uses the "120 years of Olympic history: athletes and results" dataset, which is available on Kaggle.
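For readers who have not met k-means before, here is a minimal sketch of the idea using only the Python standard library. It is not the code from the notebooks: the function name `kmeans`, its parameters, and the sample height/weight pairs are all made up for illustration, and the notebooks may well rely on a library implementation instead.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length numeric tuples."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid
        # (squared Euclidean distance).
        labels = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels


# Made-up (height_cm, weight_kg) pairs, NOT taken from the Kaggle dataset.
athletes = [(150, 45), (152, 48), (155, 50), (190, 95), (193, 100), (195, 98)]
centroids, labels = kmeans(athletes, k=2)
print(labels)
```

With this toy input, the three shorter, lighter athletes should end up in one cluster and the three taller, heavier ones in the other. Which cluster gets label 0 depends on the random starting points.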

Installation/Set-up

  • You will need Miniconda (or the full Anaconda distribution) for Python 3.7. Allow the installer to prepend the install location to your PATH.
  • Don't forget to source your .bash_profile afterwards so that bash can find the conda binary.
  • Clone this repo.
  • Using the environment.yml file, create a new conda environment: conda env create -f environment.yml
  • To activate the environment, run source activate myenv. (On conda 4.4 and later, conda activate myenv does the same thing.)
  • To test that everything works, run jupyter notebook and navigate to localhost:8888/ in your browser. You should see an interface like this:

[Screenshot: Jupyter Notebook interface]
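If the notebook server starts but you want to double-check that the environment was created correctly, a small script like the following can help. It is only a sketch: the helper `check_packages` and the package names in `expected` are assumptions, since this README does not list what environment.yml installs. Adjust the list to match your copy of the file.

```python
import importlib.util


def check_packages(names):
    """Return {name: True/False} for whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Hypothetical list; edit it to match what environment.yml actually installs.
expected = ["notebook", "pandas", "sklearn"]

for name, found in check_packages(expected).items():
    print(f"{name}: {'ok' if found else 'MISSING'}")
```

Run it with the environment activated. Anything reported as MISSING suggests that the environment was not created from environment.yml, or that a different environment is active.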

Working with the Jupyter Notebook

There are two versions of this notebook:

  • olympic_kmeans_follow_along.ipynb lets you follow along, filling in the code as you go.
  • olympic_kmeans.ipynb is the full notebook, with answers in case you get stuck.

Click on the notebook you wish to run.

Each notebook is made up of several cells. When interacting with a cell, you are in one of two modes:

  • Edit Mode (green border) for editing cells. Selecting a cell and hitting ENTER will put you in Edit Mode.

[Screenshot: Edit Mode]

  • Command Mode (blue border) for running cells. Hitting ESCAPE on a cell in Edit Mode will put you back in Command Mode.

[Screenshot: Command Mode]

To run a selected cell, either click the "Run" button in the top menu bar or press Shift+Enter in Command Mode.

Sources