First individual project at the neuefische Data Science bootscamp performing EDA.
The overall goal is to perform an Exploratory Data Analysis (EDA) of the King County Housing Data having in mind the requests of a hypothetical stakeholder named Nicole Johnson, who seeks to buy a house for a "lively, central neighborhood, middle price range, right timing (within a year)". The walk through can be found in EDA.ipynb.
- pyenv
- python==3.9.8
- jupyterlab==3.2.6
- ipywidgets==7.6.5
- matplotlib==3.5.1
- pandas==1.3.5
- numpy==1.22.0
- jupyterlab-dash==0.1.0a3
- seaborn==0.11.1
First of all, we want to create a new virtual environment with the most basic python modules. Therefore we are (either with 1 or 2):
- setting the python version locally to 3.9.8
- creating a virtual environment using the
venv
module - activating our newly created environment
- upgrading
pip
(optional, but handsome) - installing the required packages via
pip
The above modules are listed in a separate requirements file
. Therefore, we just need to run the following command to set up the virtual environment and the required packages like this:
pyenv local 3.9.8
python -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install -r requirements.txt
- upgrade the
conda
package manager to the latest version - creating a virtual environment named venv_nf-p1 with set python version
- activating our newly created environment
- installing the required packages via
conda
from channel conda-forge
conda update conda
conda create -n venv_nf-p1 python=3.9.8
conda activate venv_nf-p1
conda install -c conda-forge --file requirements.txt
The dataset for the population density for the notebook zipcode_data_king_county.ipynb is stored in the data.zip
folder. To unzip the data folder directly in the terminal run
unzip data.zip
For the other datasets links are given within the notebooks.