2D and 3D multivariate regression with sklearn applied to climate change data
Winner of Siraj Raval's coding challenge
The notebook is split into two sections:
- 2D linear regression on a sample dataset [X, Y]
- 3D multivariate linear regression on a climate change dataset [Year, CO2 emissions, Global temperature].
Because the dataset is small and a random 10% of it is held out for testing, the scores vary considerably from run to run.
2D Linear Regression | 3D Multivariate Linear Regression |
---|---|
R2 (Score): 0.651237006724 | R2 (Score): 0.968933216107 |
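
The run-to-run variance mentioned above can be illustrated with a minimal sketch. It assumes synthetic data (a noisy line generated with a fixed seed, not the notebook's datasets) and simply repeats the 10% hold-out split with different seeds to show how much the R2 score moves.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT the notebook's dataset): 40 noisy points on a line.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(40, 1))
y = 2.0 * X[:, 0] + 1.0 + rng.normal(scale=3.0, size=40)

scores = []
for seed in range(20):
    # Hold out a different random 10% of the data on each iteration.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.1, random_state=seed
    )
    model = LinearRegression().fit(X_train, y_train)
    scores.append(model.score(X_test, y_test))  # R2 on the held-out points

scores = np.array(scores)
print(f"R2 over 20 splits: min={scores.min():.3f}, "
      f"max={scores.max():.3f}, std={scores.std():.3f}")
```

With only a handful of test points per split, a single score says little on its own; reporting the spread over several splits (or using cross-validation) gives a fairer picture.
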
Run the Jupyter notebook `linear_regression.ipynb`.
## Challenge
The challenge for this video is to use scikit-learn to create a line of best fit for the included 'challenge_dataset'. Then, make a prediction for an existing data point and see how closely it matches the actual value. Print out the error you get.
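
The steps of the challenge might look roughly like the sketch below. The format of 'challenge_dataset' is not described here, so the example assumes synthetic data with one feature column and one target column; swap in the real data when running it against the challenge file.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the challenge data (assumption: one feature, one target).
rng = np.random.default_rng(42)
X = rng.uniform(5, 25, size=(50, 1))
y = 1.2 * X[:, 0] - 4.0 + rng.normal(scale=2.0, size=50)

# Fit the line of best fit.
model = LinearRegression().fit(X, y)

# Predict for a point that already exists in the dataset and compare
# the prediction with the actual value.
idx = 0
predicted = float(model.predict(X[idx:idx + 1])[0])
actual = float(y[idx])
error = abs(predicted - actual)

print(f"x={X[idx, 0]:.2f} predicted={predicted:.2f} "
      f"actual={actual:.2f} abs error={error:.2f}")
```
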
Bonus points if you perform linear regression on a dataset with 3 different variables.
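
For the bonus, a regression with three variables (two predictors and one target) could be sketched as follows. The data is synthetic and the names `x1` and `x2` are hypothetical placeholders, not columns of the climate dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic data (assumption): the target depends linearly on two predictors.
rng = np.random.default_rng(1)
n = 100
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 5, n)
y = 3.0 * x1 - 2.0 * x2 + 0.5 + rng.normal(scale=0.5, size=n)

# Stack the two predictors into an (n, 2) feature matrix.
X = np.column_stack([x1, x2])

# Hold out 10% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.1, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
r2 = model.score(X_test, y_test)

print("coefficients:", model.coef_)
print("intercept:", model.intercept_)
print(f"R2 on test split: {r2:.3f}")
```

The fitted model is a plane rather than a line, which is why it can be plotted in 3D.
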
## Dependencies
- matplotlib
- pandas
- numpy
- seaborn