This project was completed as part of my training at Codecademy.com.
It is the final project of the Data Science Career Path. The subject is left up to the trainee.
I chose a dataset published on the French government's open data platform (data.gouv.fr) under an open license.
Data: https://www.data.gouv.fr/fr/datasets/emissions-de-co2-et-de-polluants-des-vehicules-commercialises-en-france/
License: https://www.etalab.gouv.fr/wp-content/uploads/2014/05/Licence_Ouverte.pdf
The dataset contains the following information for all vehicles marketed in France in 2014:
- fuel consumption
- carbon dioxide (CO2) emissions
- emissions of air pollutants (regulated under the Euro standard)
- all the technical characteristics of the vehicles (ranges, brands, models, CNIT number, type of energy, etc.)
The objective is, after analyzing the dataset, to build a machine learning model that predicts the CO2 emissions of a new car from its characteristics.
In addition, we will look at the weight of each characteristic in the prediction of CO2 emissions.
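One common way to look at the weight of each characteristic is to fit a linear regression on standardized features and compare the coefficients. The sketch below is only an illustration of that idea, not necessarily the approach used in the notebook: the data is synthetic, and the column names (`fuel_consumption`, `mass`, `engine_power`) and the relationship between them are assumptions made up for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data. The column names are hypothetical,
# not the real column names of the data.gouv.fr dataset.
rng = np.random.default_rng(0)
n = 500
features = pd.DataFrame({
    "fuel_consumption": rng.uniform(3.0, 15.0, n),  # l/100 km
    "mass": rng.uniform(900.0, 2500.0, n),          # kg
    "engine_power": rng.uniform(40.0, 300.0, n),    # kW
})

# Made-up relationship, used only so the example has something to recover.
co2 = (25.0 * features["fuel_consumption"]
       + 0.01 * features["mass"]
       + rng.normal(0.0, 5.0, n))

# Standardizing puts every feature on the same scale,
# which makes the fitted coefficients comparable with each other.
X_scaled = StandardScaler().fit_transform(features)
model = LinearRegression().fit(X_scaled, co2)

# Sort the coefficients by absolute value, largest first.
weights = pd.Series(model.coef_, index=features.columns)
weights = weights.reindex(weights.abs().sort_values(ascending=False).index)
print(weights)
```

On this synthetic data the fuel consumption column dominates by construction. On the real dataset the ranking depends on which characteristics are kept and how categorical columns are encoded.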
- Python
- pandas
- scikit-learn
- seaborn
Build, train, test:
- Linear Regression
- Multi-layer Perceptron
- K Neighbors Regressor
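The build, train, test loop for the three models listed above could look roughly like the following with scikit-learn. This is a minimal sketch under stated assumptions, not the notebook's code: the data is synthetic, and the hyperparameters (`hidden_layer_sizes`, `max_iter`, `n_neighbors`), the 80/20 split and the scaling step are illustrative choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic regression data standing in for the vehicle characteristics
# (assumption: the real features come from the data.gouv.fr dataset).
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(600, 3))
y = 100.0 + 150.0 * X[:, 0] + 30.0 * X[:, 1] + rng.normal(0.0, 5.0, 600)

# Hold out 20% of the rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Build: one estimator per model; hyperparameters are illustrative only.
models = {
    "Linear Regression": LinearRegression(),
    "Multi-layer Perceptron": MLPRegressor(
        hidden_layer_sizes=(32,), max_iter=2000, random_state=42
    ),
    "K-Neighbors Regressor": KNeighborsRegressor(n_neighbors=5),
}

scores = {}
for name, estimator in models.items():
    # Scale the features first; the MLP and KNN are sensitive to feature scale.
    pipeline = make_pipeline(StandardScaler(), estimator)
    pipeline.fit(X_train, y_train)                     # train
    scores[name] = pipeline.score(X_test, y_test)      # test: R^2 on held-out rows
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

The scores printed here only describe the synthetic data. Wrapping each estimator in the same pipeline keeps the comparison fair, since every model sees identically scaled inputs.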
The code and explanations are in the Jupyter notebook file: cars_co2_emissions.ipynb