Welcome, and thank you for opening this project. It contains a Jupyter notebook that introduces novice data scientists to basic data analysis and machine learning concepts, including:
- Data Extraction
- Downloading a publicly available dataset
- Describing the dataset
- Describing the research question
- Data Pre-processing
- Cleaning/removing invalid values from rows
- Cleaning up columns
- Removing/filling missing data
- Creating new columns
- Modifying existing columns
- Data Visualization
- Exploratory Data Analysis
- Descriptive Analytics
- Prediction and Model Selection
- Classification
- Deriving Conclusions/Insights from the data
Name: Red Wine Quality Data Set
Source: UCI Machine Learning Repository
Input variables:
- fixed acidity
- volatile acidity
- citric acid
- residual sugar
- chlorides
- free sulfur dioxide
- total sulfur dioxide
- density
- pH
- sulphates
- alcohol
Output variable: quality (score between 0 and 10)
Data Set Characteristics: Multivariate
Number of Observations: 1599
Number of Attributes/Variables: 12
Missing Values: N/A
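
The description above can be turned into a quick sanity check once the file is downloaded. The following is a minimal sketch, not the notebook's own loading code: it uses only the Python standard library, the helper names `load_wine_rows` and `summarize` are hypothetical, and the `;` delimiter is an assumption about how the downloaded CSV is formatted. The notebook itself may well load the data differently, for example with pandas.

```python
import csv

# Column names taken from the dataset description above.
INPUT_COLUMNS = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]
TARGET_COLUMN = "quality"
EXPECTED_COLUMNS = INPUT_COLUMNS + [TARGET_COLUMN]


def load_wine_rows(lines, delimiter=";"):
    """Parse CSV lines into a list of dicts mapping column name to float.

    `lines` is any iterable of text lines, such as an open file.
    The ';' delimiter is an assumption about the downloaded file.
    Empty cells are stored as None so missing values can be counted.
    """
    reader = csv.DictReader(lines, delimiter=delimiter)
    header = reader.fieldnames or []
    absent = [c for c in EXPECTED_COLUMNS if c not in header]
    if absent:
        raise ValueError(f"missing columns: {absent}")
    rows = []
    for record in reader:
        rows.append({
            c: float(record[c]) if record[c] not in (None, "") else None
            for c in EXPECTED_COLUMNS
        })
    return rows


def summarize(rows):
    """Return counts that can be compared with the description above."""
    return {
        "n_observations": len(rows),
        "n_variables": len(EXPECTED_COLUMNS),
        "n_missing": sum(v is None for row in rows for v in row.values()),
    }
```

Assuming the file was saved locally as `winequality-red.csv` (a hypothetical file name), calling `load_wine_rows(open("winequality-red.csv", newline=""))` and then `summarize(rows)` gives counts to compare against the 1599 observations and 12 variables listed above.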