This analysis examines the variables that factor into real estate pricing in Mexico and Brazil and how each relates to house prices.
The dataset includes region and state classification, house type, area, price, and precise geographic location.
Data visualization tools, summary statistics, and statistical measures are used to quantify the relationship between these variables and house prices.
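
As a rough illustration of this kind of quantification, the sketch below computes summary statistics, a Pearson correlation between area and price, and a mean price per region with pandas. The column names (`area_m2`, `price_usd`, `region`) and all values are made up for the example, and the choice of Pearson correlation is an assumption; the description does not say which measures the notebooks actually use.

```python
import pandas as pd

# Toy data: column names and values are hypothetical, not taken from the project dataset.
df = pd.DataFrame(
    {
        "area_m2": [50.0, 75.0, 100.0, 125.0, 150.0],
        "price_usd": [100_000.0, 150_000.0, 200_000.0, 250_000.0, 300_000.0],
        "region": ["North", "North", "South", "South", "South"],
    }
)

# Summary statistics (count, mean, std, quartiles) for the numeric columns.
summary = df[["area_m2", "price_usd"]].describe()

# Pearson correlation between area and price (assumed measure, see lead-in).
corr = df["area_m2"].corr(df["price_usd"])

# Mean price per region, one way to compare a categorical variable against price.
mean_price_by_region = df.groupby("region")["price_usd"].mean()
```

In this toy data price is an exact multiple of area, so the correlation is 1; real listings would give a weaker, noisier relationship.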
The analyses conclude with a notebook that builds a predictive model of apartment prices in Buenos Aires, Argentina. This is done by:
- creating a linear regression model using the scikit-learn library
- building a data pipeline for imputing missing values and encoding categorical features
- improving the model performance by reducing overfitting
- creating a dynamic dashboard for interacting with the completed model
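
The modeling steps above might be sketched roughly as follows with scikit-learn. This is a minimal sketch under stated assumptions, not the notebook's actual code: the feature names (`surface_covered_in_m2`, `neighborhood`) and the data are hypothetical, mean imputation and one-hot encoding are assumed choices, and `Ridge` (a regularized linear regression) stands in as one common way to reduce overfitting, since the description does not name the technique used.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Toy training data: feature names and values are hypothetical.
X = pd.DataFrame(
    {
        "surface_covered_in_m2": [40.0, 55.0, np.nan, 80.0, 95.0, 120.0],
        "neighborhood": [
            "Palermo", "Recoleta", "Palermo",
            "Belgrano", "Recoleta", "Belgrano",
        ],
    }
)
y = pd.Series([90_000.0, 120_000.0, 110_000.0, 160_000.0, 190_000.0, 230_000.0])

# Impute missing numeric values with the column mean and
# one-hot encode the categorical feature (both assumed choices).
preprocess = ColumnTransformer(
    [
        ("num", SimpleImputer(strategy="mean"), ["surface_covered_in_m2"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["neighborhood"]),
    ]
)

# Ridge adds an L2 penalty to linear regression; alpha controls its strength.
model = make_pipeline(preprocess, Ridge(alpha=1.0))
model.fit(X, y)

# Predict the price of a new, hypothetical apartment.
new_apartment = pd.DataFrame(
    {"surface_covered_in_m2": [70.0], "neighborhood": ["Palermo"]}
)
predicted_price = model.predict(new_apartment)[0]
```

Keeping imputation and encoding inside the pipeline means the same transformations are applied at training and prediction time, which is what lets a dashboard pass raw user inputs straight to `model.predict`.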