This repository contains an R project that analyzes the KC House Sales dataset, with the aim of predicting house sale prices in King County, Washington State, USA, using Multiple Linear Regression (MLR). The data, sourced from Kaggle datasets under the name "KC_Housesales_Data", covers houses sold between May 2014 and May 2015.
The dataset can be found at: KC House Sales Data on Kaggle
The project involves the following key steps:
- Data preprocessing and exploration to understand the dataset characteristics.
- Exploratory Data Analysis (EDA) to identify patterns, outliers, and relationships between variables.
- Data visualization to support EDA findings.
- Building a Multiple Linear Regression model to predict house prices.
- Evaluating model performance and comparing it with a one-layer feedforward neural network as a reference.
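The modeling and evaluation steps above can be sketched in base R as follows. This is a minimal illustration, not the code from the project: the data frame is synthetic stand-in data, and the column names (`sqft_living`, `bedrooms`, `grade`), the 80/20 split, and the choice of RMSE and R-squared as metrics are assumptions made for the example.

```r
# Sketch only: synthetic stand-in data, NOT the Kaggle file.
# Column names and coefficients below are illustrative assumptions.
set.seed(42)
n <- 500
houses <- data.frame(
  sqft_living = runif(n, 500, 4000),
  bedrooms    = sample(1:6, n, replace = TRUE),
  grade       = sample(5:12, n, replace = TRUE)
)
houses$price <- 50000 + 200 * houses$sqft_living +
  10000 * houses$bedrooms + 30000 * houses$grade +
  rnorm(n, sd = 50000)

# Quick exploratory check: pairwise correlations between all variables
cor_matrix <- round(cor(houses), 2)

# 80/20 train/test split (assumed ratio)
train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- houses[train_idx, ]
test  <- houses[-train_idx, ]

# Fit a multiple linear regression with price as the response
mlr_fit <- lm(price ~ sqft_living + bedrooms + grade, data = train)

# Evaluate on the held-out rows
pred      <- predict(mlr_fit, newdata = test)
rmse      <- sqrt(mean((test$price - pred)^2))
r_squared <- 1 - sum((test$price - pred)^2) /
  sum((test$price - mean(test$price))^2)

summary(mlr_fit)
cat("Test RMSE:", rmse, " Test R-squared:", r_squared, "\n")
```

With the real dataset, the synthetic `houses` data frame would be replaced by the downloaded CSV, and the predictors by whichever variables the exploratory analysis suggests.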
The analysis uses the following R libraries:
- tidyverse for data manipulation and visualization.
- corrplot for visualizing correlations between variables.
- lubridate for date-time manipulation.
- caTools, GGally, caret, and leaps for various stages of model building and evaluation.
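Whether these libraries are available can be checked before running the analysis. The snippet below is a hedged sketch: the variable names are illustrative, and the install call is left commented out so that nothing is downloaded without an explicit decision.

```r
# Packages named in this README
required_pkgs <- c("tidyverse", "corrplot", "lubridate",
                   "caTools", "GGally", "caret", "leaps")

# requireNamespace() returns FALSE (rather than erroring) when a package is absent
is_installed <- vapply(required_pkgs, requireNamespace,
                       logical(1), quietly = TRUE)
missing_pkgs <- required_pkgs[!is_installed]

if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All required packages are installed.")
}
```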
To replicate or extend the analysis:
- Download the dataset from Kaggle.
- Install the required R libraries mentioned above.
- Run the R Markdown file myproject.Rmd for step-by-step execution of the analysis.
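As an alternative to running the chunks interactively, the R Markdown file can be rendered from the R console. This sketch assumes the rmarkdown package is installed (it is not among the libraries listed in this README) and that myproject.Rmd is in the current working directory.

```r
rmd_file <- "myproject.Rmd"

# Render only if both the file and the rmarkdown package are present
if (file.exists(rmd_file) && requireNamespace("rmarkdown", quietly = TRUE)) {
  rmarkdown::render(rmd_file)
} else {
  message("Could not render: check that ", rmd_file,
          " is in the working directory and that rmarkdown is installed.")
}
```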
Data provided by Kaggle datasets. Analysis and model building by Teoman Selcuk as part of the MTH 404 R Project.
*Note: For a detailed understanding and further insights, users are encouraged to go through the R Markdown file myproject.Rmd.*