AirBnB_Seattle_DataAnalysis

This project analyses the data provided by the AirBnB website for the city of Seattle, USA. It can also be used to analyse AirBnB data for other cities.

Motivation

We use the CRISP-DM process (Cross Industry Standard Process for Data Mining) to answer a set of questions about the AirBnB data.

Libraries

The notebook file poses several questions and answers them step by step using Matplotlib and seaborn for visualizations, Pandas DataFrames, and the sklearn (scikit-learn) library.

Models

Two models from the sklearn library are used to predict the price of the properties:

Ridge Regression
Random Forest Regressor
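
As a rough illustration of how these two models can be fitted and compared, here is a minimal sketch on synthetic data. It is not the notebook's code: the column names (`bedrooms`, `room_type`, `price`), the hyperparameters and the generated data are all assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the listings data (hypothetical columns and values).
rng = np.random.default_rng(0)
n = 400
listings = pd.DataFrame({
    "bedrooms": rng.integers(1, 5, size=n),
    "room_type": rng.choice(
        ["Entire home/apt", "Private room", "Shared room"], size=n
    ),
})
room_premium = {"Entire home/apt": 60, "Private room": 20, "Shared room": 0}
listings["price"] = (
    40 * listings["bedrooms"]
    + listings["room_type"].map(room_premium)
    + rng.normal(0, 15, size=n)
)

# Turn the categorical column into dummy variables, then split the data.
X = pd.get_dummies(listings.drop(columns="price"), columns=["room_type"], dtype=float)
y = listings["price"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Fit both regressors and record R-squared on the training and test sets.
models = {
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=42),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = {
        "train_r2": r2_score(y_train, model.predict(X_train)),
        "test_r2": r2_score(y_test, model.predict(X_test)),
    }
    print(name, scores[name])
```

Comparing `train_r2` with `test_r2` for each model is a simple way to spot the kind of gap between training and test scores described in the Results section.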

Results

The models achieve an R-squared score of almost 60% on the test data and 90% on the training data. These scores could be improved with a larger dataset, because the number of columns used for prediction is high once the categorical variables are converted to dummy variables.
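
To see why dummy variables inflate the column count, consider this small sketch. The toy frame and its column names (`street`, `room_type`, `bedrooms`) are made up for illustration and are not taken from the project's data files.

```python
import pandas as pd

# Toy data: two categorical columns and one numeric column (hypothetical).
toy = pd.DataFrame({
    "street": ["Pine St", "Pike St", "1st Ave", "Pine St", "Broadway", "2nd Ave"],
    "room_type": [
        "Entire home/apt", "Private room", "Private room",
        "Shared room", "Entire home/apt", "Private room",
    ],
    "bedrooms": [1, 2, 1, 1, 3, 2],
})

# Each distinct category value becomes its own 0/1 column.
encoded = pd.get_dummies(toy, columns=["street", "room_type"], dtype=int)
print(toy.shape, "->", encoded.shape)
```

Here three columns become nine, because the five distinct streets and three room types each get a column of their own. With high-cardinality columns in a real dataset the same effect can produce far more columns than the number of rows comfortably supports.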

The notebook file Data_Analysis_airbnb contains all the Python code for the analysis. The three data files (listings, calendar and reviews) are also provided in the repository.

Business Questions

In the code we have tried to answer the following questions:

What features most influence the price of the properties?
How many properties are available during the year? How do prices change during the calendar period?
Which streets in Seattle host more guests than other streets? What is the average price on these popular streets?
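
The calendar and street questions above can be approached with Pandas group-by aggregations. The sketch below uses tiny hand-made frames; the column names (`listing_id`, `date`, `available`, `price`, `street`, `number_of_reviews`), the `"t"`/`"f"` availability flags and the use of review counts as a stand-in for guest numbers are all assumptions for the example, not a description of the notebook.

```python
import pandas as pd

# Hypothetical calendar rows: one row per listing per date.
calendar = pd.DataFrame({
    "listing_id": [1, 1, 2, 2, 3, 3],
    "date": ["2016-01-04", "2016-02-04", "2016-01-04",
             "2016-02-04", "2016-01-04", "2016-02-04"],
    "available": ["t", "f", "t", "t", "f", "t"],
    "price": ["$85.00", None, "$120.00", "$130.00", None, "$1,000.00"],
})
calendar["date"] = pd.to_datetime(calendar["date"])
# Strip the currency symbol and thousands separator before converting.
calendar["price"] = (
    calendar["price"].str.replace(r"[$,]", "", regex=True).astype(float)
)

# Availability and mean price per month, counting only available rows.
open_days = calendar[calendar["available"] == "t"]
monthly = open_days.groupby(open_days["date"].dt.to_period("M")).agg(
    available_listings=("listing_id", "nunique"),
    mean_price=("price", "mean"),
)
print(monthly)

# Hypothetical listings rows; review counts act as a proxy for guests hosted.
listings = pd.DataFrame({
    "street": ["Pine St", "Pine St", "Pike St", "Broadway"],
    "number_of_reviews": [40, 25, 10, 5],
    "price": [100.0, 140.0, 90.0, 75.0],
})
by_street = (
    listings.groupby("street")
    .agg(total_reviews=("number_of_reviews", "sum"), mean_price=("price", "mean"))
    .sort_values("total_reviews", ascending=False)
)
print(by_street)
```

Sorting by the proxy column puts the busiest streets first, with their average price alongside.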

Conclusion

Many other questions can be answered using this data, and the prediction can also be improved. The listings data is used to predict the price of the properties, but the scores on the test and training data do not match. We think this is because of the large number of columns (~1700): the number of rows (the size of the data) would need to be much bigger to get better predictions.

Acknowledgement

Many thanks to Udacity for giving me the opportunity to work on this data and for all the support.

This project is open to the public, and I appreciate any comments or contributions that can help to get better results from analysing the data.
