UPenn CIS520 Machine Learning class project: applying several machine learning approaches to predict hotel booking cancellations.
Authors: Yifei Li, Zhijian Yang
- PDF: Project Report
- Notebooks: Dataset Cleaning and Visualization
- Notebooks: Basic Models Experiment
- Notebooks: Advanced Models Exploration
- Dataset: Raw [1]
- Dataset: Processed
The LaTeX source of the report is here, and the related Kaggle page is here.
Binary classification:
- Predict: IsCancelled or not, i.e. y ∈ {0, 1} (two classes)
- Observations: processed feature set X with dim=194
- Deep Factorization-Machine (half-done)
- Soft-Voting Ensemble Estimator
- Neural Network (vanilla and tuned)
- Random Forest (vanilla and tuned)
- Decision Tree
- XGBoost
- AdaBoost
- Extra Trees
- SVM (vanilla and tuned)
- Logistic Regression (baseline)
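As a rough illustration of the soft-voting idea in the list above, the sketch below averages the cancellation probabilities of several models and thresholds the result. It is not taken from the project notebooks: the names `soft_vote_proba`, `soft_vote_predict`, and `toy_models` are hypothetical, and the constant-probability "models" are stand-ins for trained classifiers.

```python
from typing import Callable, List, Optional, Sequence

# Assumption: a "model" is any function mapping a feature vector
# to an estimate of P(IsCancelled = 1). Real classifiers would be wrapped.
ProbaModel = Callable[[Sequence[float]], float]


def soft_vote_proba(
    models: List[ProbaModel],
    x: Sequence[float],
    weights: Optional[List[float]] = None,
) -> float:
    """Weighted average of the models' positive-class probabilities."""
    if weights is None:
        weights = [1.0] * len(models)  # equal weights by default
    if len(weights) != len(models):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * m(x) for w, m in zip(weights, models)) / total


def soft_vote_predict(
    models: List[ProbaModel],
    x: Sequence[float],
    weights: Optional[List[float]] = None,
    threshold: float = 0.5,
) -> int:
    """Return 1 (cancelled) if the averaged probability reaches the threshold."""
    return int(soft_vote_proba(models, x, weights) >= threshold)


# Hypothetical stand-ins: each ignores x and returns a fixed probability.
toy_models: List[ProbaModel] = [
    lambda x: 0.90,  # one confident "cancelled" vote
    lambda x: 0.40,  # two mild "not cancelled" votes
    lambda x: 0.45,
]

x_example = [0.0] * 194  # placeholder vector matching dim=194
p = soft_vote_proba(toy_models, x_example)
label = soft_vote_predict(toy_models, x_example)
```

In this toy case two of the three models fall below 0.5, so a hard majority vote would predict "not cancelled", while the averaged probability is above 0.5 and the soft vote predicts "cancelled". scikit-learn's `VotingClassifier` with `voting="soft"` follows the same principle; whether the notebooks use it is not stated here.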
- Source of the background picture: Four Seasons Hotel in Guangzhou