Case Study: SFO Air Traffic Passenger Statistics
- Jacques Sham (@jacquessham)
- Charles Siu (@chunheisiu)
This group project visualizes passenger statistics for San Francisco International Airport (SFO), using the dataset published by the SF Airport Commission through DataSF. It uses R for scripting and ggplot for visualization. It was part of the coursework for the BSDS 100 Intro to Data Science with R class at the University of San Francisco.
The dataset we sourced from DataSF covers destinations, origins, airlines, terminals, and passenger counts from July 2005 through December 2017. It contains 17,959 rows and 12 columns. The dataset is available here and the data dictionary is available here.
For this project, we cleaned the data to fix incorrect and inconsistent entries. We then created several visualizations using ggplot that aim to provide insight into the following aspects of SFO:
- Average monthly passenger traffic between 2006 and 2017
- Passenger traffic by destination/origin region
- Overview of passenger traffic by domestic airline
- Passenger traffic on low-cost carriers
- Passenger traffic in airport terminals
- Passenger traffic on one selected domestic carrier
- Bar Chart: Monthly Average Passenger Traffic between 2006 and 2017
- Stacked Bar Chart: Annual Passenger Traffic on International Low Cost Carriers
- Tree Map: Domestic Passenger Traffic of Airline and Terminal
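As a rough illustration of the clean, aggregate, and plot workflow behind the monthly average bar chart, here is a minimal sketch in R. It is not the project's actual code: the toy rows, the column names (activity_period, operating_airline, passenger_count), and the specific cleaning rule are assumptions made for the example, so check them against the data dictionary before reusing anything.

```r
library(ggplot2)

# Hypothetical toy rows standing in for the DataSF export. The column names
# and the YYYYMM encoding of activity_period are assumptions for this sketch.
traffic <- data.frame(
  activity_period   = c(200601, 200601, 200607, 200701, 200707, 200707),
  operating_airline = c("United Airlines", "United Airlines - Pre 07/01/2013",
                        "United Airlines", "United Airlines",
                        "United Airlines", "Delta Air Lines"),
  passenger_count   = c(100, 50, 300, 200, 400, 100),
  stringsAsFactors  = FALSE
)

# Illustrative cleaning step: collapse an inconsistent airline label so the
# same carrier is not counted under two names.
traffic$operating_airline <- sub(" - Pre 07/01/2013$", "",
                                 traffic$operating_airline)

# Split the YYYYMM period into year and month.
traffic$year  <- traffic$activity_period %/% 100
traffic$month <- traffic$activity_period %% 100

# Sum passengers per (year, month), then average those totals across years.
monthly_totals <- aggregate(passenger_count ~ year + month,
                            data = traffic, FUN = sum)
monthly_avg <- aggregate(passenger_count ~ month,
                         data = monthly_totals, FUN = mean)

# Ordered factor so the bars appear in calendar order.
monthly_avg$month_name <- factor(month.abb[monthly_avg$month],
                                 levels = month.abb)

p <- ggplot(monthly_avg, aes(x = month_name, y = passenger_count)) +
  geom_col() +
  labs(x = "Month", y = "Average monthly passengers",
       title = "Monthly Average Passenger Traffic (toy data)")
print(p)
```

Averaging the per-month totals, rather than the raw rows, keeps the result independent of how many airline and terminal rows each month happens to have.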
The detailed version of the report can be viewed in PDF format. Additionally, the rmd source code is available here and the presentation slides are available here.