This project was carried out as part of the Data Mining class taught by Prof. Mahmoud Sakr at the ULB. It aims to help the Brussels public transport system assess the quality of its network using data mining techniques. Further explanation can be found in the `hack_my_ride.pdf` file.
To summarize, three inputs were provided:
- General Transit Feed Specification (GTFS) files providing the theoretical schedule.
- JSON observations of vehicle positions, recorded every 30 seconds across the network.
- Shapefiles regarding the geography of the network (stops and lines).
From these inputs, the project addressed two main tasks: clustering times of the day to choose which metrics to assess, and evaluating which stops were major delayers and catchers across the whole network:
- Change point detection - PELT modeling for intraday clustering
- Frequent pattern mining - FP-growth on the slopes of the median delay along the sequence of stops of a given line, to detect problematic segments in the network
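
To illustrate the change point detection task, here is a minimal pure-Python sketch of PELT with a squared-error (change in mean) cost. It is not the code used in this repo: the function name `pelt`, the cost function, and the penalty value are assumptions made for the example, and the input is a made-up series standing in for a delay metric aggregated over consecutive time bins of a day.

```python
from itertools import accumulate


def pelt(signal, penalty):
    """Return the sorted indices where a new segment starts (hypothetical helper)."""
    n = len(signal)
    # Prefix sums give the cost of any segment in constant time.
    s1 = [0.0] + list(accumulate(signal))
    s2 = [0.0] + list(accumulate(x * x for x in signal))

    def cost(a, b):
        # Sum of squared deviations from the mean of signal[a:b].
        total = s1[b] - s1[a]
        return (s2[b] - s2[a]) - total * total / (b - a)

    best = [0.0] * (n + 1)  # best[t]: minimal penalized cost of signal[:t]
    best[0] = -penalty      # so that the first segment carries no net penalty
    last = [0] * (n + 1)    # last[t]: start of the final segment in that optimum
    candidates = [0]
    for t in range(1, n + 1):
        values = [(best[s] + cost(s, t) + penalty, s) for s in candidates]
        best[t], last[t] = min(values)
        # Pruning step: drop every start s that can no longer be optimal.
        candidates = [s for v, s in values if v - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts.
    change_points = []
    t = n
    while last[t] > 0:
        t = last[t]
        change_points.append(t)
    return sorted(change_points)


# Made-up example: three regimes of ten time bins each.
toy_delays = [0.0] * 10 + [5.0] * 10 + [1.0] * 10
print(pelt(toy_delays, penalty=1.0))
```

A larger penalty yields fewer, longer intraday clusters; a smaller one splits the day more finely. For real use, a library such as `ruptures` offers a tested PELT implementation.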
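
For the frequent pattern mining task, the sketch below shows one possible way to turn the median delay along a line into discrete items that a frequent pattern miner such as FP-growth could consume. It covers only this preprocessing step, not FP-growth itself, and it is an assumption about the approach rather than the repo's actual code: the function `slope_items`, the `up`/`flat`/`down` labels, and the 30-second tolerance are all hypothetical.

```python
from statistics import median


def slope_items(stop_sequence, delays_by_stop, tolerance=30.0):
    """Label each pair of consecutive stops by the change in median delay.

    stop_sequence: ordered stop identifiers of one line.
    delays_by_stop: maps each stop to its observed delays (seconds).
    tolerance: changes smaller than this (in seconds) count as "flat".
    """
    medians = [median(delays_by_stop[stop]) for stop in stop_sequence]
    items = []
    for i in range(len(stop_sequence) - 1):
        change = medians[i + 1] - medians[i]
        if change > tolerance:
            label = "up"      # delay grows on this segment
        elif change < -tolerance:
            label = "down"    # delay is caught up on this segment
        else:
            label = "flat"
        items.append((stop_sequence[i], stop_sequence[i + 1], label))
    return items


# Made-up observations for a four-stop line.
stops = ["A", "B", "C", "D"]
delays = {
    "A": [0, 10, 20],
    "B": [60, 70, 80],
    "C": [65, 75, 85],
    "D": [10, 20, 30],
}
print(slope_items(stops, delays))
```

Each trip or time window would produce one such list of items (one transaction); segments that appear with the `up` label in many transactions are candidates for problematic segments.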
Finally, a dashboard was developed to visualize the results.
This repo contains all the functions needed to reproduce the results. However, only the dashboard is covered in this README. For further explanations, feel free to reach me at [email protected].
- Feel free to contact me to get the data files. Clone the repo and create a `data` folder containing the data I sent.
- Install the requirements using `pip install -r requirements.txt`.
- Launch the dashboard by going to the root of the project in a terminal session and running `python app.py`.
- Enjoy on your localhost! 😊
Distributed under the MIT License. See `LICENSE.txt` for more information.
- Hakim Amri: [email protected]
- Rania Baguia: [email protected]
- Abdelmoumen Oumahi: [email protected]
- Mehdi Jdaoudi: [email protected]