CrowdAware Transit is an UPPER Hackathon project, presented in this .pdf. The idea is to show the expected fullness of a bus next to its arrival time. This would let people who need a seat, large families, and other passengers plan their trips more comfortably.
Here are examples of the current EMT app UI and an improved one:
vs
We want to estimate how many passengers will exit and enter the bus, at a specific time, at each stop between the bus and the passenger's stop (the 2nd and 3rd bus stops):
Here is how it can be done (the data only records entries, so this approach may be overcomplicated if exit data is collected as well):
NOTE: This step is applied to historical data to generate statistical information about exits.
This is done with a simple algorithm that finds round trips (trips with a return) and treats the station (bus stop) where the passenger boarded for the return trip as the station where they exited on the outbound trip. In the notebook I created at the hackathon, it works like this:
- For each bus line, create one unit vector that represents the average direction of the line (the unit vector multiplied by -1 is the same line in the opposite direction). It is computed by summing the vectors between consecutive stops of the line and dividing the sum by its length.
- Taking the unit vectors of trips n and n+1 of passenger x, calculate the angle between them.
- If the angle is larger than pi/3, it is most likely a return trip.
- If the time between the two trips is less than 1 h and the angle is smaller than pi/3, it is most likely just a line change that is still part of a trip in one direction (not a return).
- If the passenger returned on a different line (for example an equivalent line), search for the closest stations: for the station where the passenger boarded on the outbound trip, find the closest station on the return line, and vice versa.
- With these pairs we can predict the exit time of each trip.
- Boom, we have complete (or almost complete) historical data about entry and exit times and locations.
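The steps above can be sketched roughly as follows. This is not the notebook code, only a minimal illustration using the standard library. The function names, the `"unknown"` label for pairs that match neither rule, and the use of planar (x, y) stop coordinates instead of raw latitude/longitude are all assumptions made for the example.

```python
import math

ANGLE_THRESHOLD = math.pi / 3   # threshold from the rules above
TRANSFER_GAP_HOURS = 1.0        # "less than 1 h" rule from the list above


def line_direction(stops):
    """Unit vector for the average direction of a line.

    `stops` is the ordered list of (x, y) stop coordinates (assumed planar).
    The stop-to-stop vectors are summed and the sum is divided by its length.
    """
    dx = sum(b[0] - a[0] for a, b in zip(stops, stops[1:]))
    dy = sum(b[1] - a[1] for a, b in zip(stops, stops[1:]))
    length = math.hypot(dx, dy)
    if length == 0:
        raise ValueError("line has no net direction (for example a closed loop)")
    return (dx / length, dy / length)


def angle_between(u, v):
    """Angle in radians, in [0, pi], between two unit vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding error


def classify_pair(dir_a, dir_b, gap_hours):
    """Label two consecutive trips of one passenger."""
    if angle_between(dir_a, dir_b) > ANGLE_THRESHOLD:
        return "return"
    if gap_hours < TRANSFER_GAP_HOURS:
        return "transfer"
    return "unknown"  # assumption: the rules above do not cover this case


def nearest_stop(point, stops):
    """Index of the stop closest to `point`, for returns on a different line."""
    return min(range(len(stops)), key=lambda i: math.dist(point, stops[i]))


# Hypothetical line geometry, only for illustration.
outbound = [(0, 0), (1, 0), (2, 1), (3, 1)]
inbound = outbound[::-1]
d_out = line_direction(outbound)
d_in = line_direction(inbound)

print(classify_pair(d_out, d_in, gap_hours=8))     # return
print(classify_pair(d_out, d_out, gap_hours=0.5))  # transfer
print(nearest_stop((2.1, 0.9), outbound))          # 2
```

For a trip labelled as a return, the inferred exit stop of the outbound trip is the boarding stop of the return trip, or its `nearest_stop` match on the outbound line when the two lines differ.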
This part is simple: for each bus stop and each bus line, we forecast how many passengers will enter or exit in the future, based on historical data (plus real-time data, if we have it).
We don't reinvent the wheel by building such an algorithm from scratch; instead we use a foundation model for time-series forecasting. I chose the Lag-Llama model because it can be fine-tuned.
So the workflow would be:
- Prepare the data and build features for the Lag-Llama model (for example, the number of passengers that entered in each 20-minute period over one month). NOTE: of course, we do the computation for exits and entries separately.
- Run the model and get the results (I had no time to finish this part in the notebook, but it is easy, so you can try it yourself).
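As an illustration of the data preparation step, here is a minimal sketch that turns raw event timestamps into a regular series of 20-minute counts. It is an assumption about how the features could be built, not the notebook code, and the names `floor_to_bin` and `count_series` are made up for this example. The model call itself is left out on purpose.

```python
from collections import Counter
from datetime import datetime, timedelta

BIN = timedelta(minutes=20)  # bin width taken from the example above


def floor_to_bin(ts, bin_size=BIN):
    """Round a timestamp down to the start of its bin (bins start at midnight)."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    bins_elapsed = (ts - midnight) // bin_size
    return midnight + bins_elapsed * bin_size


def count_series(timestamps, start, end, bin_size=BIN):
    """Counts per bin over [start, end), with empty bins filled with 0."""
    counts = Counter(floor_to_bin(ts, bin_size) for ts in timestamps)
    series = []
    t = floor_to_bin(start, bin_size)
    while t < end:
        series.append((t, counts.get(t, 0)))
        t += bin_size
    return series


# Hypothetical boardings at one stop of one line.
boardings = [
    datetime(2024, 5, 1, 8, 3),
    datetime(2024, 5, 1, 8, 17),
    datetime(2024, 5, 1, 8, 45),
]
series = count_series(boardings, datetime(2024, 5, 1, 8, 0), datetime(2024, 5, 1, 9, 0))
print([count for _, count in series])  # [2, 0, 1]
```

One such series would be built per line, per stop, and per event type (entries or inferred exits), then converted to whatever input format the forecasting model expects.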
Well, we take real-time data on how many passengers entered at each stop and subtract the previously forecast number of passengers exiting at each stop (this forecast can also be adjusted, for example based on anomalies in the real-time entry data). The result is how many passengers are on the bus right now. To estimate the load at future stops (between the bus and the passenger), we forecast how many passengers will enter at each of those stops and subtract the forecast exits in the same way, for those stops and times.
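The bookkeeping can be sketched like this. It is a minimal example under a few assumptions that are not in the original notebook: passengers exit before new ones board at each stop, the load is clamped at zero because forecast exits can exceed the estimated load, and the name `propagate_load` and all numbers are hypothetical.

```python
def propagate_load(start_load, enters, exits):
    """Estimated number of passengers on board after each stop.

    `enters` and `exits` are per-stop counts in travel order. Exits are
    applied first, and the load never drops below zero (assumption).
    """
    load = start_load
    loads = []
    for entered, exited in zip(enters, exits):
        load = max(0, load - exited) + entered
        loads.append(load)
    return loads


# Stops already passed: real-time entries, forecast exits.
observed_enters = [12, 5, 8]
forecast_exits_passed = [0, 3.5, 6.0]
passed = propagate_load(0, observed_enters, forecast_exits_passed)
load_now = passed[-1]
print(load_now)  # 15.5

# Stops still ahead of the bus: both entries and exits are forecasts.
forecast_enters_ahead = [4.0, 2.5]
forecast_exits_ahead = [5.0, 7.0]
ahead = propagate_load(load_now, forecast_enters_ahead, forecast_exits_ahead)
print(ahead)  # [14.5, 10.0]
```

The last value of `ahead` is the expected load when the bus reaches the passenger's stop, which is the number that would be shown next to the arrival time.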
That's it. Later, other data such as weather or big events can be added to the process for better accuracy, but for an MVP this is enough.
Feel free to use my code and idea. I would appreciate it if you credited me or offered me a job :)