General

This system reads real-time data from various sources (for example, departures and arrivals from the Pubtrans database) and converts it to GTFS Realtime messages using a pipeline built on Apache Pulsar. The final step publishes the messages to different destinations (such as MQTT brokers and blob storage).
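The source, processor and publisher stages described above can be pictured with a minimal in-memory sketch. It does not use Pulsar, and it is not the project's actual code: the `StopEstimate` class, the row keys, the function names and the list-based sinks are all hypothetical, and the output is a plain dict that only mimics the shape of a GTFS Realtime `TripUpdate`.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass
class StopEstimate:
    """Hypothetical internal message; not the real Protobuf schema."""
    route_id: str
    stop_id: str
    arrival_epoch_s: int


def source(rows: Iterable[dict]) -> Iterator[StopEstimate]:
    """Source stage: turn raw rows (e.g. from a database query) into internal messages."""
    for row in rows:
        yield StopEstimate(row["route"], row["stop"], row["arrival"])


def processor(messages: Iterable[StopEstimate]) -> Iterator[dict]:
    """Processor stage: convert internal messages to a dict shaped like a GTFS-RT TripUpdate."""
    for m in messages:
        yield {
            "trip_update": {
                "trip": {"route_id": m.route_id},
                "stop_time_update": [
                    {"stop_id": m.stop_id, "arrival": {"time": m.arrival_epoch_s}}
                ],
            }
        }


def publish(messages: Iterable[dict], sinks: List[Callable[[dict], None]]) -> int:
    """Publisher stage: hand every message to every sink and return the message count."""
    count = 0
    for message in messages:
        for sink in sinks:
            sink(message)
        count += 1
    return count


# Usage: two lists stand in for an MQTT broker and blob storage (assumption).
rows = [{"route": "1001", "stop": "1010101", "arrival": 1700000000}]
mqtt_like: List[dict] = []
blob_like: List[dict] = []
published = publish(processor(source(rows)), [mqtt_like.append, blob_like.append])
```

In the real system each stage is a separate service and the stages exchange messages through Pulsar topics rather than Python generators.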

The general usage pattern is to build Docker images and then run them with docker-compose. Services are split into separate GitHub repositories, each containing the source code and a Dockerfile.

The /bin folder contains scripts to launch Docker images for Pulsar, Redis and the Mosquitto MQTT broker, which some of the services require.

Requirements

Running the system requires:

  • Docker
  • Redis
  • Pulsar
  • Connection to a Pubtrans SQL Server database
  • Connection to an MQTT broker

System Architecture & Components

Transitdata

[Diagram: Transitdata system architecture]

Transitdata input

  • mqtt.hsl.fi: vehicle positions in HFP format (all vehicles)
  • hsl-mqtt-lab-a.cinfra.fi: stop time estimates (metros)
  • Pubtrans ROI: stop time estimates (buses, trams, trains)
  • Pubtrans DOI: static data (schedule, stops, routes)
  • OMM DB: service alerts (cancellations, disruptions)
  • sm5.rt.hsl.fi: EKE messages from SM5 trains
  • apc.rt.hsl.fi: passenger count data

Transitdata output

  • MQTT broker cmqttdev.cinfra.fi: vehicle positions in GTFS-RT format (HSL displays at stops)
  • Azure Blob storage: used for publishing vehicle positions, trip updates and service alerts in GTFS-RT format (for Google Maps and 3rd-party applications) and for archiving messages (HFP and EKE) in CSV files
  • MQTT broker pred.rt.hsl.fi -> Reittiopas.fi: stop estimates in GTFS-RT format
    • Note: this MQTT broker is intended to be used by HSL systems only. Its functionality can change without notice and there is no guarantee that it will work for third-party applications.
  • Graylog server: logs from all the microservices

Transitlog

[Diagram: Transitlog system architecture]

Components are stored in their own GitHub repositories:

Common dependencies

  • transitdata-common - Contains Protobuf definitions, shared constants and generic components, such as an abstract class for connecting to Pulsar

Transitdata components

Sources
Processors
Publishers
Other

These components are not connected to the Pulsar cluster, but they are deployed to the same environment as Transitdata and produce data that Transitdata uses:

  • suomenlinna-ferry-hfp - Creates HFP messages for Suomenlinna ferries from AIS data
  • gtfsrt2hfp - Creates HFP messages from GTFS-RT vehicle positions, currently used for U-bus 280

Monitoring and testing

Transitlog HFP components

Note: this list does not contain all Transitlog services. To find the rest, search for "transitlog" in the hsldevcom GitHub.

Versioning

All of the main components in this project are versioned with the scheme x.y.z, where x is always 1, y is incremented when the output of the component is not backwards-compatible, and z is incremented when the output remains compatible. Most of the internal message protobufs include a field for the schema version, which can be used to make sure that incompatible messages are not processed. Because the services are deployed together, it is usually possible to make a change in all affected services at the same time.
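As an illustration of the scheme, the following sketch treats two versions as output-compatible when their y parts match. The helper names `parse_version` and `output_compatible` are hypothetical and do not come from any Transitdata repository; rejecting versions whose x is not 1 is an assumption based on the rule stated above.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int  # x: always 1 in this scheme
    minor: int  # y: bumped on backwards-incompatible output changes
    patch: int  # z: bumped on compatible changes


def parse_version(text: str) -> Version:
    """Parse an 'x.y.z' string; raises ValueError on malformed input."""
    parts = text.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected x.y.z, got {text!r}")
    major, minor, patch = (int(part) for part in parts)
    if major != 1:
        # Assumption: the scheme says x is always 1, so anything else is an error.
        raise ValueError(f"x must be 1, got {major}")
    return Version(major, minor, patch)


def output_compatible(produced: str, expected: str) -> bool:
    """Return True when both versions share the same y, i.e. compatible output."""
    return parse_version(produced).minor == parse_version(expected).minor
```

For example, `output_compatible("1.4.2", "1.4.9")` is true because only z differs, while `output_compatible("1.5.0", "1.4.9")` is false because y changed.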

The GTFS-RT output should conform to the GTFS Realtime standard, version 2.0.

Implementation notes

Pulsar seems to add approximately 5 ms of latency to each message, which is consistent with what the Pulsar project promises. The latency is not a problem in itself and is well within acceptable bounds. However, it means that a single-threaded consumer-producer loop that handles one message at a time can process only about 200 messages per second (1 s / 5 ms).
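The figure follows from dividing one second by the per-message latency. The sketch below spells out that arithmetic; the `in_flight` parameter is an addition for illustration only and assumes that the latencies of concurrently handled messages overlap perfectly, which a real deployment may not achieve.

```python
def max_throughput(latency_s: float, in_flight: int = 1) -> float:
    """Upper bound on messages per second for a loop that waits latency_s per message.

    in_flight is a hypothetical knob: the number of messages whose latency
    overlaps (1 means a strictly sequential consumer-producer loop).
    """
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    if in_flight < 1:
        raise ValueError("in_flight must be at least 1")
    return in_flight / latency_s


# Sequential loop with 5 ms of latency per message.
sequential = max_throughput(0.005)
# Idealized case with 10 overlapping messages (assumption, not a measurement).
overlapped = max_throughput(0.005, in_flight=10)
```

With 5 ms of latency the sequential bound comes out at roughly 200 messages per second, matching the estimate above.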