realtime data in public transport

cancellations
delays
positions
notices/warnings
why?
    without it, public transport doesn't feel reliable enough
    long-term: frustration accumulates, people stop using it (if they have the choice)
    why not?! part of service: A->B, on time, reliably, comfortably

existing (legacy) APIs

not geared towards travellers' needs:
    focused on routing
    focused on "ahead-of-time" inquiries
    not suited for travel/commute companions
prevent "new" use cases
    custom routing
    dashboards
not open
    technically: proprietary APIs/formats, lacking documentation
    legally: signing contracts not always an option
    non-local actors won't even bother
often only request-based, no streaming access
    rate limits
    waste of resources
    (latency)
applies GTFS-static concepts to realtime data
    GTFS-static
        when is a bus supposed to stop where?
        .zip archive of CSV files
    GTFS-realtime
        bus on time, cancelled?
        notices/warnings, e.g. "today no step-free access"
        where is the bus?
        (congestion)
        binary, Protocol Buffers (pbf)
        mostly same naming as GTFS-static
    how to consume feed?
        common variant: "dump" mode
        differential mode (draft)
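
A minimal sketch of how a consumer might treat the two modes listed above. It does not use the real Protocol Buffers bindings: `Entity` and `Message` are hypothetical stand-ins that only mirror the `incrementality` header field and the per-entity `is_deleted` flag from GTFS-realtime, and the `delay` payloads are made up.

```python
from dataclasses import dataclass, field


# Simplified stand-ins for a GTFS-realtime feed message and its entities.
# Real feeds are Protocol Buffers messages; these classes are assumptions
# that only carry the fields needed to contrast the two modes.
@dataclass
class Entity:
    id: str
    payload: dict = field(default_factory=dict)
    is_deleted: bool = False


@dataclass
class Message:
    incrementality: str  # "FULL_DATASET" ("dump" mode) or "DIFFERENTIAL"
    entities: list


def apply_message(state: dict, msg: Message) -> dict:
    """Return the consumer's view of the feed after receiving msg."""
    if msg.incrementality == "FULL_DATASET":
        # "dump" mode: every message is a complete snapshot,
        # so whatever was known before is discarded.
        return {e.id: e.payload for e in msg.entities if not e.is_deleted}
    # differential mode: a message only carries changes,
    # which are merged into the previous state.
    new_state = dict(state)
    for e in msg.entities:
        if e.is_deleted:
            new_state.pop(e.id, None)
        else:
            new_state[e.id] = e.payload
    return new_state


state = apply_message({}, Message("FULL_DATASET", [
    Entity("run-1", {"delay": 60}),
    Entity("run-2", {"delay": 0}),
]))
state = apply_message(state, Message("DIFFERENTIAL", [
    Entity("run-1", {"delay": 120}),   # update
    Entity("run-2", is_deleted=True),  # removal
    Entity("run-3", {"delay": 30}),    # addition
]))
print(state)  # {'run-1': {'delay': 120}, 'run-3': {'delay': 30}}
```

In dump mode the consumer needs no state between fetches, at the cost of re-transferring unchanged entities. Differential mode trades that for bookkeeping on the consumer side.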

HAFAS

public transport backend & mobile app system
many customers -> many instances
    e.g. VBB & BVG, HVV, DB Navigator, Nah.SH, SaarVV, VVS
    see also transport-apis
hafas-client
    built-in support for many endpoints
many other clients
mobile "mgate.exe" API
    abbreviated field names
    built for mobile apps
"rest.exe" API
    intended for the public
public, truly open
    no authentication
    rate limit
target group:
    quick experiments, scripting, hackathons
        e.g. `curl 'https://v5.hvv.transport.rest/stops/6237/departures' | jq '.[0].when'`
    less technical/experienced users:
        beginners, students
        no-code environments?
    unified API across data sources
        eventually deprecated by another standard
wrapper APIs around archaic, hard-to-use, closed APIs
    no duplicate reverse engineering & guesswork
thin wrappers:
    berlin-gtfs-rt-server: VBB HAFAS + hafas-gtfs-rt-feed
    hamburg-gtfs-rt-server: HVV HAFAS + hafas-gtfs-rt-feed
demo using gtfs-rt-inspector
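
For scripting beyond a shell one-liner, the `curl ... | jq '.[0].when'` example above could look roughly like this in Python. This is a sketch that assumes the endpoint still responds with a JSON array of departures, each with a `when` field, as the jq filter implies. The function names are made up for illustration.

```python
import json
from urllib.request import urlopen

# URL taken from the curl example above.
DEPARTURES_URL = "https://v5.hvv.transport.rest/stops/6237/departures"


def first_departure_time(body: str):
    """Return the `when` field of the first departure, or None if there is none."""
    departures = json.loads(body)
    if not departures:
        return None
    return departures[0].get("when")


def fetch_first_departure_time(url: str = DEPARTURES_URL):
    """Fetch the departures and extract the first `when` (network request)."""
    with urlopen(url, timeout=10) as res:
        return first_departure_time(res.read().decode("utf-8"))


# Usage (performs a network request, so it is left commented out):
# print(fetch_first_departure_time())
```

Keeping the parsing separate from the HTTP call makes the extraction testable without hitting the rate-limited API.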
1. poll HAFAS to get all vehicles/"runs"
    hafas-monitor-trips
    all vehicles in bounding box, every 3min
    refresh vehicles/"runs" every 60s
2. match runs against GTFS-Static data
    match-gtfs-rt-to-gtfs
    PostgreSQL DB via gtfs-via-postgres
    fuzzy matching:
        `S+U Warschauer Str.` -> `warschauer strasse`
        `Warschauer Straße (Bln)` -> `warschauer strasse`
    -> realtime data with GTFS-Static IDs
3. serve as dump
    newer msg for same run replaces older
    encode all as one pbf "dump"
    serve via HTTP, with caching headers
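
Steps 2 and 3 can be illustrated with a small sketch. It is not how match-gtfs-rt-to-gtfs or hafas-gtfs-rt-feed are implemented (the former works on a PostgreSQL DB). The normalization rules are only chosen to reproduce the two examples above, and the field names `run_id` and `timestamp` as well as the stop ID `stop-123` are hypothetical. Encoding the result as pbf is left out.

```python
import re


def normalize_stop_name(name: str) -> str:
    """Reduce a stop name to a comparable form (illustrative rules only)."""
    s = name.lower().replace("ß", "ss")
    s = re.sub(r"\(.*?\)", " ", s).strip()   # drop suffixes like "(Bln)"
    s = re.sub(r"^(s\+u|s|u)\s+", "", s)     # drop S/U/S+U prefixes
    s = s.replace("str.", "strasse")         # expand the abbreviation
    s = re.sub(r"[^a-z0-9äöü]+", " ", s)     # collapse punctuation/whitespace
    return s.strip()


def match_stop(realtime_name: str, gtfs_stops: dict):
    """Step 2: find the GTFS-Static stop ID whose normalized name matches."""
    index = {normalize_stop_name(n): stop_id for stop_id, n in gtfs_stops.items()}
    return index.get(normalize_stop_name(realtime_name))


def latest_per_run(messages: list) -> list:
    """Step 3: keep only the newest message for each run."""
    latest = {}
    for msg in messages:
        prev = latest.get(msg["run_id"])
        if prev is None or msg["timestamp"] >= prev["timestamp"]:
            latest[msg["run_id"]] = msg
    return list(latest.values())


gtfs_stops = {"stop-123": "Warschauer Straße (Bln)"}  # made-up ID
print(match_stop("S+U Warschauer Str.", gtfs_stops))  # stop-123

dump = latest_per_run([
    {"run_id": "a", "timestamp": 1, "delay": 0},
    {"run_id": "b", "timestamp": 2, "delay": 60},
    {"run_id": "a", "timestamp": 3, "delay": 120},
])
print([m["delay"] for m in dump])  # [120, 60]
```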

next steps

more regions!
    built with configurable endpoint in mind
    server capacity?
    other APIs than HAFAS
        trias-client
        kpublic-transport
        public-transport-enabler
official feeds?
    more efficient conversion
    more visibility -> greater impact -> indirect effects
federation of data sources?
    stops/lines/routes with stable IDs
        generate IDs from properties, e.g. normalized names & locations
            pan-european-public-transport
        shared catalogs of IDs
        Linked Connections
    aggregation of differential msgs
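
One way to read "generate IDs from properties" above: hash the normalized name together with a coarsely rounded location. This is a sketch under that assumption, not the scheme used by pan-european-public-transport. The function name, the 3-decimal precision and the coordinates are made up.

```python
import hashlib


def stable_stop_id(name: str, lat: float, lon: float, precision: int = 3) -> str:
    """Derive an ID from a stop's normalized name and rounded coordinates."""
    # Normalize spelling so "Straße"/"strasse" and extra spaces don't matter.
    norm = " ".join(name.lower().replace("ß", "ss").split())
    # Rounding to 3 decimals tolerates small coordinate differences between
    # data sources; stops close to a rounding boundary can still end up
    # with different IDs.
    key = f"{norm}|{lat:.{precision}f}|{lon:.{precision}f}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()[:12]


# Two sources describing the same stop slightly differently:
a = stable_stop_id("Warschauer Straße", 52.5050, 13.4490)
b = stable_stop_id("warschauer  strasse", 52.50512, 13.44901)
print(a == b)  # True
```

Because the ID depends only on the stop's properties, independent actors can compute it without coordinating, which is what a federation of data sources would need.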

By Jannis R
