# Tube Next-Hop Policy
A GATv2 encoder + MLP decoder trained to predict optimal next-hop routing decisions on the London Underground graph. Given a current station and a destination, the model outputs a probability distribution over adjacent stations.
Greedy rollout achieves a 1.09× Dijkstra ratio (travel time relative to the optimal shortest path) with a 100.0% success rate across all 239,610 origin–destination pairs.
## Intended Use
This model is a research artifact and demonstration of learned graph routing. It predicts the next station to visit on a shortest-time path between any two London Underground stations, using only the graph topology and travel-time edge weights from TfL's GTFS timetable feed.
Potential applications include learned routing components in transit planning systems, GNN routing benchmarks, and educational demonstrations of graph-attention-based policy learning.
## Architecture
| Component | Details |
|---|---|
| Encoder | 16-layer GATv2, d=512, 8 heads |
| Decoder | 3-layer MLP with adjacency masking |
| Graph | 490 stations, 19 lines, 1752 directed edges |
| Parameters | 10,168,299 |
| Training signal | KL divergence vs. Q-soft targets from Floyd–Warshall |
| Label smoothing | 0.1 |
| Scheduled sampling | 0.5 |
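The decoder's adjacency masking restricts the output distribution to stations one hop away from the current one. A minimal sketch of the masking step (illustrative only, not the package's internals):

```python
import math

def masked_softmax(logits, neighbor_ids):
    """Softmax over decoder logits, restricted to adjacent stations.

    logits: raw decoder scores, one per station.
    neighbor_ids: indices of stations adjacent to the current one.
    """
    allowed = set(neighbor_ids)
    # Non-adjacent stations get -inf, so exp() sends them to zero probability.
    masked = [x if i in allowed else float("-inf") for i, x in enumerate(logits)]
    m = max(masked)
    exps = [math.exp(x - m) for x in masked]
    z = sum(exps)
    return [e / z for e in exps]

# Toy example: 4 stations; the current station is adjacent to 1 and 3 only.
probs = masked_softmax([2.0, 1.0, 0.5, 1.0], neighbor_ids=[1, 3])
# probs == [0.0, 0.5, 0.0, 0.5]
```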
## Evaluation Results
Evaluated on a held-out set of 23,961 OD pairs (489,981 next-hop steps), with greedy rollout to completion.
| Metric | Value |
|---|---|
| Rollout success rate | 100.0% |
| Dijkstra ratio (travel time vs. optimal) | 1.09 |
| Step accuracy (single-step top-1) | 79.1% |
| Length ratio (hops vs. optimal hops) | inf |
Note on step accuracy: The 79.1% top-1 step accuracy reflects that many nodes have multiple equally optimal next hops. The model distributes probability across these alternatives, which is correct behavior; the rollout metrics confirm near-optimal routing.
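The rollout metrics are computed by following the policy's argmax next hop from origin to destination, summing edge travel times, and dividing by the Dijkstra-optimal time. A toy sketch with a hand-written stand-in for the learned policy:

```python
import heapq

def dijkstra(adj, src):
    """Shortest travel times from src; adj maps node -> {neighbor: minutes}."""
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def greedy_rollout(policy, adj, origin, dest, max_hops=50):
    """Follow the policy's next-hop choice until dest; return total minutes."""
    total, cur = 0.0, origin
    for _ in range(max_hops):
        if cur == dest:
            return total
        nxt = policy(cur, dest)
        total += adj[cur][nxt]
        cur = nxt
    return float("inf")  # hop budget exhausted: failed rollout

# Toy graph: A-B-C line (2 min per edge) plus a slow direct A-C edge (6 min).
adj = {"A": {"B": 2.0, "C": 6.0}, "B": {"A": 2.0, "C": 2.0}, "C": {"B": 2.0, "A": 6.0}}
policy = lambda cur, dest: "B" if cur == "A" else dest  # optimal for A -> C
time = greedy_rollout(policy, adj, "A", "C")
ratio = time / dijkstra(adj, "A")["C"]
# ratio == 1.0: A->B->C takes 4.0 min, beating the 6.0-min direct edge
```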
## Training Data
The graph topology and edge travel times are extracted from Transport for London's GTFS timetable feed. Floyd–Warshall all-pairs shortest paths provide the Q-value supervision signal.
- 239,610 OD pairs → 4,891,856 next-hop training steps
- 90/10 OD-pair split (215,649 train / 23,961 val)
- Batch size: 4,096 steps
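The Q-soft supervision can be sketched as follows: Floyd–Warshall gives all-pairs shortest times `d`; for station `u` and target `t`, each neighbor `v` scores `Q(u, v) = w(u, v) + d(v, t)`, and the target distribution is a softmax over `-Q / tau`. The temperature `tau` and the exact softmax form are assumptions; the card only states that Floyd–Warshall provides the supervision signal.

```python
import math

def floyd_warshall(n, edges):
    """All-pairs shortest travel times; edges: (u, v, minutes), directed."""
    INF = float("inf")
    d = [[0.0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def q_soft_targets(u, target, neighbors, w, d, tau=1.0):
    """Soft next-hop distribution: softmax over -(edge + remaining time)."""
    q = [w[(u, v)] + d[v][target] for v in neighbors]
    m = min(q)
    exps = [math.exp(-(x - m) / tau) for x in q]
    z = sum(exps)
    return [e / z for e in exps]

# Toy: routes 0->1->3 and 0->2->3 both take 4 min -> equally optimal hops.
edges = [(0, 1, 2.0), (1, 3, 2.0), (0, 2, 2.0), (2, 3, 2.0)]
w = {(u, v): t for u, v, t in edges}
d = floyd_warshall(4, edges)
probs = q_soft_targets(0, 3, neighbors=[1, 2], w=w, d=d)
# probs == [0.5, 0.5]: both neighbors lie on a shortest path
```

This also illustrates why top-1 step accuracy understates routing quality: at station 0 either neighbor is a correct choice.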
## Training Details
- Optimizer: AdamW, lr=3e-04
- Compiled with `torch.compile(mode='default')`
- Training time: ~58 minutes
- Hardware: single GPU
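How the KL objective combines with label smoothing (ε = 0.1) can be sketched in pure Python; mixing the soft target with a uniform distribution over neighbors is an assumed form of smoothing, not the package's verified implementation.

```python
import math

def smooth(target, eps=0.1):
    """Assumed smoothing: mix the soft target with uniform over neighbors."""
    k = len(target)
    return [(1 - eps) * p + eps / k for p in target]

def kl_div(p, q):
    """KL(p || q) for two distributions over the same neighbor set."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

target = smooth([0.5, 0.5, 0.0])  # two equally optimal hops, one poor hop
policy = [0.45, 0.45, 0.10]       # model's masked output distribution
loss = kl_div(target, policy)     # minimized during training
```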
## Limitations
- The model is specific to the London Underground graph topology at the time of the GTFS snapshot. It will not generalize to other transit networks without retraining.
- Edge weights represent scheduled travel times, not real-time conditions.
- The adjacency mask is fixed at inference time — the model cannot handle station closures or service disruptions without mask modification.
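For illustration, the mask modification mentioned above could look like the sketch below: drop a closed station from every neighbor set before decoding. The function name and data layout are hypothetical, and routing quality under a perturbed mask is unvalidated, since the model was not trained on disrupted graphs.

```python
def close_station(neighbor_ids, closed):
    """Remove a closed station from every station's neighbor set.

    neighbor_ids: dict mapping station -> list of adjacent stations.
    Under the edited mask the decoder can no longer route via `closed`.
    """
    return {
        u: [v for v in nbrs if v != closed]
        for u, nbrs in neighbor_ids.items()
        if u != closed
    }

nbrs = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
masked = close_station(nbrs, "C")
# masked == {"A": ["B"], "B": ["A"]}
```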
## Usage
```bash
pip install tubeulator-models[inference]
```

```python
from tubeulator_models import TubeRouter

router = TubeRouter.from_pretrained("permutans/tube-nexthop-policy")
route = router.route("West Ham", "Shoreditch")
print(route)
# West Ham
# → [district] Bromley-by-Bow (2.0m)
# ...
# ✓ 8 hops · 2 lines · 1 transfer · 18.0 min

# With waypoints
route = router.route("Camden Town", "Canary Wharf", via=["King's Cross"])
```
For CLI usage:
```bash
pip install tubeulator-models[cli]
tm-infer policy --model permutans/tube-nexthop-policy -o "West Ham" -d Shoreditch
```
## Links
- Code: tubeulator-models
- Companion model: Distance field (value head) for travel-time estimation
Part of the Model Trains series. Trained on GTFS timetable data from Transport for London.