# AirTrackLM

A decoder-only transformer for ADS-B air track next-state prediction, adapted from the LLM4STP architecture.

## Architecture

- **Model**: Custom decoder-only transformer (~7M parameters)
- **4 Embedding Types**: Geohash (40-bit binary, 3D), Kinematic Features (COG/SOG/ROT/AltRate), Temporal (sub-second sinusoidal), Uncertainty (4 methods + learned heteroscedastic)
- **Pretraining**: Next-state prediction (predict all features at t+1 from the sequence up to t)
- **Coordinate System**: ENU (East-North-Up), with velocities computed by a 3-point central derivative
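
The coordinate handling described above can be sketched roughly as follows. This is a minimal, standard-library illustration and not the code in `data_pipeline.py`: the function names, the choice of reference point, the one-sided differences at the track endpoints, and the support for non-uniform time steps are all assumptions made for this example.

```python
import math

# WGS84 ellipsoid constants.
WGS84_A = 6378137.0                    # semi-major axis, metres
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, alt_m):
    """Convert WGS84 latitude/longitude/altitude to ECEF x, y, z in metres."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + alt_m) * math.cos(lat) * math.cos(lon)
    y = (n + alt_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + alt_m) * math.sin(lat)
    return x, y, z


def geodetic_to_enu(lat_deg, lon_deg, alt_m, ref):
    """Project a point into the ENU frame anchored at ref = (lat0, lon0, alt0).

    The reference point is a hypothetical parameter; the source does not
    say how the ENU origin is chosen.
    """
    lat0, lon0 = math.radians(ref[0]), math.radians(ref[1])
    x0, y0, z0 = geodetic_to_ecef(*ref)
    x, y, z = geodetic_to_ecef(lat_deg, lon_deg, alt_m)
    dx, dy, dz = x - x0, y - y0, z - z0
    east = -math.sin(lon0) * dx + math.cos(lon0) * dy
    north = (-math.sin(lat0) * math.cos(lon0) * dx
             - math.sin(lat0) * math.sin(lon0) * dy
             + math.cos(lat0) * dz)
    up = (math.cos(lat0) * math.cos(lon0) * dx
          + math.cos(lat0) * math.sin(lon0) * dy
          + math.sin(lat0) * dz)
    return east, north, up


def central_derivative(values, times):
    """3-point derivative of values with respect to times.

    Interior points use the 3-point Lagrange formula, which reduces to the
    usual central difference on a uniform grid. Endpoints fall back to
    one-sided differences (an assumption of this sketch).
    """
    n = len(values)
    if n < 2 or n != len(times):
        raise ValueError("need at least two samples and matching lengths")
    out = []
    for i in range(n):
        if i == 0:
            out.append((values[1] - values[0]) / (times[1] - times[0]))
        elif i == n - 1:
            out.append((values[-1] - values[-2]) / (times[-1] - times[-2]))
        else:
            h1 = times[i] - times[i - 1]
            h2 = times[i + 1] - times[i]
            out.append(
                -h2 / (h1 * (h1 + h2)) * values[i - 1]
                + (h2 - h1) / (h1 * h2) * values[i]
                + h1 / (h2 * (h1 + h2)) * values[i + 1]
            )
    return out
```

Applying `central_derivative` separately to the East, North, and Up series gives an ENU velocity estimate, from which quantities such as SOG and altitude rate could then be derived.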

## Uncertainty Methods

1. **Kinematic Variance**: Sliding-window variance of COG/SOG/ROT/alt_rate
2. **Prediction Residual**: Deviation from constant-velocity prediction model
3. **Spatial Density**: Data coverage proxy (fewer nearby training points = higher uncertainty)
4. **Flight Phase Entropy**: Entropy of phase classification in a window (mixed phases = uncertain)
5. **Learned Heteroscedastic**: Model predicts its own log-variance per output head (aleatoric)
6. **MC-Dropout**: Monte Carlo dropout at inference for epistemic uncertainty
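
A few of these measures are simple enough to illustrate in plain Python. The sketch below covers methods 1, 2, and 4, plus the per-sample Gaussian negative log-likelihood commonly used to train a log-variance head as in method 5. It is not taken from `uncertainty.py`: the function names, the trailing window of 5 samples, the use of population variance, and the natural-log entropy are all assumptions chosen for the example.

```python
import math
from collections import Counter


def sliding_variance(values, window=5):
    """Population variance over a trailing window ending at each sample.

    Note: for a circular quantity such as COG, angles would need to be
    unwrapped first; this sketch treats the input as a plain scalar series.
    """
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        mean = sum(chunk) / len(chunk)
        out.append(sum((v - mean) ** 2 for v in chunk) / len(chunk))
    return out


def constant_velocity_residual(positions, times):
    """Distance between each observed ENU position and a constant-velocity
    extrapolation from the two preceding fixes.

    The first two samples have no usable history and get a residual of 0.0
    (an assumption of this sketch).
    """
    out = [0.0] * min(2, len(positions))
    for i in range(2, len(positions)):
        dt_prev = times[i - 1] - times[i - 2]
        dt = times[i] - times[i - 1]
        predicted = [
            p1 + (p1 - p0) / dt_prev * dt
            for p0, p1 in zip(positions[i - 2], positions[i - 1])
        ]
        out.append(math.dist(predicted, positions[i]))
    return out


def phase_entropy(phases):
    """Shannon entropy (in nats) of the phase labels inside one window."""
    total = len(phases)
    counts = Counter(phases)
    return -sum(c / total * math.log(c / total) for c in counts.values())


def gaussian_nll(target, mean, log_var):
    """Per-sample Gaussian negative log-likelihood, up to an additive
    constant, for a head that predicts a mean and a log-variance."""
    return 0.5 * (log_var + (target - mean) ** 2 / math.exp(log_var))
```

A window containing a single phase has entropy 0, while an even split between two phases gives `math.log(2)`, which matches the "mixed phases = uncertain" reading above.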

## Features

- **Inputs**: Raw ADS-B (lat, lon, alt, timestamp)
- **Derived**: COG, SOG, ROT, altitude rate via 3-point central derivative on ENU positions
- **Geohash**: 40-bit binary encoding per axis (E, N, U) = 120-bit 3D position token
- **Temporal**: Sinusoidal second-of-day (sub-second resolution) + calendar embeddings + Δt encoding
- **Output Heads**: Binary geohash prediction, continuous Δ-ENU regression, COG/SOG/ROT/AltRate bin classification
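
One way to read the per-axis binary geohash is as repeated bisection of a bounded interval, with one bit emitted per split. The sketch below illustrates that idea only. The axis bounds, the function names, and the choice to concatenate the three axes (rather than interleave their bits) are assumptions; the actual scheme lives in `data_pipeline.py` and may differ.

```python
def encode_axis(value, lo, hi, bits=40):
    """Bisect [lo, hi] `bits` times; emit 1 when value is in the upper half."""
    if not lo <= value <= hi:
        raise ValueError("value outside the encoding bounds")
    out = []
    for _ in range(bits):
        mid = (lo + hi) / 2.0
        if value >= mid:
            out.append(1)
            lo = mid
        else:
            out.append(0)
            hi = mid
    return out


def decode_axis(bit_list, lo, hi):
    """Invert encode_axis, returning the centre of the final cell."""
    for bit in bit_list:
        mid = (lo + hi) / 2.0
        if bit:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def encode_enu(east, north, up, bounds, bits=40):
    """Concatenate the E, N, and U bit strings into one position token.

    `bounds` is a hypothetical ((e_lo, e_hi), (n_lo, n_hi), (u_lo, u_hi))
    tuple in metres; the source does not state the ranges it uses.
    """
    token = []
    for value, (lo, hi) in zip((east, north, up), bounds):
        token.extend(encode_axis(value, lo, hi, bits))
    return token


# Example bounds, chosen only for illustration: +/-500 km horizontally
# and 0 to 20 km vertically around the ENU origin.
EXAMPLE_BOUNDS = ((-500_000.0, 500_000.0),
                  (-500_000.0, 500_000.0),
                  (0.0, 20_000.0))

token = encode_enu(1234.5, -42_000.0, 10_500.0, EXAMPLE_BOUNDS)
```

With 40 bits per axis, `token` holds 120 binary values, which matches the 120-bit 3D position token described above.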

## Data

Training data comes from the `traffic` Python library (real ADS-B surveillance data).

## Files

- `model.py`: Full model architecture (AirTrackLM, embeddings, loss functions)
- `data_pipeline.py`: ENU conversion, 3-point derivatives, geohash encoding, dataset
- `uncertainty.py`: 6 uncertainty quantification methods
- `train.py`: Training utilities
- `train_full.py`: Full GPU training script with Hub push
- `ARCHITECTURE.md`: Detailed architecture document

## Based On

- **LLM4STP** (Joker-hang/LLM4STP): Binary geohash encoding, GPT-2 backbone concept
- **FTP-LLM** (arXiv:2501.17459): LLM for flight trajectory prediction
- **H3-CLM** (arXiv:2405.09596): Hexagonal geohash + causal LM for maritime trajectories
- **GeoFormer** (arXiv:2311.05092): GPT-style geospatial tokenization