AirTrackLM: LLM4STP Adapted for ADS-B Air Track Prediction
Complete Architecture & Implementation Plan
1. Executive Summary
We adapt the LLM4STP multi-feature fusion architecture (originally for maritime AIS ship trajectory prediction) to work with ADS-B air track data. The model uses a decoder-only transformer with four specialized embedding types (Prompt, Uncertainty, Geohash, and Temporal) fused together for next-state prediction pretraining. Once pretrained, the model is adaptable to downstream tasks like activity classification.
This design is grounded in published results from:
- FTP-LLM (arXiv:2501.17459): LLaMA-3.1-8B for flight trajectory prediction
- H3-CLM (arXiv:2405.09596): H3 geohash + causal LM for maritime trajectories
- GeoFormer (arXiv:2311.05092): GPT-style geospatial tokenization
- TrAISFormer (arXiv:2109.03958): Discrete tokenization of AIS features
2. System Architecture Overview
```
RAW ADS-B INPUT
    (timestamp, latitude, longitude, altitude)
        │
        ▼
FEATURE DERIVATION PIPELINE
    Raw:     lat, lon, alt
    Derived: COG, SOG, ROT, altitude_rate
    Meta:    timestamp → (hour, day_of_week, month)
    Output per timestep:
        state_t = [lat, lon, alt, COG, SOG, ROT, alt_rate]
        │
        ▼
TOKENIZATION / ENCODING
    Geohash Tokenizer:      lat, lon, alt → H3 cell + alt_band
    Continuous Discretizer: COG, SOG, ROT, alt_rate → bin IDs
    Temporal Encoder:       hour, dow, month → time IDs
    Each ID indexes a learned embedding table → d_model vector
        │
        ▼
EMBEDDING FUSION LAYER
    Geohash, Feature, Temporal, Uncertainty embeddings (all d_model):
        E_state = E_geo + E_feat + E_temp + E_uncert
    Prompt embedding (prepended prefix):
        [PROMPT_1, PROMPT_2, ..., PROMPT_k]
    Input: [PROMPT_TOKENS | STATE_1 | STATE_2 | ... | STATE_T]
        → linear projection → d_model
        → + positional encoding (sinusoidal)
        │
        ▼
DECODER-ONLY TRANSFORMER BACKBONE
    Transformer block ×N_layers:
        Causal multi-head self-attention
            (masked: each position attends only to itself
             and earlier positions)
        → LayerNorm + residual connection
        → Feed-forward network (Linear → GELU → Linear,
           d_model → 4·d_model → d_model)
        → LayerNorm + residual connection
        │
        ▼
OUTPUT HEADS
    PRETRAINING: Next-State Prediction Head
        For each position t, predict the state at t+1:
            h_t → Linear → softmax → P(geohash_token_{t+1})
            h_t → Linear → softmax → P(COG_bin_{t+1})
            h_t → Linear → softmax → P(SOG_bin_{t+1})
            h_t → Linear → softmax → P(ROT_bin_{t+1})
            h_t → Linear → softmax → P(alt_rate_bin_{t+1})
            h_t → Linear → softmax → P(alt_band_{t+1})
        Loss = Σ CrossEntropy(predicted_feature, true_feature)
    DOWNSTREAM: Activity Classification Head
        (attached after pretraining, frozen or fine-tuned)
        h_[BOS] or mean(h_1:T) → MLP → softmax → class label
```
3. The Four Embedding Types (Detailed)
3.1 Geohash Embeddings: Spatial Position Encoding
Purpose: Encode the aircraft's 3D geographic position as a discrete token.
Method: We use the H3 hexagonal hierarchical spatial index (Uber's H3) at resolution 5 (hex area ≈ 252 km², edge ≈ 9.85 km) for en-route flight, with an option to use resolution 7 (≈ 5.16 km², edge ≈ 1.22 km) for terminal areas. This follows the H3-CLM paper's approach but adapted for aviation's larger spatial scale.
3D Extension: Since aircraft operate in 3D, we combine the H3 cell with an altitude band:
Geohash Token = H3_cell_index × N_alt_bands + alt_band_index
Altitude bands (1000 ft increments):
Band 0:  0 - 1,000 ft (ground / taxi)
Band 1:  1,000 - 2,000 ft (initial climb / approach)
...
Band 45: 45,000 - 46,000 ft (high cruise)
N_alt_bands = 46
Vocabulary size: At H3 resolution 5, the number of unique cells covering typical airspace is 100K-200K. With altitude bands: `200K × 46 ≈ 9.2M`, too large for direct embedding.
The solution is a factored embedding:
E_geohash = E_h3[h3_cell_id] + E_alt[alt_band_id]
E_h3: learned embedding table, vocab = N_h3_cells (~200K or hashing trick to 50K)
E_alt: learned embedding table, vocab = 46
Both project to d_model dimensions.
The hashing trick: Map H3 cell indices through a hash function to a fixed vocabulary of ~50,000 buckets. This bounds memory while maintaining spatial discrimination.
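A minimal sketch of the factored token computation (the `crc32` hash and the exact clipping behavior are illustrative choices, not fixed by the design; the H3 cell id string is assumed to come from the h3 library upstream):

```python
import zlib

N_HASH_BUCKETS = 50_000   # bounded H3 vocabulary via the hashing trick
N_ALT_BANDS = 46          # 1,000 ft bands covering 0-46,000 ft

def geo_token_ids(h3_cell: str, altitude_ft: float) -> tuple[int, int]:
    """Map an H3 cell id (hex string) and an altitude to the two factored
    embedding indices (hashed H3 bucket, altitude band).

    crc32 is used instead of Python's built-in hash() so bucket assignment
    is stable across processes and runs.
    """
    h3_bucket = zlib.crc32(h3_cell.encode()) % N_HASH_BUCKETS
    # Clip altitude into [0, 46,000) ft, then take the 1,000 ft band.
    alt_band = min(int(max(altitude_ft, 0.0) // 1000), N_ALT_BANDS - 1)
    return h3_bucket, alt_band
```

The two indices then select rows in E_h3 and E_alt, which are summed as in the formula above.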
Why H3 over traditional geohash: H3 hexagons have uniform area (no polar distortion), hierarchical nesting, and consistent neighbor relationships, all critical for trajectory continuity.
3.2 Temporal Embeddings: When Is the Aircraft Flying?
Purpose: Encode temporal context; time of day affects traffic density, routes, and behavior.
Method: Additive composition of multiple temporal scales:
E_temporal = E_hour[hour_of_day] + E_dow[day_of_week] + E_month[month]
E_hour: 24 entries (captures rush hour vs. night patterns)
E_dow: 7 entries (weekday vs. weekend traffic)
E_month: 12 entries (seasonal routes, weather patterns)
All project to d_model dimensions.
Optional, a sinusoidal minute-of-hour encoding for finer-than-hour resolution:
E_minute = [sin(2π · minute / 60), cos(2π · minute / 60)] → Linear → d_model
3.3 Uncertainty Embeddings: How Confident Are We?
Purpose: Encode the model's uncertainty about the current trajectory state. Aircraft in straight-and-level cruise have low uncertainty; aircraft maneuvering near airports have high uncertainty.
Method: Compute a trajectory smoothness score from recent states, then discretize:
Uncertainty sources (sliding window of k=5 recent states):
1. Position variance: σ²_pos = var(Δlat) + var(Δlon)
2. Heading variance: σ²_COG = circular_var(COG_{t-k:t})
3. Speed variance: σ²_SOG = var(SOG_{t-k:t})
4. Altitude-rate variance: σ²_alt = var(alt_rate_{t-k:t})
Combined uncertainty score:
U_t = w1·σ²_pos + w2·σ²_COG + w3·σ²_SOG + w4·σ²_alt
Discretize into N_uncert = 16 bins (quantile binning on training data)
E_uncertainty = E_uncert_table[bin(U_t)] → d_model
Weights w1-w4: Hyperparameters tuned on validation data, or learned as part of the model.
During inference: For multi-step prediction, uncertainty can be updated using MC-Dropout or ensemble disagreement.
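A minimal numpy sketch of this score; the default weights and the specific circular-variance formulation (1 minus the mean resultant length) are assumptions to be tuned on validation data:

```python
import numpy as np

def uncertainty_score(lats, lons, cogs, sogs, alt_rates, k=5,
                      w=(1.0, 1.0, 1.0, 1.0)):
    """Variance-based uncertainty over the last k states.

    Heading uses circular variance (1 - mean resultant length) so that
    e.g. 359 deg and 1 deg count as nearly identical headings.
    """
    dlat = np.diff(np.asarray(lats[-k:], dtype=float))
    dlon = np.diff(np.asarray(lons[-k:], dtype=float))
    var_pos = dlat.var() + dlon.var()
    rad = np.radians(np.asarray(cogs[-k:], dtype=float))
    var_cog = 1.0 - np.hypot(np.sin(rad).mean(), np.cos(rad).mean())
    var_sog = np.var(np.asarray(sogs[-k:], dtype=float))
    var_alt = np.var(np.asarray(alt_rates[-k:], dtype=float))
    return w[0] * var_pos + w[1] * var_cog + w[2] * var_sog + w[3] * var_alt
```

In straight-and-level cruise all four terms are near zero; a turning, climbing aircraft scores strictly higher.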
3.4 Prompt Embeddings: Task and Context Metadata
Purpose: Provide metadata context about the flight, analogous to system prompts in LLMs. Enables task conditioning and multi-task learning.
Method: Learnable prompt tokens prepended to the trajectory:
Prompt token vocabulary:
- Aircraft category: [HEAVY, LARGE, SMALL, ROTORCRAFT, GLIDER, UAV, UNKNOWN] (7)
- Flight phase: [CLIMB, CRUISE, DESCENT, APPROACH, GROUND, UNKNOWN] (6)
- Region: [CONUS, EUROPE, ASIA, OTHER] (4)
- Task: [PREDICT, CLASSIFY, DETECT_ANOMALY] (3)
- Special: [BOS, EOS, PAD, MASK] (4)
Total prompt vocab: ~24 tokens
Prompt sequence (prepended):
[BOS, TASK_TOKEN, AIRCRAFT_TOKEN, PHASE_TOKEN, REGION_TOKEN]
Each has a learned embedding of dimension d_model.
For downstream classification: Change TASK_TOKEN to CLASSIFY; output at BOS position is used for classification.
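The prompt vocabulary above can be sketched as a flat id table; the exact token names here are illustrative placeholders, only the category structure and the ~24-token count come from the design:

```python
# Illustrative prompt vocabulary (24 tokens, matching the counts above).
PROMPT_VOCAB = {tok: i for i, tok in enumerate([
    "BOS", "EOS", "PAD", "MASK",
    "TASK_PREDICT", "TASK_CLASSIFY", "TASK_DETECT_ANOMALY",
    "AC_HEAVY", "AC_LARGE", "AC_SMALL", "AC_ROTORCRAFT",
    "AC_GLIDER", "AC_UAV", "AC_UNKNOWN",
    "PHASE_CLIMB", "PHASE_CRUISE", "PHASE_DESCENT",
    "PHASE_APPROACH", "PHASE_GROUND", "PHASE_UNKNOWN",
    "REGION_CONUS", "REGION_EUROPE", "REGION_ASIA", "REGION_OTHER",
])}

def prompt_ids(task, aircraft, phase, region):
    """Build the 5-token prompt prefix [BOS, TASK, AIRCRAFT, PHASE, REGION]."""
    return [PROMPT_VOCAB["BOS"],
            PROMPT_VOCAB[f"TASK_{task}"],
            PROMPT_VOCAB[f"AC_{aircraft}"],
            PROMPT_VOCAB[f"PHASE_{phase}"],
            PROMPT_VOCAB[f"REGION_{region}"]]
```

Switching a pretrained trajectory to classification is then just swapping the task token, e.g. `prompt_ids("CLASSIFY", ...)`.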
4. Feature Derivation Pipeline
4.1 Raw Input
timestamp (Unix epoch seconds)
latitude (degrees, WGS84)
longitude (degrees, WGS84)
altitude (feet, barometric or geometric)
4.2 Derived Features
```python
import numpy as np

def derive_features(timestamps, lats, lons, alts):
    """
    Derive COG, SOG, ROT, and altitude rate from raw position data.
    All inputs: numpy arrays of shape (N,) for a single trajectory.
    Returns arrays of shape (N,); leading elements are NaN (one for
    COG/SOG/alt_rate, two for ROT) since they require prior points.
    """
    dt = np.diff(timestamps).astype(float)  # seconds
    dt = np.maximum(dt, 1e-6)               # avoid division by zero

    # --- Course Over Ground (COG): initial great-circle bearing ---
    lat1, lat2 = np.radians(lats[:-1]), np.radians(lats[1:])
    dlon = np.radians(np.diff(lons))
    x = np.sin(dlon) * np.cos(lat2)
    y = np.cos(lat1) * np.sin(lat2) - np.sin(lat1) * np.cos(lat2) * np.cos(dlon)
    COG = np.degrees(np.arctan2(x, y)) % 360  # [0, 360)

    # --- Speed Over Ground (SOG): haversine distance / elapsed time ---
    dlat = np.radians(np.diff(lats))
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    distance_nm = 3440.065 * c       # Earth radius in nautical miles
    SOG = distance_nm / (dt / 3600)  # knots

    # --- Rate of Turn (ROT) ---
    dCOG = np.diff(COG)
    dCOG = (dCOG + 180) % 360 - 180  # shortest angular difference, [-180, 180)
    ROT = np.full(len(lats), np.nan)
    ROT[2:] = dCOG / dt[1:]          # degrees per second

    # --- Rate of Altitude Change ---
    dalt = np.diff(alts)             # feet
    alt_rate = dalt / (dt / 60)      # feet per minute

    # Pad leading elements so outputs align with the input timestamps
    COG_full = np.concatenate([[np.nan], COG])
    SOG_full = np.concatenate([[np.nan], SOG])
    alt_rate_full = np.concatenate([[np.nan], alt_rate])
    return COG_full, SOG_full, ROT, alt_rate_full
```
4.3 Feature Discretization
| Feature | Range | Bin Width | N_bins | Notes |
|---|---|---|---|---|
| COG | [0, 360) | 5° | 72 | Circular |
| SOG | [0, 600] kts | 5 knots | 121 | Capped at ~Mach 1 |
| ROT | [-6, 6] °/s | 0.25 °/s | 49 | Capped at ±6 °/s |
| Altitude Rate | [-6000, 6000] fpm | 200 ft/min | 61 | Capped at ±6000 fpm |
Outliers beyond the caps are clipped to the boundary bin.
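The clip-then-bin rule can be sketched as follows; values sitting exactly at the upper cap land in their own boundary bin, which reproduces the bin counts in the table (e.g. SOG: 600/5 + 1 = 121). The function names are illustrative:

```python
import numpy as np

def bin_feature(values, lo, hi, width):
    """Clip to [lo, hi], then map to integer bin ids.

    The upper cap gets its own bin, so e.g. ROT over [-6, 6] at 0.25
    deg/s yields 48 + 1 = 49 bins.
    """
    clipped = np.clip(np.asarray(values, dtype=float), lo, hi)
    return ((clipped - lo) // width).astype(int)

def bin_cog(values, width=5.0):
    """COG is circular: reduce mod 360 first, giving exactly 72 bins."""
    return (np.asarray(values, dtype=float) % 360 // width).astype(int)
```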
4.4 Trajectory Preprocessing Pipeline
1. Segment raw ADS-B by ICAO24 + temporal gaps > 15 min → individual flights
2. Resample to fixed Δt = 60 seconds (linear interpolation for position, circular for heading)
3. Derive features (COG, SOG, ROT, alt_rate)
4. Drop first 2 points per trajectory (NaN from derivation)
5. Filter: remove trajectories with < 20 points (< 20 minutes)
6. Compute H3 cell (res 5) + altitude band for each point
7. Discretize all continuous features into bins
8. Compute uncertainty scores (sliding window k=5)
9. Extract temporal features (hour, dow, month)
10. Construct prompt tokens from metadata (if available)
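Step 2's circular interpolation of heading deserves a sketch, since naive linear interpolation across the 0°/360° wrap is a classic bug; one way (an assumption, not the only option) is to interpolate the sin/cos components:

```python
import numpy as np

def interp_heading(t_new, t, headings_deg):
    """Circularly interpolate headings by interpolating their sin/cos
    components and recovering the angle, so 350 deg -> 10 deg passes
    through 0 deg rather than swinging through 180 deg."""
    rad = np.radians(np.asarray(headings_deg, dtype=float))
    s = np.interp(t_new, t, np.sin(rad))
    c = np.interp(t_new, t, np.cos(rad))
    return np.degrees(np.arctan2(s, c)) % 360
```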
5. Model Hyperparameters
5.1 Model Dimensions
| Parameter | Value | Rationale |
|---|---|---|
| d_model | 256 | H3-CLM found 256-1024 effective |
| n_heads | 8 | head_dim = 32 |
| n_layers | 8 | Moderate depth for ~10M param model |
| d_ff | 1024 | 4× d_model (standard) |
| max_seq_len | 128 | 128 states × 60 s ≈ 2 hours of flight |
| n_prompt_tokens | 5 | [BOS, TASK, AIRCRAFT, PHASE, REGION] |
| dropout | 0.1 | Standard regularization |
Total parameters: ~8-12M (trainable on single GPU in hours)
5.2 Vocabulary Sizes
| Embedding | Vocab | Dim |
|---|---|---|
| H3 cells | 50,000 | 256 |
| Altitude bands | 46 | 256 |
| COG bins | 72 | 256 |
| SOG bins | 121 | 256 |
| ROT bins | 49 | 256 |
| Alt rate bins | 61 | 256 |
| Hour of day | 24 | 256 |
| Day of week | 7 | 256 |
| Month | 12 | 256 |
| Uncertainty bins | 16 | 256 |
| Prompt tokens | 24 | 256 |
5.3 State Token Composition
Each timestep maps to a single state token via additive fusion:
E_state_t = E_h3[h3_id_t] + E_alt_band[alt_band_t] # Geohash (3D position)
+ E_COG[cog_bin_t] + E_SOG[sog_bin_t] # Kinematics
+ E_ROT[rot_bin_t] + E_alt_rate[alt_rate_bin_t] # Dynamics
+ E_hour[hour_t] + E_dow[dow_t] + E_month[month_t] # Temporal
+ E_uncert[uncert_bin_t] # Uncertainty
E_state_t ∈ R^{d_model}
This additive fusion follows BERT (token + segment + position) and TrAISFormer.
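A toy numpy illustration of the additive fusion (d_model shrunk to 8 for readability; the real model uses 256, and the tables would be trained rather than random):

```python
import numpy as np

rng = np.random.default_rng(0)
D_MODEL = 8  # toy size; the real model uses 256

# One embedding table per factor, vocab sizes from Section 5.2
VOCABS = {"h3": 50_000, "alt_band": 46, "cog": 72, "sog": 121,
          "rot": 49, "alt_rate": 61, "hour": 24, "dow": 7,
          "month": 12, "uncert": 16}
TABLES = {name: rng.normal(size=(v, D_MODEL)) for name, v in VOCABS.items()}

def fuse_state(token_ids: dict) -> np.ndarray:
    """Additive fusion: sum the looked-up rows into one d_model vector."""
    return sum(TABLES[name][idx] for name, idx in token_ids.items())
```

Because fusion is a plain sum, the output dimension stays d_model no matter how many factors contribute, which is the point of the additive (vs. concatenative) design.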
6. Training Recipe
6.1 Pretraining: Next-State Prediction (Causal LM)
Objective: Given states 1..T, predict state at T+1 (applied autoregressively at every position).
Loss:
L = Σ_{t=1}^{T-1} [ λ_geo  · CE(ŷ_geo_t, y_geo_{t+1})
                  + λ_COG  · CE(ŷ_COG_t, y_COG_{t+1})
                  + λ_SOG  · CE(ŷ_SOG_t, y_SOG_{t+1})
                  + λ_ROT  · CE(ŷ_ROT_t, y_ROT_{t+1})
                  + λ_alt  · CE(ŷ_alt_rate_t, y_alt_rate_{t+1})
                  + λ_altb · CE(ŷ_alt_band_t, y_alt_band_{t+1}) ]
λ values default to 1.0 (equal weighting).
Training hyperparameters (based on FTP-LLM + H3-CLM):
| Parameter | Value |
|---|---|
| Optimizer | AdamW |
| Learning rate | 5e-4 |
| LR Schedule | Cosine + 5% warmup |
| Batch size (per GPU) | 64 |
| Gradient accumulation | 4 (effective = 256) |
| Max epochs | 30 (early stopping, patience = 5) |
| Weight decay | 0.01 |
| Gradient clipping | 1.0 |
| Mixed precision | bf16 |
Data windowing: Sliding window size=128, stride=64 (50% overlap).
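The windowing rule can be sketched in a few lines (trailing remainders shorter than the window are dropped, an assumption this plan doesn't pin down):

```python
def window_starts(n: int, size: int = 128, stride: int = 64) -> list[int]:
    """Start indices of sliding windows over a trajectory of n points;
    stride = size/2 gives the 50% overlap described above."""
    if n < size:
        return []
    return list(range(0, n - size + 1, stride))
```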
6.2 Downstream: Activity Classification
After pretraining, attach classification head:
h_BOS → Linear(256, 128) → GELU → Dropout(0.1) → Linear(128, N_classes)
Fine-tuning options:
- A: Freeze backbone, train head only (fast, small data)
- B: Full fine-tune, backbone lr=1e-5, head lr=1e-3
7. Dataset Strategy
7.1 Prototyping β traffic Python Library
```python
from traffic.data.samples import landing_zurich_2019

# ~2,000 flights near Zurich
# Columns: timestamp, icao24, callsign, latitude, longitude, altitude,
#          groundspeed, track, vertical_rate, ...
```
Instant access, clean, well-documented. Single airport, limited diversity.
7.2 Training β OpenSky Network
```python
from pyopensky.trino import Trino

trino = Trino()
df = trino.rawquery("""
    SELECT time, icao24, lat, lon, baroaltitude, velocity, heading, vertrate
    FROM state_vectors_data4
    WHERE hour >= '2024-01-15 00:00:00'
      AND hour < '2024-01-15 12:00:00'
      AND lat BETWEEN 40 AND 55
      AND lon BETWEEN -10 AND 20
    ORDER BY icao24, time
""")
```
Target:
- Region A (train): Europe, 1 month → ~500K-1M flights
- Region B (OOD test): US CONUS, 1 week → ~200K flights
- Region C (far test): East Asia, 1 week → ~100K flights
7.3 Alternative: SCAT Dataset
~170K en-route flights over Sweden, Zenodo. Pre-segmented, clean.
7.4 Data Split
Training: 70% of Region A flights
Validation: 15% of Region A flights
Test (IID): 15% of Region A flights
Test (OOD): 100% of Region B flights
Test (Far): 100% of Region C flights
Split by flight (not time window) to avoid data leakage.
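Splitting by flight can be made deterministic by hashing the flight identifier; this sketch uses `crc32` (stable across runs, unlike Python's `hash()`) and the 70/15/15 proportions above:

```python
import zlib

def flight_split(flight_id: str) -> str:
    """Assign a whole flight to train/val/test by hashing its id, so every
    sliding window cut from that flight lands in the same split (70/15/15)."""
    h = zlib.crc32(flight_id.encode()) % 100
    if h < 70:
        return "train"
    return "val" if h < 85 else "test"
```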
8. Ablation Study: Geohash Geographic Dependency
8.1 Hypothesis
Geohash embeddings encode absolute geographic position, causing the model to memorize region-specific patterns (airways, approach paths, airspace structure). This improves in-distribution performance but degrades transfer to unseen regions.
8.2 Experimental Variants
| Variant | Geohash Type | Description |
|---|---|---|
| V1: Full Model | H3 absolute | Complete architecture as described |
| V2: No Geohash | None | Remove geohash entirely; model sees only kinematics + temporal + uncertainty |
| V3: Relative Geohash | H3 relative | H3 cell of (Δlat, Δlon) from trajectory start; position-invariant |
| V4: Multi-Resolution | H3 res 3+5+7 | 3 resolutions summed (coarse to fine) |
| V5: Continuous Position | Linear projection | Linear([lat, lon, alt]) → d_model; no discretization |
8.3 Evaluation Metrics
For each variant Γ each test set (IID, OOD, Far):
| Metric | Description |
|---|---|
| Geo Accuracy | % correct H3 cell prediction |
| Position MAE | Mean absolute error in km |
| COG MAE | Heading error in degrees |
| SOG MAE | Speed error in knots |
| Multi-step ADE | Average displacement error over 5 predicted steps |
| Multi-step FDE | Final displacement error at step 5 |
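The displacement metrics can be sketched as follows; the equirectangular distance approximation is an assumption, adequate at the few-km scale of per-step errors:

```python
import numpy as np

def ade_fde(pred_latlon, true_latlon):
    """Average and final displacement error (km) over a prediction horizon.

    Inputs are (T, 2) arrays of (lat, lon) per predicted step; distance
    uses an equirectangular approximation with Earth radius 6371 km.
    """
    pred = np.asarray(pred_latlon, dtype=float)
    true = np.asarray(true_latlon, dtype=float)
    dlat = np.radians(true[:, 0] - pred[:, 0])
    dlon = np.radians(true[:, 1] - pred[:, 1]) * np.cos(np.radians(true[:, 0]))
    d_km = 6371.0 * np.hypot(dlat, dlon)
    return float(d_km.mean()), float(d_km[-1])
```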
8.4 Key Comparisons
| Comparison | Tests |
|---|---|
| V1 vs V2 (IID) | How much geohash helps when test = train region |
| V1 vs V2 (OOD) | If V2 > V1 on OOD, geohash causes geographic overfitting |
| V1 vs V3 (OOD) | If V3 is good on both IID and OOD, relative geohash is the sweet spot |
| V4 (all) | Multi-resolution: coarse cells transfer, fine cells specialize? |
| V5 (all) | Does continuous encoding avoid discretization issues? |
8.5 Expected Outcomes
- V1: Best IID, worst OOD (hypothesis)
- V3: Best compromise; predicted winner
- V5: May struggle (loses discrete token structure transformers excel at)
- V2: Strong OOD baseline, sacrifices IID
8.6 Additional Analysis
- Attention visualization: V1 vs V3 attention patterns
- Embedding clustering: t-SNE of geohash embeddings colored by region
- Learning curves: IID vs OOD performance vs training data size
9. Implementation Phases
Phase 1: Data Pipeline (Week 1)
- Set up the `traffic` library, extract sample trajectories
- Implement feature derivation (COG, SOG, ROT, alt_rate)
- Implement H3 geohash encoding + altitude banding
- Implement feature discretization (binning)
- Implement uncertainty score computation
- Build PyTorch Dataset class with sliding window
- Unit tests for all derivation functions
Phase 2: Model Architecture (Week 1-2)
- Implement all embedding tables
- Implement additive fusion layer
- Implement prompt token prepending
- Implement decoder-only transformer backbone
- Implement multi-head output (6 prediction heads)
- Implement classification head (for downstream)
- Forward pass test with dummy data
Phase 3: Pretraining (Week 2-3)
- Implement training loop with multi-task loss
- Prototyping run on `traffic` sample data (small, fast iteration)
- Scale to OpenSky data
- Monitor loss curves, validate convergence
- Save best checkpoint
Phase 4: Downstream Adaptation (Week 3-4)
- Implement classification fine-tuning pipeline
- Test on activity classification task
- Compare frozen vs. fine-tuned backbone
Phase 5: Ablation Study (Week 4-5)
- Implement all 5 geohash variants
- Train each variant with identical hyperparameters
- Evaluate on IID, OOD, and Far test sets
- Generate comparison tables and visualizations
- Write analysis of geographic dependency findings
10. Key Design Decisions & Rationale
| Decision | Choice | Why |
|---|---|---|
| Custom model vs. pretrained LLM | Custom ~10M param transformer | FTP-LLM showed text-tokenized LLMs work, but custom allows proper multi-feature fusion. 10M params trains in hours. |
| H3 vs. traditional geohash | H3 | Uniform hexagonal cells, no polar distortion, hierarchical. Proven by H3-CLM. |
| Additive vs. concatenative fusion | Additive | BERT/TrAISFormer paradigm. Keeps d_model constant; concatenation gives d_model × N_features, which is massive. |
| 60s time resolution | 60 seconds | FTP-LLM validated 1-min aggregation. 128 steps ≈ 2+ hours. |
| Factored geohash (H3 + alt) | Separate tables, summed | Avoids combinatorial explosion (9.2M → 50K + 46). |
| Multi-head output | Separate softmax per feature | More interpretable, allows per-feature analysis. |
| Uncertainty from smoothness | Variance-based | Computable at data time, no inference overhead. |
11. Risk Analysis
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Geohash overfits to region | High | High | Ablation study; V3 (relative) is fallback |
| OpenSky access issues | Medium | High | Fallback: traffic samples + SCAT |
| 60s too coarse for terminal | Medium | Low | Separate terminal model at 10s |
| Model too small | Low | Medium | Scale: d_model → 512, n_layers → 16 (~40M) |
| Alt discretization too coarse | Low | Low | Refine to 500ft bands (92) |
12. Monitoring & Evaluation
During training (Trackio):
- Total loss + per-feature loss curves
- Validation loss each epoch
- LR schedule, GPU utilization
After training:
- Next-state accuracy (top-1, top-5 per feature)
- Position error in km
- Multi-step prediction (1, 5, 10, 20 steps ahead)
- Downstream classification F1/precision/recall
Grounded in: FTP-LLM, H3-CLM, GeoFormer, TrAISFormer, and LLM4STP (reconstructed). Ready for implementation upon approval.