# AirTrackLM: LLM4STP Adapted for ADS-B Air Track Prediction
## Complete Architecture & Implementation Plan
---
## 1. Executive Summary
We adapt the LLM4STP multi-feature fusion architecture (originally for maritime AIS ship trajectory prediction) to work with **ADS-B air track data**. The model uses a **decoder-only transformer** with four specialized embedding types (Prompt, Uncertainty, Geohash, and Temporal) fused together for **next-state prediction** pretraining. Once pretrained, the model is adaptable to downstream tasks like activity classification.
This design is grounded in published results from:
- **FTP-LLM** (arXiv:2501.17459): LLaMA-3.1-8B for flight trajectory prediction
- **H3-CLM** (arXiv:2405.09596): H3 geohash + causal LM for maritime trajectories
- **GeoFormer** (arXiv:2311.05092): GPT-style geospatial tokenization
- **TrAISFormer** (arXiv:2109.03958): Discrete tokenization of AIS features
---
## 2. System Architecture Overview
```
+---------------------------------------------------------------------+
|                           RAW ADS-B INPUT                           |
|              (timestamp, latitude, longitude, altitude)             |
+----------------------------------+----------------------------------+
                                   |
                                   v
+---------------------------------------------------------------------+
|                     FEATURE DERIVATION PIPELINE                     |
|                                                                     |
|   Raw:     lat, lon, alt                                            |
|   Derived: COG, SOG, ROT, altitude_rate                             |
|   Meta:    timestamp -> (hour, day_of_week, month)                  |
|                                                                     |
|   Output per timestep:                                              |
|     state_t = [lat, lon, alt, COG, SOG, ROT, alt_rate]              |
+----------------------------------+----------------------------------+
                                   |
                                   v
+---------------------------------------------------------------------+
|                       TOKENIZATION / ENCODING                       |
|                                                                     |
|   Geohash Tokenizer:      lat, lon, alt -> H3 cell + alt_band       |
|   Continuous Discretizer: COG, SOG, ROT, alt_rate -> bin IDs        |
|   Temporal Encoder:       hour, dow, month -> time IDs              |
|                                                                     |
|   Each token ID -> learned embedding table -> d_model vector        |
+----------------------------------+----------------------------------+
                                   |
                                   v
+---------------------------------------------------------------------+
|                       EMBEDDING FUSION LAYER                        |
|                                                                     |
|   E_state = E_geo + E_feat + E_temp + E_uncert     (all d_model)    |
|                                                                     |
|   Prompt embedding (prepended prefix):                              |
|     [PROMPT_1, PROMPT_2, ..., PROMPT_k]                             |
|                                                                     |
|   Input: [PROMPT_TOKENS | STATE_1 | STATE_2 | ... | STATE_T]        |
|     -> linear projection to d_model                                 |
|     -> + positional encoding (sinusoidal)                           |
+----------------------------------+----------------------------------+
                                   |
                                   v
+---------------------------------------------------------------------+
|                  DECODER-ONLY TRANSFORMER BACKBONE                  |
|                                                                     |
|   Transformer block x N_layers:                                     |
|     1. Causal multi-head self-attention (masked: each position      |
|        attends only to itself and earlier positions)                |
|     2. LayerNorm + residual connection                              |
|     3. Feed-forward network (Linear -> GELU -> Linear,              |
|        d_model -> 4*d_model -> d_model)                             |
|     4. LayerNorm + residual connection                              |
+----------------------------------+----------------------------------+
                                   |
                                   v
+---------------------------------------------------------------------+
|                            OUTPUT HEADS                             |
|                                                                     |
|   PRETRAINING: next-state prediction head                           |
|     For each position t, predict the state at t+1:                  |
|       h_t -> Linear -> softmax -> P(geohash_token_{t+1})            |
|       h_t -> Linear -> softmax -> P(COG_bin_{t+1})                  |
|       h_t -> Linear -> softmax -> P(SOG_bin_{t+1})                  |
|       h_t -> Linear -> softmax -> P(ROT_bin_{t+1})                  |
|       h_t -> Linear -> softmax -> P(alt_rate_bin_{t+1})             |
|       h_t -> Linear -> softmax -> P(alt_band_{t+1})                 |
|     Loss = sum of CrossEntropy(predicted_feature, true_feature)     |
|                                                                     |
|   DOWNSTREAM: activity classification head                          |
|     (attached after pretraining, frozen or fine-tuned)              |
|     h_[BOS] or mean(h_1:T) -> MLP -> softmax -> class label         |
+---------------------------------------------------------------------+
```
---
## 3. The Four Embedding Types (Detailed)
### 3.1 Geohash Embeddings β Spatial Position Encoding
**Purpose**: Encode the aircraft's 3D geographic position as a discrete token.
**Method**: We use the **H3 hexagonal hierarchical spatial index** (Uber's H3) at resolution 5 (hex area ≈ 252 km², edge ≈ 9.85 km) for en-route flight, with an option to use resolution 7 (≈ 5.16 km², edge ≈ 1.22 km) for terminal areas. This follows the H3-CLM paper's approach, adapted for aviation's larger spatial scale.
**3D Extension**: Since aircraft operate in 3D, we combine the H3 cell with an **altitude band**:
```
Geohash Token = H3_cell_index × N_alt_bands + alt_band_index
Altitude bands (1000 ft increments):
Band 0: 0 - 1,000 ft (ground / taxi)
Band 1: 1,000 - 2,000 ft (initial climb / approach)
...
Band 45: 44,000 - 45,000 ft (high cruise)
N_alt_bands = 46
```
**Vocabulary size**: At H3 resolution 5, the number of unique cells covering typical airspace is ~100K-200K. With altitude bands: `~200K × 46 ≈ 9.2M`, too large for direct embedding.
**Solution β Factored Embedding**:
```
E_geohash = E_h3[h3_cell_id] + E_alt[alt_band_id]
E_h3: learned embedding table, vocab = N_h3_cells (~200K or hashing trick to 50K)
E_alt: learned embedding table, vocab = 46
Both project to d_model dimensions.
```
The **hashing trick**: Map H3 cell indices through a hash function to a fixed vocabulary of ~50,000 buckets. This bounds memory while maintaining spatial discrimination.
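A minimal sketch of this hashing step, assuming SHA-256 as the stable hash and the ~50K bucket count from the text (H3 cell indices are 64-bit integers; the specific hash choice is an illustrative assumption):

```python
import hashlib

def h3_bucket(h3_cell_id: int, n_buckets: int = 50_000) -> int:
    """Hashing trick: map a raw 64-bit H3 cell index to a fixed embedding
    vocabulary. Any stable hash works; SHA-256 is chosen here for portability."""
    digest = hashlib.sha256(h3_cell_id.to_bytes(8, "big")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets
```

With ~200K active cells mapped into 50K buckets, several cells share each embedding row; the argument above is that this bounds memory while retaining enough spatial discrimination.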
**Why H3 over traditional geohash**: H3 hexagons have uniform area (no polar distortion), hierarchical nesting, and consistent neighbor relationships, all critical for trajectory continuity.
### 3.2 Temporal Embeddings β When Is the Aircraft Flying?
**Purpose**: Encode temporal context; time of day affects traffic density, routes, and behavior.
**Method**: Additive composition of multiple temporal scales:
```
E_temporal = E_hour[hour_of_day] + E_dow[day_of_week] + E_month[month]
E_hour: 24 entries (captures rush hour vs. night patterns)
E_dow: 7 entries (weekday vs. weekend traffic)
E_month: 12 entries (seasonal routes, weather patterns)
All project to d_model dimensions.
```
**Optional: sinusoidal sub-minute encoding** for timestamps finer than one minute:
```
E_minute = [sin(2π · minute / 60), cos(2π · minute / 60)] -> linear -> d_model
```
### 3.3 Uncertainty Embeddings β How Confident Are We?
**Purpose**: Encode the model's uncertainty about the current trajectory state. Aircraft in straight-and-level cruise have low uncertainty; aircraft maneuvering near airports have high uncertainty.
**Method**: Compute a **trajectory smoothness score** from recent states, then discretize:
```
Uncertainty sources (sliding window of k=5 recent states):
  1. Position variance:  σ²_pos = var(Δlat) + var(Δlon)
  2. Heading variance:   σ²_COG = circular_var(COG_{t-k:t})
  3. Speed variance:     σ²_SOG = var(SOG_{t-k:t})
  4. Altitude variance:  σ²_alt = var(alt_rate_{t-k:t})
Combined uncertainty score:
  U_t = w1·σ²_pos + w2·σ²_COG + w3·σ²_SOG + w4·σ²_alt
Discretize into N_uncert = 16 bins (quantile binning on training data)
E_uncertainty = E_uncert_table[bin(U_t)] -> d_model
```
**Weights w1-w4**: Hyperparameters tuned on validation data, or learned as part of the model.
**During inference**: For multi-step prediction, uncertainty can be updated using MC-Dropout or ensemble disagreement.
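The smoothness score can be sketched in numpy as follows (the default weights and the exact window handling are illustrative; the document leaves w1-w4 as tunable hyperparameters):

```python
import numpy as np

def circular_var(deg):
    """Circular variance of heading angles in degrees, in [0, 1]."""
    rad = np.radians(np.asarray(deg, dtype=float))
    return 1.0 - np.hypot(np.mean(np.sin(rad)), np.mean(np.cos(rad)))

def uncertainty_score(lats, lons, cogs, sogs, alt_rates, w=(1.0, 1.0, 1.0, 1.0)):
    """Smoothness-based uncertainty over one k-point window of recent states."""
    pos_var = np.var(np.diff(lats)) + np.var(np.diff(lons))
    return (w[0] * pos_var
            + w[1] * circular_var(cogs)     # heading spread
            + w[2] * np.var(sogs)           # speed spread
            + w[3] * np.var(alt_rates))     # vertical-rate spread
```

In straight-and-level cruise every term is near zero; a turning, decelerating, descending aircraft scores higher and lands in a higher uncertainty bin.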
### 3.4 Prompt Embeddings β Task and Context Metadata
**Purpose**: Provide metadata context about the flight, analogous to system prompts in LLMs. Enables task conditioning and multi-task learning.
**Method**: Learnable prompt tokens prepended to the trajectory:
```
Prompt token vocabulary:
- Aircraft category: [HEAVY, LARGE, SMALL, ROTORCRAFT, GLIDER, UAV, UNKNOWN] (7)
- Flight phase: [CLIMB, CRUISE, DESCENT, APPROACH, GROUND, UNKNOWN] (6)
- Region: [CONUS, EUROPE, ASIA, OTHER] (4)
- Task: [PREDICT, CLASSIFY, DETECT_ANOMALY] (3)
- Special: [BOS, EOS, PAD, MASK] (4)
Total prompt vocab: ~24 tokens
Prompt sequence (prepended):
[BOS, TASK_TOKEN, AIRCRAFT_TOKEN, PHASE_TOKEN, REGION_TOKEN]
Each has a learned embedding of dimension d_model.
```
**For downstream classification**: Set TASK_TOKEN to CLASSIFY; the hidden state at the BOS position feeds the classification head.
---
## 4. Feature Derivation Pipeline
### 4.1 Raw Input
```
timestamp (Unix epoch seconds)
latitude (degrees, WGS84)
longitude (degrees, WGS84)
altitude (feet, barometric or geometric)
```
### 4.2 Derived Features
```python
import numpy as np

def derive_features(timestamps, lats, lons, alts):
    """
    Derive COG, SOG, ROT, and altitude rate from raw position data.

    All inputs: numpy arrays of shape (N,) for a single trajectory.
    Returns arrays of shape (N,); leading elements that cannot be
    derived (1 for COG/SOG/alt_rate, 2 for ROT) are NaN.
    """
    dt = np.diff(timestamps).astype(float)  # seconds
    dt = np.maximum(dt, 1e-6)               # avoid division by zero

    # --- Course Over Ground (COG): initial great-circle bearing ---
    lat1, lat2 = np.radians(lats[:-1]), np.radians(lats[1:])
    dlon = np.radians(np.diff(lons))
    x = np.sin(dlon) * np.cos(lat2)
    y = np.cos(lat1) * np.sin(lat2) - np.sin(lat1) * np.cos(lat2) * np.cos(dlon)
    COG = np.degrees(np.arctan2(x, y)) % 360  # [0, 360)

    # --- Speed Over Ground (SOG): haversine distance / time ---
    dlat = np.radians(np.diff(lats))
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    distance_nm = 3440.065 * c        # Earth radius in nautical miles
    SOG = distance_nm / (dt / 3600)   # knots

    # --- Rate of Turn (ROT) ---
    dCOG = np.diff(COG)
    dCOG = (dCOG + 180) % 360 - 180   # normalize to [-180, 180)
    ROT = np.full(len(lats), np.nan)
    ROT[2:] = dCOG / dt[1:]           # degrees per second

    # --- Rate of Altitude Change ---
    dalt = np.diff(alts)              # feet
    alt_rate = dalt / (dt / 60)       # feet per minute

    # Pad the first element (undefined for difference-based features)
    COG_full = np.concatenate([[np.nan], COG])
    SOG_full = np.concatenate([[np.nan], SOG])
    alt_rate_full = np.concatenate([[np.nan], alt_rate])
    return COG_full, SOG_full, ROT, alt_rate_full
```
### 4.3 Feature Discretization
| Feature       | Range             | Bin Width  | N_bins | Notes               |
|---------------|-------------------|------------|--------|---------------------|
| COG           | [0, 360)          | 5°         | 72     | Circular            |
| SOG           | [0, 600] kts      | 5 knots    | 121    | Capped at ~Mach 1   |
| ROT           | [-6, 6] °/s       | 0.25 °/s   | 49     | Capped at ±6 °/s    |
| Altitude Rate | [-6000, 6000] fpm | 200 ft/min | 61     | Capped at ±6000 fpm |

Outliers beyond the caps are clipped to the boundary bin.
### 4.4 Trajectory Preprocessing Pipeline
```
1. Segment raw ADS-B by ICAO24 + temporal gaps > 15 min -> individual flights
2. Resample to fixed Δt = 60 seconds (linear interp for position, circular for heading)
3. Derive features (COG, SOG, ROT, alt_rate)
4. Drop first 2 points per trajectory (NaN from derivation)
5. Filter: remove trajectories with < 20 points (< 20 minutes)
6. Compute H3 cell (res 5) + altitude band for each point
7. Discretize all continuous features into bins
8. Compute uncertainty scores (sliding window k=5)
9. Extract temporal features (hour, dow, month)
10. Construct prompt tokens from metadata (if available)
```
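The discretization in steps 6-7 can be sketched as follows, using the bin widths and caps from the table in Section 4.3 (the helper name is illustrative; the extra boundary bin absorbs the cap value itself, matching the stated bin counts):

```python
import numpy as np

def discretize(values, lo, hi, width):
    """Clip to [lo, hi] and return integer bin IDs starting at 0."""
    clipped = np.clip(np.asarray(values, dtype=float), lo, hi)
    return ((clipped - lo) // width).astype(int)

def discretize_state(cog, sog, rot, alt_rate):
    """Map continuous kinematic features to the bin vocabularies of Section 4.3."""
    return {
        "cog_bin": discretize(cog % 360.0, 0.0, 360.0 - 1e-9, 5.0),   # 72 bins, circular
        "sog_bin": discretize(sog, 0.0, 600.0, 5.0),                  # 121 bins incl. cap
        "rot_bin": discretize(rot, -6.0, 6.0, 0.25),                  # 49 bins incl. cap
        "alt_rate_bin": discretize(alt_rate, -6000.0, 6000.0, 200.0), # 61 bins incl. cap
    }
```

Values beyond the caps clip into the boundary bin, as specified above.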
---
## 5. Model Hyperparameters
### 5.1 Model Dimensions
| Parameter | Value | Rationale |
|------------------|--------|----------------------------------------------------|
| d_model | 256 | H3-CLM found 256-1024 effective |
| n_heads | 8 | head_dim = 32 |
| n_layers | 8 | Moderate depth for ~10M param model |
| d_ff | 1024 | 4Γ d_model (standard) |
| max_seq_len      | 128    | 128 states × 60 s ≈ 2 hours of flight              |
| n_prompt_tokens | 5 | [BOS, TASK, AIRCRAFT, PHASE, REGION] |
| dropout | 0.1 | |
**Total parameters**: ~8-12M (trainable on single GPU in hours)
### 5.2 Vocabulary Sizes
| Embedding | Vocab | Dim |
|------------------|--------|-----|
| H3 cells | 50,000 | 256 |
| Altitude bands | 46 | 256 |
| COG bins | 72 | 256 |
| SOG bins | 121 | 256 |
| ROT bins | 49 | 256 |
| Alt rate bins | 61 | 256 |
| Hour of day | 24 | 256 |
| Day of week | 7 | 256 |
| Month | 12 | 256 |
| Uncertainty bins | 16 | 256 |
| Prompt tokens | 24 | 256 |
### 5.3 State Token Composition
Each timestep maps to a single state token via additive fusion:
```
E_state_t = E_h3[h3_id_t] + E_alt_band[alt_band_t] # Geohash (3D position)
+ E_COG[cog_bin_t] + E_SOG[sog_bin_t] # Kinematics
+ E_ROT[rot_bin_t] + E_alt_rate[alt_rate_bin_t] # Dynamics
+ E_hour[hour_t] + E_dow[dow_t] + E_month[month_t] # Temporal
+ E_uncert[uncert_bin_t] # Uncertainty
E_state_t ∈ R^{d_model}
```
This additive fusion follows BERT (token + segment + position) and TrAISFormer.
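A shape-level sketch of the additive fusion, using random numpy arrays in place of learned `nn.Embedding` weights (vocabulary sizes from Section 5.2; the index values in the example are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 256

# Stand-in embedding tables (vocab sizes from Section 5.2). In the real model
# these are learned parameters; random values suffice to check shapes.
vocab = {"h3": 50_000, "alt_band": 46, "cog": 72, "sog": 121, "rot": 49,
         "alt_rate": 61, "hour": 24, "dow": 7, "month": 12, "uncert": 16}
tables = {name: rng.normal(size=(v, d_model)) for name, v in vocab.items()}

def fuse_state(ids):
    """Additive fusion: one row from every table is summed into a single
    d_model vector, i.e. exactly one state token per timestep."""
    return sum(tables[name][idx] for name, idx in ids.items())

state = fuse_state({"h3": 1234, "alt_band": 30, "cog": 37, "sog": 90, "rot": 24,
                    "alt_rate": 30, "hour": 14, "dow": 2, "month": 6, "uncert": 3})
```

Because every table projects to the same d_model, the sequence length stays T regardless of how many feature streams are fused.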
---
## 6. Training Recipe
### 6.1 Pretraining: Next-State Prediction (Causal LM)
**Objective**: Given states 1..T, predict state at T+1 (applied autoregressively at every position).
**Loss**:
```
L = Σ_{t=1}^{T-1} [ λ_geo  · CE(ŷ_geo_t, y_geo_{t+1})
                  + λ_COG  · CE(ŷ_COG_t, y_COG_{t+1})
                  + λ_SOG  · CE(ŷ_SOG_t, y_SOG_{t+1})
                  + λ_ROT  · CE(ŷ_ROT_t, y_ROT_{t+1})
                  + λ_alt  · CE(ŷ_alt_rate_t, y_alt_rate_{t+1})
                  + λ_altb · CE(ŷ_alt_band_t, y_alt_band_{t+1}) ]

λ values default to 1.0 (equal weighting).
```
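The objective can be sketched with plain numpy (head names and array shapes are illustrative; a PyTorch implementation would use one `nn.CrossEntropyLoss` per head):

```python
import numpy as np

def cross_entropy(logits, target):
    """-log softmax(logits)[target] for one position (numerically stable)."""
    z = logits - logits.max()
    return float(np.log(np.exp(z).sum()) - z[target])

def multitask_loss(head_logits, targets, lambdas=None):
    """Weighted sum over features of the mean CE between each head's
    prediction at position t and the true bin at t+1 (targets pre-shifted).
    head_logits: {feature: (T, vocab) array}; targets: {feature: (T,) array}."""
    lambdas = lambdas or {name: 1.0 for name in head_logits}
    total = 0.0
    for name, logits in head_logits.items():
        per_pos = [cross_entropy(logits[t], targets[name][t])
                   for t in range(len(targets[name]))]
        total += lambdas[name] * float(np.mean(per_pos))
    return total
```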
**Training hyperparameters** (based on FTP-LLM + H3-CLM):
| Parameter | Value |
|----------------------|---------------------|
| Optimizer | AdamW |
| Learning rate | 5e-4 |
| LR Schedule | Cosine + 5% warmup |
| Batch size (per GPU) | 64 |
| Gradient accumulation| 4 (effective = 256) |
| Max epochs           | 30 (early stopping, patience 5) |
| Weight decay | 0.01 |
| Gradient clipping | 1.0 |
| Mixed precision | bf16 |
**Data windowing**: Sliding window size=128, stride=64 (50% overlap).
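The windowing can be sketched as follows (under this scheme a trajectory shorter than one window yields no samples; the helper name is illustrative):

```python
def sliding_windows(n_points: int, size: int = 128, stride: int = 64):
    """(start, end) index pairs for fixed-size training windows, 50% overlap."""
    return [(s, s + size) for s in range(0, n_points - size + 1, stride)]
```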
### 6.2 Downstream: Activity Classification
After pretraining, attach classification head:
```
h_BOS -> Linear(256, 128) -> GELU -> Dropout(0.1) -> Linear(128, N_classes)
```
**Fine-tuning options**:
- **A**: Freeze backbone, train head only (fast, small data)
- **B**: Full fine-tune, backbone lr=1e-5, head lr=1e-3
---
## 7. Dataset Strategy
### 7.1 Prototyping β `traffic` Python Library
```python
from traffic.data.samples import landing_zurich_2019
# ~2,000 flights near Zurich
# Columns: timestamp, icao24, callsign, latitude, longitude, altitude,
# groundspeed, track, vertical_rate, ...
```
Instant access, clean, and well documented, but limited to a single airport with little diversity.
### 7.2 Training β OpenSky Network
```python
from pyopensky.trino import Trino
trino = Trino()
df = trino.rawquery("""
SELECT time, icao24, lat, lon, baroaltitude, velocity, heading, vertrate
FROM state_vectors_data4
WHERE hour >= '2024-01-15 00:00:00'
AND hour < '2024-01-15 12:00:00'
AND lat BETWEEN 40 AND 55
AND lon BETWEEN -10 AND 20
ORDER BY icao24, time
""")
```
**Target**:
- **Region A** (train): Europe, 1 month, ~500K-1M flights
- **Region B** (OOD test): US CONUS, 1 week, ~200K flights
- **Region C** (far test): East Asia, 1 week, ~100K flights
### 7.3 Alternative: SCAT Dataset
~170K en-route flights over Sweden, available on Zenodo; pre-segmented and clean.
### 7.4 Data Split
```
Training: 70% of Region A flights
Validation: 15% of Region A flights
Test (IID): 15% of Region A flights
Test (OOD): 100% of Region B flights
Test (Far): 100% of Region C flights
```
Split by **flight** (not time window) to avoid data leakage.
---
## 8. Ablation Study: Geohash Geographic Dependency
### 8.1 Hypothesis
> Geohash embeddings encode **absolute geographic position**, causing the model to memorize region-specific patterns (airways, approach paths, airspace structure). This improves in-distribution performance but degrades transfer to unseen regions.
### 8.2 Experimental Variants
| Variant | Geohash Type | Description |
|---------|-------------|-------------|
| **V1: Full Model** | H3 absolute | Complete architecture as described |
| **V2: No Geohash** | None | Remove geohash entirely; model sees only kinematics + temporal + uncertainty |
| **V3: Relative Geohash** | H3 relative | H3 cell of (Δlat, Δlon) from trajectory start; position-invariant |
| **V4: Multi-Resolution** | H3 res 3+5+7 | 3 resolutions summed (coarse to fine) |
| **V5: Continuous Position** | Linear projection | `Linear([lat, lon, alt]) -> d_model`; no discretization |
### 8.3 Evaluation Metrics
For each variant Γ each test set (IID, OOD, Far):
| Metric | Description |
|--------|-------------|
| Geo Accuracy | % correct H3 cell prediction |
| Position MAE | Mean absolute error in km |
| COG MAE | Heading error in degrees |
| SOG MAE | Speed error in knots |
| Multi-step ADE | Average displacement error over 5 predicted steps |
| Multi-step FDE | Final displacement error at step 5 |
### 8.4 Key Comparisons
| Comparison | Tests |
|-----------|-------|
| V1 vs V2 (IID) | How much geohash helps when test = train region |
| V1 vs V2 (OOD) | If V2 > V1 on OOD, geohash causes geographic overfitting |
| V1 vs V3 (OOD) | If V3 holds up on both IID and OOD, relative geohash is the sweet spot |
| V4 (all) | Multi-resolution: coarse cells transfer, fine cells specialize? |
| V5 (all) | Does continuous encoding avoid discretization issues? |
### 8.5 Expected Outcomes
- **V1**: Best IID, worst OOD (hypothesis)
- **V3**: Best compromise; the predicted winner
- **V5**: May struggle (loses discrete token structure transformers excel at)
- **V2**: Strong OOD baseline, sacrifices IID
### 8.6 Additional Analysis
- **Attention visualization**: V1 vs V3 attention patterns
- **Embedding clustering**: t-SNE of geohash embeddings colored by region
- **Learning curves**: IID vs OOD performance vs training data size
---
## 9. Implementation Phases
### Phase 1: Data Pipeline (Week 1)
- Set up `traffic` library, extract sample trajectories
- Implement feature derivation (COG, SOG, ROT, alt_rate)
- Implement H3 geohash encoding + altitude banding
- Implement feature discretization (binning)
- Implement uncertainty score computation
- Build PyTorch Dataset class with sliding window
- Unit tests for all derivation functions
### Phase 2: Model Architecture (Week 1-2)
- Implement all embedding tables
- Implement additive fusion layer
- Implement prompt token prepending
- Implement decoder-only transformer backbone
- Implement multi-head output (6 prediction heads)
- Implement classification head (for downstream)
- Forward pass test with dummy data
### Phase 3: Pretraining (Week 2-3)
- Implement training loop with multi-task loss
- Prototyping run on `traffic` data (small, fast iteration)
- Scale to OpenSky data
- Monitor loss curves, validate convergence
- Save best checkpoint
### Phase 4: Downstream Adaptation (Week 3-4)
- Implement classification fine-tuning pipeline
- Test on activity classification task
- Compare frozen vs. fine-tuned backbone
### Phase 5: Ablation Study (Week 4-5)
- Implement all 5 geohash variants
- Train each variant with identical hyperparameters
- Evaluate on IID, OOD, and Far test sets
- Generate comparison tables and visualizations
- Write analysis of geographic dependency findings
---
## 10. Key Design Decisions & Rationale
| Decision | Choice | Why |
|----------|--------|-----|
| Custom model vs. pretrained LLM | Custom ~10M param transformer | FTP-LLM showed text-tokenized LLMs work, but custom allows proper multi-feature fusion. 10M params trains in hours. |
| H3 vs. traditional geohash | H3 | Uniform hexagonal cells, no polar distortion, hierarchical. Proven by H3-CLM. |
| Additive vs. concatenative fusion | Additive | BERT/TrAISFormer paradigm. Keeps d_model constant; concatenation would grow to d_model × N_features. |
| 60s time resolution | 60 seconds | FTP-LLM validated 1-min aggregation. 128 steps ≈ 2+ hours. |
| Factored geohash (H3 + alt) | Separate tables, summed | Avoids combinatorial explosion (9.2M -> 50K + 46). |
| Multi-head output | Separate softmax per feature | More interpretable, allows per-feature analysis. |
| Uncertainty from smoothness | Variance-based | Computable at data time, no inference overhead. |
---
## 11. Risk Analysis
| Risk | Likelihood | Impact | Mitigation |
|------|-----------|--------|------------|
| Geohash overfits to region | High | High | Ablation study; V3 (relative) is fallback |
| OpenSky access issues | Medium | High | Fallback: `traffic` samples + SCAT |
| 60s too coarse for terminal | Medium | Low | Separate terminal model at 10s |
| Model too small | Low | Medium | Scale up: d_model -> 512, n_layers -> 16 (~40M params) |
| Alt discretization too coarse | Low | Low | Refine to 500 ft bands (92 bands) |
---
## 12. Monitoring & Evaluation
**During training** (Trackio):
- Total loss + per-feature loss curves
- Validation loss each epoch
- LR schedule, GPU utilization
**After training**:
- Next-state accuracy (top-1, top-5 per feature)
- Position error in km
- Multi-step prediction (1, 5, 10, 20 steps ahead)
- Downstream classification F1/precision/recall
---
*Grounded in: FTP-LLM, H3-CLM, GeoFormer, TrAISFormer, and LLM4STP (reconstructed). Ready for implementation upon approval.*