Jdice27 committed on
Commit 48b8bfe · verified · 1 Parent(s): 9ca1785

Add README with architecture overview

Files changed (1)
  1. README.md +47 -0
README.md ADDED
@@ -0,0 +1,47 @@
+ # AirTrackLM
+
+ A decoder-only transformer for ADS-B air track next-state prediction, adapted from the LLM4STP architecture.
+
+ ## Architecture
+
+ - **Model**: Custom ~7M parameter decoder-only transformer
+ - **4 Embedding Types**: Geohash (40-bit binary, 3D), Kinematic Features (COG/SOG/ROT/AltRate), Temporal (sub-second sinusoidal), Uncertainty (4 methods + learned heteroscedastic)
+ - **Pretraining**: Next-state prediction (predict all features at t+1 from sequence up to t)
+ - **Coordinate System**: ENU (East-North-Up) with 3-point central derivative for velocity computation
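
As an illustration of the coordinate-system bullet, the following is a minimal sketch of a 3-point central difference over ENU positions, from which speed over ground (SOG), course over ground (COG), and altitude rate can be derived. The function names, the dictionary layout, and the choice to skip the two endpoint samples are assumptions made for this example; they are not taken from `data_pipeline.py`.

```python
import math

def central_difference_velocity(t, east, north, up):
    """Derive SOG, COG and altitude rate at interior samples.

    Uses (p[i+1] - p[i-1]) / (t[i+1] - t[i-1]) on each ENU axis.
    Endpoints have no two-sided neighbours, so they are skipped here.
    """
    out = []
    for i in range(1, len(t) - 1):
        dt = t[i + 1] - t[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        v_e = (east[i + 1] - east[i - 1]) / dt
        v_n = (north[i + 1] - north[i - 1]) / dt
        v_u = (up[i + 1] - up[i - 1]) / dt
        sog = math.hypot(v_e, v_n)  # horizontal speed, m/s
        # Course measured clockwise from north, in [0, 360) degrees.
        cog = math.degrees(math.atan2(v_e, v_n)) % 360.0
        out.append({"t": t[i], "sog": sog, "cog": cog, "alt_rate": v_u})
    return out

def rate_of_turn(cog_prev, cog_next, dt):
    """Signed rate of turn in deg/s, wrapping the change into [-180, 180)."""
    delta = (cog_next - cog_prev + 180.0) % 360.0 - 180.0
    return delta / dt
```

The wrap in `rate_of_turn` matters because a turn from 350° to 10° is a 20° right turn, not a 340° left turn.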
+
+ ## Uncertainty Methods
+
+ 1. **Kinematic Variance**: Sliding-window variance of COG/SOG/ROT/alt_rate
+ 2. **Prediction Residual**: Deviation from constant-velocity prediction model
+ 3. **Spatial Density**: Data coverage proxy (fewer nearby training points = higher uncertainty)
+ 4. **Flight Phase Entropy**: Entropy of phase classification in a window (mixed phases = uncertain)
+ 5. **Learned Heteroscedastic**: Model predicts its own log-variance per output head (aleatoric)
+ 6. **MC-Dropout**: Monte Carlo dropout at inference for epistemic uncertainty
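
To make methods 1, 2, and 5 more concrete, here is a small sketch in plain Python. It assumes a trailing window, population variance, a Euclidean residual in ENU metres, and the standard Gaussian negative log-likelihood with a predicted log-variance. All function names and signatures are hypothetical and may differ from what `uncertainty.py` and `model.py` actually do.

```python
import math

def sliding_variance(values, window):
    """Population variance over a trailing window ending at each sample.

    Note: COG is circular, so a real pipeline would need to unwrap or use
    circular statistics before applying this to headings.
    """
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        mean = sum(chunk) / len(chunk)
        out.append(sum((v - mean) ** 2 for v in chunk) / len(chunk))
    return out

def constant_velocity_residual(p_prev, p_curr, p_next, dt_prev, dt_next):
    """Distance between the observed next position and a constant-velocity
    extrapolation from the previous two positions."""
    predicted = [c + (c - p) / dt_prev * dt_next for p, c in zip(p_prev, p_curr)]
    return math.dist(predicted, p_next)

def gaussian_nll(target, mean, log_var):
    """Heteroscedastic Gaussian NLL for one scalar output (constant dropped).

    Predicting log-variance keeps the variance positive without clamping.
    """
    return 0.5 * (log_var + (target - mean) ** 2 / math.exp(log_var))
```

In the loss, a large predicted `log_var` down-weights the squared error but is itself penalised, which is what lets a model express per-sample aleatoric uncertainty.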
+
+ ## Features
+
+ - **Inputs**: Raw ADS-B (lat, lon, alt, timestamp)
+ - **Derived**: COG, SOG, ROT, altitude rate via 3-point central derivative on ENU positions
+ - **Geohash**: 40-bit binary encoding per axis (E, N, U) = 120-bit 3D position token
+ - **Temporal**: Sinusoidal second-of-day (sub-second resolution) + calendar embeddings + Δt encoding
+ - **Output Heads**: Binary geohash prediction, continuous Δ-ENU regression, COG/SOG/ROT/AltRate bin classification
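
One way to read the 40-bit-per-axis geohash bullet is as repeated bisection of a fixed range along each ENU axis, geohash-style, with the three bit strings joined into a 120-bit token. The sketch below shows that interpretation only. The axis ranges, the concatenated (rather than interleaved) bit order, and the function names are assumptions for illustration and are not confirmed by the README.

```python
def encode_axis(value, lo, hi, bits=40):
    """Bisect [lo, hi) `bits` times; each bit records which half holds `value`."""
    if not lo <= value < hi:
        raise ValueError("value outside the encoding range")
    out = []
    for _ in range(bits):
        mid = (lo + hi) / 2.0
        if value >= mid:
            out.append(1)
            lo = mid
        else:
            out.append(0)
            hi = mid
    return out

def decode_axis(bits, lo, hi):
    """Return the centre of the cell addressed by `bits`."""
    for b in bits:
        mid = (lo + hi) / 2.0
        if b:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def encode_enu(e, n, u,
               horiz_range=(-500_000.0, 500_000.0),  # metres, assumed extent
               vert_range=(0.0, 20_000.0)):          # metres, assumed extent
    """Concatenate the E, N and U bit strings into one 120-bit token."""
    return (encode_axis(e, *horiz_range)
            + encode_axis(n, *horiz_range)
            + encode_axis(u, *vert_range))
```

With the assumed 1000 km horizontal extent, 40 bits give a cell width of about one micrometre, so quantisation error is negligible next to ADS-B position noise. Leading bits are shared by nearby positions, which is the property a binary prediction head can exploit.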
+
+ ## Data
+
+ Training data from the `traffic` Python library (real ADS-B surveillance data).
+
+ ## Files
+
+ - `model.py`: Full model architecture (AirTrackLM, embeddings, loss functions)
+ - `data_pipeline.py`: ENU conversion, 3-point derivatives, geohash encoding, dataset
+ - `uncertainty.py`: 6 uncertainty quantification methods
+ - `train.py`: Training utilities
+ - `train_full.py`: Full GPU training script with Hub push
+ - `ARCHITECTURE.md`: Detailed architecture document
+
+ ## Based On
+
+ - **LLM4STP** (Joker-hang/LLM4STP): Binary geohash encoding, GPT-2 backbone concept
+ - **FTP-LLM** (arXiv:2501.17459): LLM for flight trajectory prediction
+ - **H3-CLM** (arXiv:2405.09596): Hexagonal geohash + causal LM for maritime trajectories
+ - **GeoFormer** (arXiv:2311.05092): GPT-style geospatial tokenization