flashvenom committed on
Commit
69c6b58
·
0 Parent(s):

Add images and visual enhancements to README

Files changed (9)
  1. .gitattributes +4 -0
  2. README.md +283 -0
  3. config.json +81 -0
  4. lacuna-end.png +3 -0
  5. lacuna.png +3 -0
  6. model.safetensors +3 -0
  7. normalization_stats.npz +3 -0
  8. trades.csv +3 -0
  9. updates.csv +3 -0
.gitattributes ADDED
@@ -0,0 +1,4 @@
1
+ *.png filter=lfs diff=lfs merge=lfs -text
2
+ *.csv filter=lfs diff=lfs merge=lfs -text
3
+ *.npz filter=lfs diff=lfs merge=lfs -text
4
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,283 @@
1
+ ---
2
+ license: mit
3
+ tags:
4
+ - reinforcement-learning
5
+ - ppo
6
+ - trading
7
+ - prediction-markets
8
+ - polymarket
9
+ - crypto
10
+ - cross-market
11
+ - temporal-encoding
12
+ language:
13
+ - en
14
+ library_name: pytorch
15
+ pipeline_tag: reinforcement-learning
16
+ thumbnail: lacuna.png
17
+ ---
18
+
19
+ <div align="center">
20
+
21
+ ![LACUNA](lacuna.png)
22
+
23
+ # LACUNA
24
+
25
+ **Cross-Market Data Fusion for Prediction Market Trading**
26
+
27
+ [![Live Results](https://img.shields.io/badge/Live%20Results-humanplane.com%2Flacuna-blue)](https://humanplane.com/lacuna)
28
+ [![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
29
+ [![PyTorch](https://img.shields.io/badge/PyTorch-2.0+-red.svg)](https://pytorch.org/)
30
+
31
+ </div>
32
+
33
+ ---
34
+
35
+ An experiment in cross-market data fusion. A reinforcement learning agent trained to trade Polymarket's 15-minute crypto prediction markets by fusing Binance futures order flow with Polymarket orderbook data.
36
+
37
+ **The thesis**: read the "fast" market (Binance) and trade the "slow" market (Polymarket) before the price adjusts.
38
+
39
+ ## Results
40
+
41
+ | Metric | Value |
42
+ |--------|-------|
43
+ | Total PnL | $50,195 |
44
+ | Return on Exposure | 2,510% |
45
+ | Sharpe Ratio | 4.13 |
46
+ | Profit Factor | 1.21 |
47
+ | Total Trades | 29,176 |
48
+ | Win Rate | 23.9% |
49
+ | Runtime | ~10 hours |
50
+
51
+ ### Learning Progression
52
+
53
+ Both average PnL per trade and win rate rose from the early to the late part of the run:
54
+
55
+ | Phase | Avg PnL/Trade | Win Rate |
56
+ |-------|---------------|----------|
57
+ | Early | +$1.29 | 23.6% |
58
+ | Late | +$2.15 | 24.2% |
59
+
60
+ **+0.6 percentage point win rate improvement** and **+$0.85 avg PnL improvement** per trade.
61
+
62
+ ### Performance by Asset
63
+
64
+ | Asset | PnL | Trades | Win Rate |
65
+ |-------|-----|--------|----------|
66
+ | BTC | +$38,794 | 8,257 | 32.6% |
67
+ | ETH | +$9,978 | 7,859 | 27.0% |
68
+ | SOL | +$1,752 | 6,310 | 16.3% |
69
+ | XRP | -$328 | 6,750 | 16.7% |
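The per-asset rows can be checked against the headline metrics. The sketch below is only a consistency check on the numbers printed in the tables above; the variable names are illustrative and are not taken from the training code.

```python
# Per-asset figures copied from the table above: (PnL in USD, trades, win rate in %).
assets = {
    "BTC": (38_794, 8_257, 32.6),
    "ETH": (9_978, 7_859, 27.0),
    "SOL": (1_752, 6_310, 16.3),
    "XRP": (-328, 6_750, 16.7),
}

total_pnl = sum(pnl for pnl, _, _ in assets.values())
total_trades = sum(trades for _, trades, _ in assets.values())

# Trade-weighted win rate: each asset counts in proportion to its number of trades.
weighted_win_rate = (
    sum(trades * win_rate for _, trades, win_rate in assets.values()) / total_trades
)

print(total_pnl, total_trades, round(weighted_win_rate, 1))
```

The trade count matches the headline total, the weighted win rate rounds to the reported 23.9%, and the PnL sum lands within about a dollar of $50,195, which is what per-asset rounding would produce.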
70
+
71
+ ## Architecture
72
+
73
+ LACUNA (v5) uses a temporal PPO architecture:
74
+
75
+ - **Temporal Encoder**: Sees the last 5 states instead of only the current one
76
+ - **Asymmetric Actor-Critic**: Separate networks for policy and value
77
+ - **Feature Normalization**: Stabilizes training across different market conditions
78
+
79
+ ### Model Constraints
80
+
81
+ - Fixed position size: $500 per trade
82
+ - Max exposure: $2,000 (up to 4 concurrent positions)
83
+ - Markets: 15-minute crypto prediction markets (BTC, ETH, SOL, XRP)
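These constraints also account for the headline return figure. The sketch below assumes that "Return on Exposure" means total PnL divided by the maximum capital at risk; that reading is an interpretation, not something the README states.

```python
# Values taken from the constraints and results reported in this README.
position_size_usd = 500
max_concurrent_positions = 4
total_pnl_usd = 50_195

# Maximum capital at risk at any one time.
max_exposure_usd = position_size_usd * max_concurrent_positions

# Assumed definition: PnL as a percentage of maximum exposure.
return_on_exposure_pct = 100 * total_pnl_usd / max_exposure_usd

print(max_exposure_usd, round(return_on_exposure_pct))
```

Under that assumption the result rounds to the reported 2,510%.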
84
+
85
+ ### Observation Space (18 dimensions)
86
+
87
+ Fuses data from two sources into an 18-dimensional state:
88
+
89
+ | Category | Features |
90
+ |----------|----------|
91
+ | Momentum | 1m/5m/10m returns |
92
+ | Order flow | L1/L5 imbalance, trade flow, CVD acceleration |
93
+ | Microstructure | Spread %, trade intensity, large trade flag |
94
+ | Volatility | 5m vol, vol expansion ratio |
95
+ | Position | Has position, side, PnL, time remaining |
96
+ | Regime | Vol regime, trend regime |
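As a rough illustration of how such a state could be assembled, the sketch below orders named features into an 18-element vector and keeps a rolling 5-frame history. The feature names and history length come from `config.json`; the `HistoryBuffer` class, the zero padding at start-up, and the flattening into one 90-element list are assumptions made for this example, and the actual encoder may consume the history in a different shape.

```python
from collections import deque

# Feature order as listed in config.json.
FEATURES = [
    "return_1m", "return_5m", "return_10m",
    "l1_imbalance", "l5_imbalance", "trade_flow", "cvd_acceleration",
    "spread_pct", "trade_intensity", "large_trade_flag",
    "vol_5m", "vol_expansion",
    "has_position", "position_side", "position_pnl", "time_remaining",
    "vol_regime", "trend_regime",
]
HISTORY_LENGTH = 5  # "history_length" in config.json


def to_vector(snapshot):
    """Order a dict of named features into an 18-element list."""
    return [float(snapshot[name]) for name in FEATURES]


class HistoryBuffer:
    """Hypothetical helper: keeps the most recent HISTORY_LENGTH observations."""

    def __init__(self):
        # Assumption: pad with all-zero frames until real observations arrive.
        zero_frame = [0.0] * len(FEATURES)
        self._frames = deque([zero_frame] * HISTORY_LENGTH, maxlen=HISTORY_LENGTH)

    def push(self, snapshot):
        # Appending to a full deque drops the oldest frame automatically.
        self._frames.append(to_vector(snapshot))

    def stacked(self):
        # Oldest frame first, flattened into a single list (an assumed layout).
        return [value for frame in self._frames for value in frame]


buf = HistoryBuffer()
buf.push({**{name: 0.0 for name in FEATURES}, "return_1m": 0.01})
print(len(buf.stacked()))  # 90 = 5 frames x 18 features
```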
97
+
98
+ ## Training Evolution
99
+
100
+ Five phases over three days. Each taught us something. Only the last earned a name.
101
+
102
+ ### Phase 1: Shaped Rewards (Failed)
103
+
104
+ **Duration**: ~52 min | **Trades**: 1,545 | **Result**: Policy collapse
105
+
106
+ Started with micro-bonuses to guide learning:
107
+ - +0.002 for trading with momentum
108
+ - +0.001 for larger positions
109
+ - -0.001 for fighting momentum
110
+
111
+ **What happened**: Entropy collapsed from 1.09 → 0.36. The agent learned to game the reward function—collect bonuses while ignoring actual profitability. Buffer showed 90% win rate while real trade win rate was 20%.
112
+
113
+ **Lesson**: Reward shaping is risky. When shaping rewards are gameable and of similar magnitude to the real signal, agents optimize the wrong thing.
114
+
115
+ ### Phase 2: Pure Realized PnL
116
+
117
+ **Duration**: ~1 hour | **Trades**: 2,000+ | **Result**: 55% ROI
118
+
119
+ Stripped everything back:
120
+ - Reward ONLY on position close
121
+ - Increased entropy coefficient (0.05 → 0.10)
122
+ - Simplified actions (7 → 3)
123
+ - Smaller buffer (2048 → 512)
124
+
125
+ | Update | Entropy | PnL | Win Rate |
126
+ |--------|---------|-----|----------|
127
+ | 1 | 0.68 | $5.20 | 33.3% |
128
+ | 36 | 1.05 | $10.93 | 21.2% |
129
+
130
+ Win rate settled at 21%—below random (33%)—but profitable. Binary markets have asymmetric payoffs. (Still using probability-based PnL at this point.)
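One way to see how a win rate that low can still be profitable is a break-even calculation. This is a minimal sketch under simplifying assumptions that are not from the source: a share is bought at `entry_price`, held to resolution, pays $1 if it wins and $0 if it loses, and fees and early exits are ignored. It uses share-style payoffs for illustration, even though Phase 2 itself still used probability-based PnL.

```python
def expected_pnl_per_dollar(win_prob, entry_price):
    """Expected PnL per $1 staked on a binary share held to resolution."""
    shares = 1.0 / entry_price                  # shares bought with $1
    win_payout = shares * (1.0 - entry_price)   # each winning share gains (1 - price)
    loss = -1.0                                 # a losing stake is lost entirely
    return win_prob * win_payout + (1.0 - win_prob) * loss


# Break-even happens when the win probability equals the entry price.
at_break_even = expected_pnl_per_dollar(0.21, 0.21)

# Hypothetical cheaper entry: the same 21% win rate now has positive expectation.
cheap_entry = expected_pnl_per_dollar(0.21, 0.15)

print(at_break_even, cheap_entry)
```

The expression simplifies to `win_prob / entry_price - 1`, so under these assumptions a 21% win rate is profitable whenever the average entry price is below $0.21.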
131
+
132
+ ### Phase 3: Scaled Up ($50 trades)
133
+
134
+ **Duration**: ~50 min | **Trades**: 4,133 | **Result**: -$64 → +$23
135
+
136
+ First update hit -$64 drawdown. But the agent recovered:
137
+
138
+ | Update | PnL | Win Rate |
139
+ |--------|-----|----------|
140
+ | 1 | -$63.75 | 29.5% |
141
+ | 36 | +$23.10 | 15.6% |
142
+
143
+ **Lesson**: The agent can recover from large adverse moves without policy collapse.
144
+
145
+ ### Phase 4: Share-Based PnL ($500 trades)
146
+
147
+ **Duration**: ~1 hour | **Trades**: 4,873 | **Result**: 170% ROI
148
+
149
+ Changed reward signal to reflect actual market economics:
150
+
151
+ ```python
152
+ # Old: probability-based
153
+ pnl = (exit_price - entry_price) * dollars
154
+
155
+ # New: share-based
156
+ shares = dollars / entry_price
157
+ pnl = (exit_price - entry_price) * shares
158
+ ```
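To make the difference concrete, the sketch below applies both formulas to the same 10-cent price move at two entry prices. The prices are hypothetical; the $500 stake is the Phase 4 trade size.

```python
def pnl_probability_based(entry_price, exit_price, dollars):
    """Old signal: the price change scaled by dollars staked."""
    return (exit_price - entry_price) * dollars


def pnl_share_based(entry_price, exit_price, dollars):
    """New signal: the price change scaled by the number of shares bought."""
    shares = dollars / entry_price
    return (exit_price - entry_price) * shares


# Same +0.10 move, once from a cheap entry and once from an expensive one.
cheap_old = pnl_probability_based(0.10, 0.20, 500)
cheap_new = pnl_share_based(0.10, 0.20, 500)
pricey_old = pnl_probability_based(0.80, 0.90, 500)
pricey_new = pnl_share_based(0.80, 0.90, 500)

print(cheap_old, cheap_new, pricey_old, pricey_new)
```

The probability-based formula scores both trades at roughly $50, while the share-based formula scores the cheap entry at roughly $500 and the expensive one at roughly $62.50. The share-based value is the probability-based value divided by the entry price, so it rewards cheap entries much more strongly.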
159
+
160
+ | Update | PnL | Win Rate |
161
+ |--------|-----|----------|
162
+ | 1 | -$197 | 18.9% |
163
+ | 20 | -$465 | 18.5% |
164
+ | 46 | +$3,392 | 19.0% |
165
+
166
+ **4.5x improvement** over Phase 3's reward signal.
167
+
168
+ ### Phase 5: LACUNA (Final)
169
+
170
+ **Duration**: ~10 hours | **Trades**: 29,176 | **Result**: 2,510% ROI
171
+
172
+ Architecture rethink:
173
+ - **Temporal encoder**: 5-state history instead of single-frame
174
+ - **Asymmetric actor-critic**: Separate network capacities
175
+ - **Feature normalization**: Stable across market regimes
176
+
177
+ It started with a big loss. Seemed broken. Left it running on New Year's Eve while counting down to midnight—not out of hope, just neglect.
178
+
179
+ Checked back hours later. The equity curve had inflected. By morning: **+$50,195**.
180
+
181
+ Phases 1-4 might have done the same. They just never got the time. Phase 5 got the time it needed—because it had been written off.
182
+
183
+ ---
184
+
185
+ ## Emergent Behaviors
186
+
187
+ These weren't explicitly rewarded. The model discovered them while optimizing for profit—and we can see them evolve over time.
188
+
189
+ | Behavior | Early → Late | What It Learned |
190
+ |----------|--------------|-----------------|
191
+ | **Low volatility specialist** | Consistent | $4.07/trade on calm markets vs -$1.44 on volatile |
192
+ | **Hunts cheap outcomes** | 23% → 39% of trades | Cheap entries yield $8.63/trade vs $1.53 for expensive |
193
+ | **Rides DOWN momentum** | Consistent 77% | Bets DOWN when prob is falling → +$97k net profit |
194
+ | **Fat tail capture** | $5.8k → $20.5k net | Learned to position for asymmetric payoffs |
195
+ | **Recovery after loss streaks** | 47% WR after 3+ losses | Anti-tilt behavior (vs 24% baseline) |
196
+ | **Avg PnL per trade** | $1.62 → $4.30 | 2.7x improvement through genuine learning |
197
+
198
+ Consistent throughout: **Cuts winners fast** (0.35x hold time vs losers). This runs against the usual advice to let winners run, but it works in these markets.
199
+
200
+ ## Key Takeaways
201
+
202
+ 1. **Reward shaping is dangerous** - When shaping rewards are gameable and of similar magnitude to the real signal, agents optimize the wrong thing. Sparse but honest > dense but noisy.
203
+
204
+ 2. **Reward signal design matters** - Share-based PnL outperformed probability-based by 4.5x ROI. Match actual market economics.
205
+
206
+ 3. **Entropy coefficient matters** - 0.05 caused policy collapse; 0.10 maintained healthy exploration.
207
+
208
+ 4. **Watch buffer/trade win rate divergence** - When these diverge, the agent is optimizing the wrong objective.
209
+
210
+ 5. **Give it time** - Learning and recovery from drawdowns take time. Early performance is not indicative of final results.
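For the entropy point, it helps to know the scale. The sketch below assumes the reported entropies are Shannon entropies in nats over the three discrete actions in `config.json`; the README does not state the units, and the "peaked" distribution is a made-up example of a collapsing policy.

```python
import math


def entropy(probs):
    """Shannon entropy in nats of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


uniform = [1 / 3, 1 / 3, 1 / 3]  # HOLD, BUY_UP, BUY_DOWN equally likely
peaked = [0.9, 0.05, 0.05]       # hypothetical policy that almost always picks one action

max_entropy = entropy(uniform)    # equals ln(3), about 1.0986
peaked_entropy = entropy(peaked)

print(round(max_entropy, 4), round(peaked_entropy, 4))
```

Under that assumption, an entropy near 1.05 is close to uniform exploration across three actions, while a value near 0.36 corresponds to a policy that has committed almost entirely to one action.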
211
+
212
+ ---
213
+
214
+ ## The Story
215
+
216
+ This is our final checkpoint. We're done experimenting with LACUNA, but you don't have to be.
217
+
218
+ ## Usage
219
+
220
+ ```python
221
+ import torch
222
+ from safetensors.torch import load_file
223
+ import numpy as np
224
+ import json
225
+
226
+ # Load model weights
227
+ weights = load_file("model.safetensors")
228
+
229
+ # Load normalization stats (for preprocessing observations)
230
+ stats = np.load("normalization_stats.npz")
231
+ obs_mean = stats["obs_mean"]
232
+ obs_std = stats["obs_std"]
233
+
234
+ # Load config for architecture details
235
+ with open("config.json") as f:
236
+     config = json.load(f)
237
+
238
+ # Normalize observations before inference
239
+ def normalize_obs(obs):
240
+     return (obs - obs_mean) / (obs_std + 1e-8)
241
+ ```
242
+
243
+ ## Files
244
+
245
+ - `README.md` - This documentation
246
+ - `config.json` - Model configuration and architecture details
247
+ - `model.safetensors` - Model weights in SafeTensors format
248
+ - `normalization_stats.npz` - Observation normalization statistics
249
+ - `trades.csv` - All 29,176 trades with full details
250
+ - `updates.csv` - Training updates with metrics over time
251
+
252
+ ## Links
253
+
254
+ - [Live Results](https://humanplane.com/lacuna) - Interactive visualization
255
+ - [Training Code](https://github.com/humanplane/cross-market-state-fusion) - GitHub repository
256
+
257
+ ---
258
+
259
+ <div align="center">
260
+
261
+ ![LACUNA Results](lacuna-end.png)
262
+
263
+ *Final equity curve after 10 hours of live trading*
264
+
265
+ </div>
266
+
267
+ ---
268
+
269
+ ## License
270
+
271
+ MIT
272
+
273
+ ## Citation
274
+
275
+ ```bibtex
276
+ @misc{lacuna2025,
277
+ author = {HumanPlane},
278
+ title = {LACUNA: Cross-Market Data Fusion for Prediction Market Trading},
279
+ year = {2025},
280
+ publisher = {HuggingFace},
281
+ url = {https://huggingface.co/HumanPlane/LACUNA}
282
+ }
283
+ ```
config.json ADDED
@@ -0,0 +1,81 @@
1
+ {
2
+ "model_type": "ppo_temporal",
3
+ "version": "5.0",
4
+ "name": "LACUNA",
5
+ "description": "Cross-market data fusion RL agent for prediction market trading",
6
+
7
+ "architecture": {
8
+ "type": "asymmetric_actor_critic",
9
+ "temporal_encoder": {
10
+ "history_length": 5,
11
+ "hidden_size": 128
12
+ },
13
+ "actor": {
14
+ "hidden_layers": [256, 128],
15
+ "activation": "tanh"
16
+ },
17
+ "critic": {
18
+ "hidden_layers": [256, 128],
19
+ "activation": "tanh"
20
+ }
21
+ },
22
+
23
+ "observation_space": {
24
+ "dimensions": 18,
25
+ "features": [
26
+ {"name": "return_1m", "category": "momentum"},
27
+ {"name": "return_5m", "category": "momentum"},
28
+ {"name": "return_10m", "category": "momentum"},
29
+ {"name": "l1_imbalance", "category": "order_flow"},
30
+ {"name": "l5_imbalance", "category": "order_flow"},
31
+ {"name": "trade_flow", "category": "order_flow"},
32
+ {"name": "cvd_acceleration", "category": "order_flow"},
33
+ {"name": "spread_pct", "category": "microstructure"},
34
+ {"name": "trade_intensity", "category": "microstructure"},
35
+ {"name": "large_trade_flag", "category": "microstructure"},
36
+ {"name": "vol_5m", "category": "volatility"},
37
+ {"name": "vol_expansion", "category": "volatility"},
38
+ {"name": "has_position", "category": "position"},
39
+ {"name": "position_side", "category": "position"},
40
+ {"name": "position_pnl", "category": "position"},
41
+ {"name": "time_remaining", "category": "position"},
42
+ {"name": "vol_regime", "category": "regime"},
43
+ {"name": "trend_regime", "category": "regime"}
44
+ ]
45
+ },
46
+
47
+ "action_space": {
48
+ "type": "discrete",
49
+ "actions": ["HOLD", "BUY_UP", "BUY_DOWN"]
50
+ },
51
+
52
+ "training": {
53
+ "algorithm": "PPO",
54
+ "entropy_coefficient": 0.10,
55
+ "learning_rate": 0.0003,
56
+ "gamma": 0.99,
57
+ "gae_lambda": 0.95,
58
+ "clip_range": 0.2,
59
+ "n_epochs": 10,
60
+ "batch_size": 64,
61
+ "buffer_size": 2048
62
+ },
63
+
64
+ "constraints": {
65
+ "position_size_usd": 500,
66
+ "max_exposure_usd": 2000,
67
+ "max_concurrent_positions": 4,
68
+ "markets": ["BTC", "ETH", "SOL", "XRP"],
69
+ "market_type": "15-minute crypto prediction"
70
+ },
71
+
72
+ "performance": {
73
+ "total_pnl_usd": 50195,
74
+ "return_on_exposure_pct": 2510,
75
+ "sharpe_ratio": 4.13,
76
+ "profit_factor": 1.21,
77
+ "total_trades": 29176,
78
+ "win_rate_pct": 23.9,
79
+ "runtime_hours": 10
80
+ }
81
+ }
lacuna-end.png ADDED

Git LFS Details

  • SHA256: 931c63d7bfbc589749b0b18b3ce3875519d8aa98b52734284870a332ac10e477
  • Pointer size: 132 Bytes
  • Size of remote file: 2.26 MB
lacuna.png ADDED

Git LFS Details

  • SHA256: 0cb5832a5c1a2dd0191f59f18ed6844edf33d9b63c163e7ac875b83ae914aa2b
  • Pointer size: 132 Bytes
  • Size of remote file: 1.02 MB
model.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0f910dfb6d82c9a2a660f8f335f976264bc78eb37537799c0306d7b773b8be3f
3
+ size 158112
normalization_stats.npz ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:16ee9885e5d54ab1056e0ec86960823a6a8a125b8c328674afc262e3a798202d
3
+ size 2642
trades.csv ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:52ea3afd7056bdb48d5a392b54186b47da8e28953ab06df23c4dedfde0965eef
3
+ size 4000486
updates.csv ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:aecb2a6f393edd51f7d3a0d4e316e2eba2cdd6df0c029d31b236828555145e38
3
+ size 177966