hamverbot committed · Commit 7028d46 · verified · Parent: 3c25a45

Upload RESEARCH_RESOURCES.md

Files changed: RESEARCH_RESOURCES.md (added, +565 lines)

# RTB Bidding Algorithm Comparison — Complete Research Resource List

> Generated: 2026-05-05 | Author: ML Intern for hamverbot
> Repository: https://huggingface.co/hamverbot/rtb-bidding-comparison

---

## Table of Contents

1. [Bidding Algorithms](#1-bidding-algorithms)
2. [CTR Prediction Models](#2-ctr-prediction-models)
3. [Clearing Price / Market Price Prediction](#3-clearing-price--market-price-prediction)
4. [Datasets](#4-datasets)
5. [Codebases & Implementations](#5-codebases--implementations)
6. [Benchmark Leaderboards](#6-benchmark-leaderboards)
7. [Recommended Architecture](#7-recommended-architecture)

---

## 1. Bidding Algorithms

### 1.1 Lagrangian Dual + Online Gradient Descent (BEST MATCH)

| Property | Detail |
|----------|--------|
| **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
| **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong (2023) |
| **Venue** | NeurIPS 2023 (implied) |
| **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
| **HF Papers** | https://huggingface.co/papers/2304.13477 |
| **Algorithm** | DualOGD — Lagrangian dual multiplier updated by online error gradient descent |
| **Auction Type** | First-price (also handles second-price) |
| **Constraints** | Budget cap: total spend ≤ ρT |
| **Regret Bound** | Õ(√T) for both full-information and one-sided feedback |
| **Key Formula** | λ_{t+1} = Proj_{λ≥0}(λ_t − ε·(ρ − c̃_t(b_t))) |
| **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b)) |
| **Prediction Models Needed** | CTR predictor (for v_t), empirical CDF of competing bids (G̃) |
| **Why It's the Best Match** | You explicitly described "Lagrangian dual multiplier and updating the dual variables online by error gradient descent" — this is exactly Algorithm 1, line 7. |

### 1.2 Dual Mirror Descent (Second-Price)

| Property | Detail |
|----------|--------|
| **Paper** | "The Best of Many Worlds: Dual Mirror Descent for Online Allocation Problems" |
| **Authors** | Santiago Balseiro, Haihao Lu, Vahab Mirrokni (2020) |
| **Venue** | Operations Research (2023) / NeurIPS 2020 Workshop |
| **arXiv** | [2011.10124](https://arxiv.org/abs/2011.10124) |
| **HF Papers** | https://huggingface.co/papers/2011.10124 |
| **Citations** | 135+ |
| **Algorithm** | Dual mirror descent — generalizes OGD with Bregman divergences |
| **Auction Type** | Second-price (truthful) |
| **Bid Rule** | b_t = v_t / (1 + μ_t) |
| **Dual Update** | μ_{t+1} = Proj(μ_t − η·(ρ − payment_t)) |
| **Key Insight** | In second-price auctions, you don't need a market price model. The dual multiplier naturally paces spending. |
| **Prediction Models** | CTR predictor only (no market price model needed) |

### 1.3 Dual Descent with RoS + Budget (Multi-Constraint)

| Property | Detail |
|----------|--------|
| **Paper** | "Online Bidding Algorithms for Return-on-Spend Constrained Advertisers" |
| **Authors** | Zhe Feng, Swati Padmanabhan, Di Wang (2022) |
| **Venue** | ICML 2022 |
| **arXiv** | [2208.13713](https://arxiv.org/abs/2208.13713) |
| **Citations** | 38+ |
| **Algorithm** | Two dual variables: λ for RoS, μ for budget |
| **Bid Rule** | b_t = ((1+λ_t)/(μ_t+λ_t)) · v_t |
| **Updates** | λ_{t+1} = λ_t·exp(−α·(v_t·x_t(b_t) − p_t(b_t))) [multiplicative]; μ_{t+1} = Proj(μ_t − η·(ρ − p_t(b_t))) [sub-gradient] |
| **Key Insight** | Can be adapted for your "ensure k% spend" floor — use a second dual variable to enforce minimum spend |
| **Prediction Models** | CTR predictor (v_t); payment is observed |

### 1.4 RLB — Reinforcement Learning Bidding

| Property | Detail |
|----------|--------|
| **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
| **Authors** | Han Cai, Kan Ren, Weinan Zhang, Kleanthis Malialis, Jun Wang, Yong Yu, Defeng Guo (2017) |
| **Venue** | WSDM 2017 |
| **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
| **HF Papers** | https://huggingface.co/papers/1701.02490 |
| **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
| **Algorithm** | MDP + dynamic programming + neural value function approximation |
| **State** | (t remaining auctions, b remaining budget, x feature vector) |
| **Action** | bid price a ∈ [0, b] |
| **Results** | +22% clicks over linear bidding at tight budgets on iPinYou |
| **Prediction Models** | CTR θ(x) + market price distribution m(δ, x) |
| **Key Insight** | Foundational; explicitly models the budget-depletion tradeoff via DP. Superseded by dual methods for budget pacing but still influential. |

### 1.5 HiBid — Industrial Hierarchical Dual-RL

| Property | Detail |
|----------|--------|
| **Paper** | "HiBid: A Cross-Channel Constrained Bidding System with Budget Allocation by Hierarchical Offline Deep Reinforcement Learning" |
| **Authors** | Yuhang Wang et al. (2023) |
| **arXiv** | [2312.17503](https://arxiv.org/abs/2312.17503) |
| **HF Papers** | https://huggingface.co/papers/2312.17503 |
| **Algorithm** | High-level RL budget allocation + low-level λ-parameterized bidding |
| **Scale** | 64K advertisers, 70M requests/day, 4 channels, deployed at Meituan |
| **Results** | Outperforms RL-based baselines (R-BCQ, BCQ, CQL) on clicks, CPC, CSR, ROI |

### 1.6 Contextual First-Price Extension (Very Recent!)

| Property | Detail |
|----------|--------|
| **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" |
| **Authors** | (2026) |
| **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) |
| **Algorithm** | Dual OGD + quantile-based contextual censored regression |
| **Key Innovation** | Extends Wang et al. (2023) to contextual (feature-based) auctions with a novel quantile trick for censored data |
| **Regret** | Õ(√T) in contextual first-price auctions |

### 1.7 Unified View of Lagrangian Dual Multiplier Methods

All dual methods follow the same template:

```
For each auction t:
  1. Observe value v_t (from CTR prediction × click value)
  2. Compute bid: b_t = f(v_t, dual_multiplier_t)
  3. Observe outcome: payment c_t (if won) or 0 (if lost)
  4. Compute gradient: g_t = ρ − c_t
  5. Update multiplier: λ_{t+1} = Proj_{λ≥0}(λ_t − η·g_t)
```

| Method | Auction | Bid Function f(v, λ) |
|--------|---------|----------------------|
| Wang 2023 | First-price | argmax_b (r̃(v,b) − λ·c̃(b)) |
| Balseiro 2020 | Second-price | v / (1+λ) |
| Feng 2022 | Second-price | ((1+λ_RoS)/(λ_RoS+λ_budget)) · v |
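
The five-step template above is easy to simulate end to end. Below is a minimal self-contained sketch using the second-price rule f(v, λ) = v/(1+λ) from the table, with synthetic uniform values and competing bids; the step size, budget rate, and distributions are illustrative assumptions, not from any of the papers.

```python
import random

def dual_pacing(values, competing_bids, rho, eta=0.01):
    """Generic dual-OGD budget pacing with the second-price rule b = v/(1+lam)."""
    lam, spend, won_value = 0.0, 0.0, 0.0
    for v, d in zip(values, competing_bids):
        b = v / (1.0 + lam)                    # step 2: bid rule f(v, lam)
        c = d if b >= d else 0.0               # step 3: second-price payment if won
        lam = max(0.0, lam - eta * (rho - c))  # steps 4-5: projected OGD on the dual
        spend += c
        won_value += v if b >= d else 0.0
    return lam, spend, won_value

random.seed(0)
T = 10_000
values = [random.random() for _ in range(T)]      # v_t ~ U(0, 1)
bids = [0.5 * random.random() for _ in range(T)]  # competing bids ~ U(0, 0.5)
rho = 0.05                                        # per-round budget B/T
lam, spend, won_value = dual_pacing(values, bids, rho)
print(f"avg spend/round: {spend / T:.3f} (target rho = {rho}), final lam = {lam:.2f}")
```

Unpaced, the average payment in this toy setup would be roughly 0.17 per round; the multiplier grows until average spend settles near ρ, which is the pacing behavior all three methods share.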

### 1.8 Additional Papers (Supplementary)

| Paper | Key Idea | arXiv |
|-------|----------|-------|
| Dynamic Budget Throttling | Throttle participation rate to control spend | 2207.04690 |
| No-Regret Learning in Repeated First-Price Auctions | General no-regret framework for first-price | 2205.14572 |
| Robust Budget Pacing with a Single Sample | Near-optimal regret from 1 sample per distribution | 2302.02006 |
| Learning to Bid Optimally in Adversarial First-Price | Adversarial (non-i.i.d.) setting | 2007.04568 |
| Optimal No-Regret Learning in Repeated FPA | Minimax optimal bounds | 2003.09795 |
| Multi-Channel Autobidding with Budget and ROI | Per-channel optimization optimality | 2302.01523 |
| Leveraging the Hints: Adaptive Bidding | Uses hints/forecasts for better bidding | 2211.06358 |
| Adaptive Bidding under Non-stationarity | Handles distribution shift | 2505.02796 |
| Joint Value Estimation and Bidding | Simultaneous CTR learning + bidding | 2502.17292 |
| No-Regret is not enough! | Adaptive regret for constrained bandits | 2405.06575 |
| AIGB: Generative Auto-bidding | Diffusion models for bid trajectory generation | 2405.16141 |

### Two-Sided Budget Constraint (Your Specific Need)

You need: **maximize clicks s.t. spend ≤ B AND spend ≥ k·B**.

This requires two dual variables:
- **μ** for the budget cap: μ_{t+1} = Proj(μ_t − η₁·(ρ − spend_t))
- **ν** for the spend floor: ν_{t+1} = Proj(ν_t − η₂·(spend_t − kρ))

Bid function: b_t = v_t · f(μ_t, ν_t), where f decreases in μ and increases in ν.
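
A minimal sketch of one such update step. The shading f(μ, ν) = (1+ν)/(1+μ) is only an illustrative choice satisfying the monotonicity requirements above, not a form taken from a specific paper; the two projected dual updates mirror the equations for μ and ν.

```python
def two_sided_step(mu, nu, v, spend_t, rho, k, eta1=0.01, eta2=0.01):
    """One dual update for a budget cap (mu) and a spend floor (nu).

    rho is the per-round cap B/T and k*rho the per-round floor. The
    illustrative shading f(mu, nu) = (1 + nu) / (1 + mu) decreases in
    mu (cap binding) and increases in nu (floor binding).
    """
    bid = v * (1.0 + nu) / (1.0 + mu)
    mu = max(0.0, mu - eta1 * (rho - spend_t))      # cap: rises when overspending
    nu = max(0.0, nu - eta2 * (spend_t - k * rho))  # floor: rises when underspending
    return bid, mu, nu

# Underspending (spend_t = 0) leaves mu projected at zero and pushes nu up,
# which raises subsequent bids toward the spend floor:
bid, mu, nu = two_sided_step(mu=0.0, nu=0.0, v=1.0, spend_t=0.0, rho=0.05, k=0.8)
print(bid, mu, nu)
```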

---

## 2. CTR Prediction Models

### 2.1 FinalMLP (RECOMMENDED — Best AUC, Fastest Inference)

| Property | Detail |
|----------|--------|
| **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
| **Authors** | Kelong Mao, Jieming Zhu, Liangcai Su, Guohao Cai, Yuru Li, Zhenhua Dong (2023) |
| **Venue** | AAAI 2023 |
| **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
| **HF Papers** | https://huggingface.co/papers/2304.00902 |
| **Datasets** | reczoo/Criteo_x1, reczoo/Avazu_x1, reczoo/MovielensLatest_x1, reczoo/Frappe_x1 |
| **Criteo AUC** | **0.8149** |
| **Avazu AUC** | **0.7666** |
| **Architecture** | Two-stream MLP: two independent MLP towers + feature gating (soft selection) + bilinear fusion |
| **Inference Speed** | Fastest among SOTA (pure MLP, ~400-dim hidden, no attention) |
| **Why Best for RTB** | Pure feed-forward, <1ms inference, easy to deploy |
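
To make the two-stream idea concrete, here is a heavily simplified PyTorch sketch of two MLP towers combined by a bilinear fusion head. The paper's feature gating and multi-head fusion are omitted, and all dimensions are illustrative, so this is a sketch of the core idea rather than the published architecture.

```python
import torch
import torch.nn as nn

class TwoStreamFusion(nn.Module):
    """Sketch of the FinalMLP idea: two MLP streams fused bilinearly."""
    def __init__(self, in_dim, hidden=64):
        super().__init__()
        self.mlp1 = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.mlp2 = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.w1 = nn.Linear(hidden, 1)                        # linear term, stream 1
        self.w2 = nn.Linear(hidden, 1)                        # linear term, stream 2
        self.w3 = nn.Parameter(torch.zeros(hidden, hidden))   # bilinear interaction term

    def forward(self, x):
        h1, h2 = self.mlp1(x), self.mlp2(x)
        # logit = w1(h1) + w2(h2) + h1^T W3 h2, per row of the batch
        logit = self.w1(h1) + self.w2(h2) + (h1 @ self.w3 * h2).sum(-1, keepdim=True)
        return torch.sigmoid(logit)

model = TwoStreamFusion(in_dim=16)
p = model(torch.randn(4, 16))   # pCTR in (0, 1), shape (4, 1)
```

In the real model `x` would be the concatenated field embeddings; the bilinear term is what lets the two streams interact beyond a simple sum.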

### 2.2 GDCN — Gated Deep Cross Network

| Property | Detail |
|----------|--------|
| **Paper** | "Towards Deeper, Lighter and Interpretable Cross Network for CTR Prediction" |
| **Authors** | Fangye Wang, Hansu Gu, Dongsheng Li, Tun Lu, Peng Zhang, Ning Gu (2023) |
| **Venue** | CIKM 2023 |
| **arXiv** | [2311.04635](https://arxiv.org/abs/2311.04635) |
| **HF Papers** | https://huggingface.co/papers/2311.04635 |
| **Criteo AUC** | **0.8161** (own split — not directly comparable) |
| **Architecture** | DCNv2 + learned information gate per cross layer + Field-level Dimension Optimization (FDO) |
| **Key Insight** | The gate filters noisy interactions; FDO compresses embeddings by 60%+. Good for memory-constrained RTB. |

### 2.3 DCNv2 — Industry Workhorse

| Property | Detail |
|----------|--------|
| **Paper** | "DCN V2: Improved Deep & Cross Network and Practical Lessons for Web-scale Learning to Rank Systems" |
| **Authors** | Ruoxi Wang, Rakesh Shivanna, Derek Z. Cheng, Sagar Jain, Dong Lin, Lichan Hong, Ed H. Chi (2021) |
| **Venue** | WWW 2021 |
| **arXiv** | [2008.13535](https://arxiv.org/abs/2008.13535) |
| **HF Papers** | https://huggingface.co/papers/2008.13535 |
| **Criteo AUC** | **0.8142-0.8144** (retuned) |
| **Architecture** | Embedding → parallel CrossNetV2 + DNN → concat → sigmoid |
| **Key Insight** | Mixture-of-Experts-style low-rank decomposition. Battle-tested at Google. |

### 2.4 DeepFM — Simple, Strong Baseline

| Property | Detail |
|----------|--------|
| **Paper** | "DeepFM: A Factorization-Machine based Neural Network for CTR Prediction" |
| **Authors** | Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, Xiuqiang He (2017) |
| **Venue** | IJCAI 2017 |
| **Criteo AUC** | **0.8138** (retuned) |
| **Architecture** | Shared embedding → parallel FM (2nd-order) + DNN → sum → sigmoid |
| **Key Insight** | The shared embedding between FM and DNN is the secret. End-to-end, no pre-training. |

### 2.5 FCN — Fusing Cross Network (Most Recent)

| Property | Detail |
|----------|--------|
| **Paper** | "FCN: Fusing Exponential and Linear Cross Network for Click-Through Rate Prediction" |
| **Authors** | (2024) |
| **arXiv** | [2407.13349](https://arxiv.org/abs/2407.13349) |
| **HF Papers** | https://huggingface.co/papers/2407.13349 |
| **Architecture** | Two explicit cross sub-networks: LCN (linear, interaction order grows linearly) + ECN (exponential, order doubles per layer) |
| **Key Insight** | No DNN needed — all interactions explicit. 50% fewer params, 23% lower latency than DCNv2. |
| **Caveat** | Newer paper with less community validation. GitHub: https://github.com/salmon1802/FCN |

### 2.6 BARS Meta-Finding

| Property | Detail |
|----------|--------|
| **Paper** | "BARS-CTR: Open Benchmarking for Click-Through Rate Prediction" |
| **Authors** | Jieming Zhu, Jinyang Liu, Shuai Yang, Qi Zhang, Xiuqiang He (2021) |
| **Venue** | CIKM 2021 |
| **arXiv** | [2009.05794](https://arxiv.org/abs/2009.05794) |
| **HF Papers** | https://huggingface.co/papers/2009.05794 |
| **Key Result** | After 7,000+ experiments and 12,000 GPU hours: **differences between SOTA deep CTR models are surprisingly small** (~0.1-0.3% AUC). Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering. All models converge to ~0.814 AUC on Criteo after proper tuning. |

### 2.7 Additional CTR Papers

| Paper | Key Idea | arXiv |
|-------|----------|-------|
| DIN (KDD 2018) | Attention over user behavior sequence | 1706.06978 |
| DIEN (AAAI 2019) | Interest evolution with GRU + attention | 1809.03672 |
| xDeepFM (KDD 2018) | Compressed Interaction Network (CIN) for vector-wise crosses | 1803.05170 |
| AutoInt (CIKM 2019) | Multi-head self-attention for feature interactions | 1810.11921 |
| DLRM (Meta, 2019) | Specialized for recommendation: MLP for dense + embedding for sparse | 1906.00091 |
| Wide & Deep (Google, 2016) | Memorization (wide) + generalization (deep) | 1606.07792 |
| FTRL-Proximal (KDD 2013) | "Ad Click Prediction: a View from the Trenches" — online learning for linear CTR | — |
| Streaming CTR (2023) | Online CTR with non-stationary data | 2307.07509 |

### 2.8 Latency Considerations for RTB

| Model | Architecture | Inference Speed | RTB-Suitable |
|-------|-------------|-----------------|--------------|
| **FinalMLP** | Pure MLP | ⭐⭐⭐⭐⭐ (<1ms) | ✅ Best |
| **DCNv2** | CrossNet + DNN | ⭐⭐⭐⭐ | ✅ |
| **GDCN** | Gated Cross + DNN | ⭐⭐⭐⭐ | ✅ |
| **DeepFM** | FM + DNN | ⭐⭐⭐⭐ | ✅ |
| **FCN** | LCN + ECN (no DNN) | ⭐⭐⭐⭐ | ✅ |
| DIN | Attention (user history) | ⭐⭐ | ❌ Too slow |
| DIEN | GRU + attention | ⭐ | ❌ Too slow |
| AutoInt | Multi-head attention | ⭐⭐ | ❌ Too slow |

---

## 3. Clearing Price / Market Price Prediction

### 3.1 Non-Parametric Empirical CDF (RECOMMENDED BASELINE)

| Property | Detail |
|----------|--------|
| **Source** | Wang et al. (2023), Algorithm 1, Section 3.1 |
| **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
| **Method** | Maintain an array of observed competing bids d_s; estimate G̃_t(b) = (1/(t−1))·∑_s 𝟙{b ≥ d_s} |
| **Win Probability** | P(win\|b) = G̃_t(b) |
| **Expected Cost** | E[cost\|win, b] = (1/((t−1)·G̃_t(b))) · ∑_{d_s ≤ b} d_s, i.e. the mean of {d_s : d_s ≤ b} |
| **Pros** | No model training needed, theoretically sound (Õ(√T) regret), handles distribution shift naturally |
| **Cons** | No context/features; cold-start when t is small |
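
A minimal sketch of this estimator, assuming the full-information setting (the highest competing bid d_s is revealed after every auction, won or lost); tie handling and windowing are simplifications.

```python
from bisect import bisect_right, insort

class EmpiricalBidCDF:
    """Non-parametric estimate of the competing-bid CDF G(b)."""
    def __init__(self):
        self.bids = []  # observed competing bids, kept sorted

    def observe(self, d):
        insort(self.bids, d)

    def win_prob(self, b):
        """G(b): fraction of past competing bids at or below b."""
        if not self.bids:
            return 0.0
        return bisect_right(self.bids, b) / len(self.bids)

    def expected_cost_if_win(self, b):
        """Mean competing bid at or below b (the second-price cost; in a
        first-price auction the cost of winning is the bid b itself)."""
        k = bisect_right(self.bids, b)
        return sum(self.bids[:k]) / k if k else 0.0

cdf = EmpiricalBidCDF()
for d in (0.1, 0.2, 0.3, 0.4):
    cdf.observe(d)
print(cdf.win_prob(0.25))  # 0.5: two of the four observed bids are below 0.25
```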

### 3.2 Censored Linear Regression (Wu et al. 2015)

| Property | Detail |
|----------|--------|
| **Paper** | "Predicting Winning Price in Real Time Bidding with Censored Data" |
| **Authors** | Wush Chi-Hsuan Wu, Mi-Yen Yeh, Ming-Syan Chen (2015) |
| **Venue** | KDD 2015 |
| **Citations** | ~101 |
| **Method** | Tobit-like model: log(market_price) = β·x + ε, ε ~ N(0, σ²) |
| **Key Insight** | Properly handles censoring via the likelihood: winning auctions contribute the density f(price\|x), losing auctions contribute the survival term S(bid\|x) |
| **Pros** | Contextual, simple, computationally cheap |
| **Cons** | Linear model — limited capacity for complex interactions |
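
The censored likelihood can be written down directly. A minimal pure-Python sketch with a single scalar feature, working in log-price space (in practice `beta` and `sigma` would be fit by an optimizer, and x would be a feature vector):

```python
import math

def tobit_nll(beta, sigma, samples):
    """Censored negative log-likelihood for log(price) = beta*x + eps, eps ~ N(0, sigma^2).

    samples: list of (x, y, won) with a single scalar feature x for simplicity.
    Won auctions observe the market price y exactly (Gaussian density of log y);
    lost auctions only know the price exceeded our bid y (survival term).
    """
    nll = 0.0
    for x, y, won in samples:
        z = (math.log(y) - beta * x) / sigma
        if won:
            # -log of the normal density of log(y)
            nll += 0.5 * z * z + math.log(sigma * math.sqrt(2 * math.pi))
        else:
            # -log P(log(price) > log(bid)) = -log(1 - Phi(z))
            surv = 0.5 * math.erfc(z / math.sqrt(2))
            nll += -math.log(max(surv, 1e-300))
    return nll

# A won auction near the model's predicted price is more likely (lower NLL)
# than one far from it:
nll_near = tobit_nll(1.0, 1.0, [(1.0, math.e, True)])      # log y = 1 = beta*x
nll_far = tobit_nll(1.0, 1.0, [(1.0, math.e ** 3, True)])  # log y = 3
print(nll_near < nll_far)
```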

### 3.3 Deep Censored Learning / Survival Analysis

| Property | Detail |
|----------|--------|
| **Paper** | "Deep Censored Learning of the Winning Price" (Zhu et al., WWW 2019) |
| **Method** | Neural network trained with a censored survival loss |
| **Loss** | Winning: −log f(price\|x); Losing: −log S(bid\|x) |
| **Library** | **TorchSurv** ([arXiv:2404.10761](https://arxiv.org/abs/2404.10761), Novartis, 200★ GitHub) |
| **TorchSurv URL** | https://github.com/Novartis/torchsurv |
| **TorchSurv Docs** | https://opensource.nibr.com/torchsurv/ |
| **PyPI** | `pip install torchsurv` |
| **Key Insight** | A proper survival framework handles censoring: win = exact price observed (uncensored); loss = only a lower bound (censored at our bid). |
| **Architecture** | Deep FC net predicting either the hazard rate λ(t\|x) (Cox PH) or distribution parameters (Weibull/log-normal AFT) |

```python
# TorchSurv pattern for market price prediction; `model` is any nn.Module
# mapping features to a scalar log-hazard.
from torchsurv.loss import cox

log_hazard = model(features).squeeze(-1)  # shape (batch,)
# event: True if we won (market price observed exactly), False if lost (censored)
# time:  market price if won, our bid (a lower bound on the price) if lost
loss = cox.neg_partial_log_likelihood(log_hazard, event, time)
loss.backward()
```

### 3.4 Win Probability Neural Network (Simplest ML)

| Property | Detail |
|----------|--------|
| **Method** | Direct binary classification: P(win\|bid_price, features) |
| **Pros** | Dead simple; works with standard BCELoss |
| **Cons** | Ignores the censored price information when you win — only uses the binary win/loss signal |
| **Input** | features + bid_price → sigmoid |
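
A minimal PyTorch sketch of this classifier; the feature width, hidden size, and the random batch below are placeholders for real auction logs.

```python
import torch
import torch.nn as nn

class WinProbNet(nn.Module):
    """P(win | features, bid): contextual features plus the bid as one extra input."""
    def __init__(self, n_feat, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_feat + 1, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, features, bid):
        x = torch.cat([features, bid.unsqueeze(-1)], dim=-1)
        return torch.sigmoid(self.net(x)).squeeze(-1)

model = WinProbNet(n_feat=8)
loss_fn = nn.BCELoss()
feats, bids = torch.randn(16, 8), torch.rand(16)     # placeholder batch
wins = torch.randint(0, 2, (16,)).float()            # 1 = won, 0 = lost
loss = loss_fn(model(feats, bids), wins)
loss.backward()  # standard binary cross-entropy training step
```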

### 3.5 Parametric Distribution Fitting

| Property | Detail |
|----------|--------|
| **Paper** | Referenced in RLB (Cai et al. 2017) — "Functional Bid Landscape Forecasting" (ECML-PKDD 2016) |
| **Method** | Fit a log-normal or gamma distribution to observed winning prices; predict the parameters from features using GBDT |
| **Pros** | Parametric assumptions reduce variance |
| **Cons** | The distribution assumption may not hold; doesn't properly handle censoring |

### 3.6 Contextual Quantile-Based (2026)

| Property | Detail |
|----------|--------|
| **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" |
| **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) |
| **Method** | Models the competing bid as d_t = α·x_t + z_t (linear contextual); quantile-based estimator for α |
| **Key Trick** | Splits samples by bid quantile and exploits identifiable conditional quantiles to circumvent full censoring |
| **Pros** | Theoretical guarantees in the contextual setting |
| **Cons** | Linear contextual model only; very recent |

### 3.7 Comparison Summary

| Method | Contextual? | Handles Censoring? | Model Training? | Complexity |
|--------|-------------|-------------------|-----------------|------------|
| Empirical CDF | ❌ | N/A (full info) | None | Minimal |
| Censored Linear Reg | ✅ | ✅ (proper likelihood) | Linear model | Low |
| Deep Survival (TorchSurv) | ✅ | ✅ (proper likelihood) | Neural net | Medium |
| Win Prob Classifier | ✅ | ❌ (binary only) | Neural net | Low |
| Parametric (log-normal) | Optional | ❌ | GBDT | Medium |
| Quantile Censored | ✅ | ✅ (quantile trick) | Linear | Medium-High |

---

## 4. Datasets

### 4.1 CTR Prediction Datasets

| Dataset | HF Hub Path | Size | Fields | Label | Verified |
|---------|------------|------|--------|-------|----------|
| **Criteo_x4** | `reczoo/Criteo_x4` | 45.8M rows, 5.6GB | 13 dense (I1-I13) + 26 categorical (C1-C26) | `Label` (0/1) | ✅ |
| **Avazu_x4** | `reczoo/Avazu_x4` | 40.4M rows, 1.8GB | 22 fields (mixed) | `click` (0/1) | ✅ |
| Criteo_x1 | `reczoo/Criteo_x1` | ~11M rows | Same as x4 | `Label` | ✅ |
| Avazu_x1 | `reczoo/Avazu_x1` | ~10M rows | Same as x4 | `click` | ✅ |

**Standard split**: 80% train / 10% val / 10% test (BARS protocol).

### 4.2 RTB Bidding Datasets

| Dataset | Source | Size | Format | Availability |
|---------|--------|------|--------|-------------|
| **iPinYou** | data.computational-advertising.org | 19.5M impressions, 9 campaigns, 10 days (2013) | Bid logs with market price | External download only (NOT on HF Hub) |
| **YOYI** | Various academic mirrors | ~400M records | Bid logs | External download only |

**iPinYou format**: `(click, paying_price, bid_price, slot_id, user_tags, ...)` — already includes the market price information needed for bidding simulation.

**Key Gap**: No RTB bid-log datasets exist on the HuggingFace Hub. Criteo/Avazu have click labels but no bid/price columns — they can only be used for CTR training and require synthetic price generation for bidding evaluation.
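
One way to fill that gap for simulation is to synthesize market prices keyed to the predicted CTR, so "hot" impressions clear higher. The base price, units, and log-normal noise below are illustrative assumptions, not a documented protocol:

```python
import math
import random

def synthetic_market_price(pctr, base=50.0, noise_sigma=0.4, rng=random):
    """Draw a synthetic market price for one impression.

    Log-normal noise around a price proportional to predicted CTR; all
    constants are placeholders to be calibrated against a real bid log.
    """
    return base * pctr * math.exp(rng.gauss(0.0, noise_sigma))

random.seed(42)
prices = [synthetic_market_price(0.05) for _ in range(1000)]
print(min(prices) > 0, sum(prices) / len(prices))
```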

### 4.3 Data Requirements for Each Algorithm

| Algorithm | Needs from Dataset |
|-----------|-------------------|
| Dual OGD (Wang) | click labels (CTR training) + competing bids (or synthetic prices for simulation) |
| Dual Mirror Descent (Balseiro) | click labels + auction payment (second-price) |
| RLB (Cai) | click labels + market prices + impression features |
| CTR models (all) | click labels + features (Criteo/Avazu: ✅) |
| Clearing price models | observed prices (won auctions) + bids (lost auctions) |

---

## 5. Codebases & Implementations

### 5.1 CTR Model Libraries

| Library | URL | Models | Framework | Notes |
|---------|-----|--------|-----------|-------|
| **FuxiCTR** | https://github.com/reczoo/FuxiCTR | 40+ (FinalMLP, DeepFM, DCNv2, GDCN, FCN, xDeepFM, AutoInt) | PyTorch | Config-driven (YAML). Used by all SOTA benchmark papers. |
| **DeepCTR-Torch** | https://github.com/shenweichen/DeepCTR-Torch | 20+ (DeepFM, DCN, DIN, DIEN, xDeepFM) | PyTorch | Simpler API (Python classes). Good for quick prototyping. |
| **TorchSurv** | https://github.com/Novartis/torchsurv | Cox PH, Weibull AFT, DeepSurv, DeepHit | PyTorch | Deep survival analysis for clearing price. |
| **BARS** | https://github.com/openbenchmark/BARS | Benchmarking | — | Standardized evaluation pipeline. 389★ |

### 5.2 Bidding Algorithm Implementations

| Repo | URL | Algorithms | Notes |
|------|-----|------------|-------|
| **rlb-dp** | https://github.com/han-cai/rlb-dp | RLB (MDP + DP) | 188 stars. Original implementation of RL for RTB. |
| **budget_constrained_bidding** | https://github.com/dingmu365/budget_constrained_bidding | Budget-constrained RTB | Contains multiple budget-constrained bidding algorithms. |
| **budget_constrained_bidding** (fork) | https://github.com/GinNie23/budget_constrained_bidding | Same | Fork with modifications. |
| **Budget_Constrained_Bidding** | https://github.com/venkatacrc/Budget_Constrained_Bidding | Same | Another implementation. |
| **hamverbot/rtb-bidding-comparison** | https://huggingface.co/hamverbot/rtb-bidding-comparison | DualOGD, Linear, ORTB, Threshold, MPC | **Your repo** — already has a working comparison framework! |

### 5.3 FuxiCTR Quick Start

```bash
pip install fuxictr
```

```yaml
# config/criteo_finalmlp.yaml
dataset_id: Criteo_x4
model: FinalMLP
embedding_dim: 10
hidden_units: [400, 400, 400]
batch_size: 4096
learning_rate: 1e-3
epochs: 10
metrics: [auc, logloss]
```

```python
# NB: the exact entry point varies by FuxiCTR version; the repo's demo
# scripts (e.g. run_expid.py) are the reference.
from fuxictr import autotuner
autotuner.run("config/criteo_finalmlp.yaml", "Criteo_x4", "FinalMLP")
```

### 5.4 DeepCTR-Torch Quick Start

```bash
pip install deepctr-torch
```

```python
from deepctr_torch.models import DeepFM
from deepctr_torch.inputs import SparseFeat, DenseFeat

# df, categorical_cols, numerical_cols, train_input, train_labels are
# assumed to come from your own preprocessing (e.g. label-encoded Criteo).
sparse_features = [SparseFeat(f, vocabulary_size=df[f].nunique(), embedding_dim=10)
                   for f in categorical_cols]
dense_features = [DenseFeat(f, 1) for f in numerical_cols]

model = DeepFM(linear_feature_columns=sparse_features + dense_features,
               dnn_feature_columns=sparse_features + dense_features,
               dnn_hidden_units=(400, 400, 400), device='cuda')
model.compile('adam', 'binary_crossentropy', metrics=['auc'])
model.fit(train_input, train_labels, batch_size=4096, epochs=10)
```

---

## 6. Benchmark Leaderboards

| Leaderboard | URL | Description |
|-------------|-----|-------------|
| **BARS CTR Criteo_x4** | https://openbenchmark.github.io/BARS/CTR/leaderboard/criteo_x4.html | The definitive CTR benchmark — 24 models compared |
| **BARS CTR Criteo_x1** | https://openbenchmark.github.io/BARS/CTR/leaderboard/criteo_x1.html | Smaller Criteo subset |
| **BARS CTR Avazu** | https://openbenchmark.github.io/BARS/CTR/leaderboard/avazu_x4.html | Avazu benchmark |
| **BARS Main** | https://openbenchmark.github.io/BARS | Full recommender-systems benchmark |

**Top Criteo_x4 AUC scores (from BARS):**
- FinalMLP: 0.8149
- DCNv2: 0.8142
- DeepFM: 0.8138
- xDeepFM: 0.8136
- AutoInt+: 0.8134

Key takeaway: the top 5 models are within 0.0015 AUC of each other.

---

## 7. Recommended Architecture

### For Your Problem: "Lagrangian Dual Multiplier with Online Error Gradient Descent"

```
┌──────────────────────────────────────────────────────────────┐
│                      BIDDING ALGORITHM                       │
│                                                              │
│   Dual OGD (Wang et al. 2023)                                │
│   λ_{t+1} = Proj(λ_t − ε·(ρ − c̃_t(b_t)))                     │
│   b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b))                  │
│                                                              │
├──────────────────────────────────────────────────────────────┤
│                      PREDICTION MODELS                       │
│                                                              │
│   ┌──────────────────┐      ┌──────────────────────┐         │
│   │  CTR Predictor   │      │ Clearing Price Est.  │         │
│   │   (FinalMLP)     │      │  (Empirical CDF      │         │
│   │                  │      │   OR TorchSurv)      │         │
│   │ v_t = pCTR × V   │      │ G̃(b) = P(win | b)    │         │
│   └──────────────────┘      └──────────────────────┘         │
│                                                              │
├──────────────────────────────────────────────────────────────┤
│                           DATASETS                           │
│                                                              │
│   Criteo_x4 (CTR training) + iPinYou (bidding simulation)    │
│   OR: Criteo_x4 + synthetic price generation                 │
│                                                              │
└──────────────────────────────────────────────────────────────┘
```

### Implementation Priority

1. **Phase 1**: Improve the CTR model — replace the current LogisticRegression with FinalMLP trained on Criteo_x4 (via FuxiCTR)
2. **Phase 2**: Improve clearing-price estimation — add TorchSurv-based censored regression alongside the current empirical CDF
3. **Phase 3**: Add Balseiro dual mirror descent for comparison (simpler baseline, no market price model)
4. **Phase 4**: Add the two-sided budget constraint (cap + floor) with two dual variables
5. **Phase 5**: Full hyperparameter sweep: step size ε, budget fraction k%, value per click, CTR model architecture

### Online Learning Note

For production RTB, where the environment is non-stationary, implement periodic retraining:
- Save a model checkpoint every N hours
- Reload and train on a sliding window of the most recent data
- Deploy the updated model without restarting the bidding algorithm

The Lagrangian multiplier λ is intrinsically online (updated per auction). The CTR model needs separate periodic retraining.
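
A minimal sketch of that retraining control flow; `train_fn`, the window size, and the interval are placeholders for whatever CTR stack is in use (e.g. FuxiCTR):

```python
import collections
import time

class SlidingWindowRetrainer:
    """Retrain the CTR model periodically while the dual bidder keeps running.

    Only the control flow is shown: a bounded buffer of recent examples,
    a timer, and an atomic model swap. `train_fn` is a placeholder.
    """
    def __init__(self, train_fn, window_size=100_000, interval_s=3600):
        self.buffer = collections.deque(maxlen=window_size)  # sliding data window
        self.train_fn = train_fn
        self.interval_s = interval_s
        self.last_trained = time.monotonic()
        self.model = None

    def log(self, example):
        self.buffer.append(example)
        if time.monotonic() - self.last_trained >= self.interval_s:
            self.model = self.train_fn(list(self.buffer))  # swap in the new model
            self.last_trained = time.monotonic()

# With interval_s=0 the first logged example triggers a retrain immediately:
retrainer = SlidingWindowRetrainer(train_fn=lambda data: len(data), interval_s=0)
retrainer.log({"click": 1})
print(retrainer.model)
```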

---

## Paper Index (All Papers Referenced)

| # | Paper | arXiv | Venue | Year | Citations |
|---|-------|-------|-------|------|-----------|
| 1 | Wang et al. — Learning to Bid in Repeated First-Price Auctions with Budgets | 2304.13477 | NeurIPS | 2023 | Growing |
| 2 | Balseiro et al. — Dual Mirror Descent for Online Allocation | 2011.10124 | Ops Research | 2020 | 135+ |
| 3 | Feng et al. — Online Bidding for RoS Constrained Advertisers | 2208.13713 | ICML | 2022 | 38+ |
| 4 | Cai et al. — RTB by RL in Display Advertising | 1701.02490 | WSDM | 2017 | 300+ |
| 5 | Wang et al. — HiBid Hierarchical DRL Bidding | 2312.17503 | — | 2023 | New |
| 6 | — Online Bidding for Contextual First-Price (Quantile) | 2603.07207 | — | 2026 | New |
| 7 | Mao et al. — FinalMLP | 2304.00902 | AAAI | 2023 | Growing |
| 8 | Wang et al. — GDCN | 2311.04635 | CIKM | 2023 | Growing |
| 9 | Wang et al. — DCN V2 | 2008.13535 | WWW | 2021 | 500+ |
| 10 | Guo et al. — DeepFM | — | IJCAI | 2017 | 3000+ |
| 11 | — FCN: Fusing Cross Network | 2407.13349 | — | 2024 | New |
| 12 | Zhu et al. — BARS-CTR Benchmark | 2009.05794 | CIKM | 2021 | 100+ |
| 13 | Wu et al. — Predicting Winning Price with Censored Data | — | KDD | 2015 | 101 |
| 14 | — Deep Censored Learning of Winning Price | — | WWW | 2019 | Well-cited |
| 15 | Katzman et al. — DeepSurv | — | BMC | 2018 | 1000+ |
| 16 | — TorchSurv | 2404.10761 | — | 2024 | New |
| 17 | — Robust Budget Pacing with a Single Sample | 2302.02006 | — | 2023 | Growing |
| 18 | — Multi-Channel Autobidding with Budget and ROI | 2302.01523 | — | 2023 | Growing |
| 19 | — No-Regret in Repeated FPA with Budgets | 2205.14572 | — | 2022 | 14 |
| 20 | — Dynamic Budget Throttling | 2207.04690 | — | 2022 | 6 |
| 21 | — AIGB: Generative Auto-bidding | 2405.16141 | — | 2024 | New |
| 22 | — Adaptive Bidding under Non-Stationarity | 2505.02796 | — | 2025 | 2 |
| 23 | — Joint Value Estimation and Bidding | 2502.17292 | — | 2025 | 4 |
| 24 | — Leveraging Hints: Adaptive Bidding | 2211.06358 | — | 2022 | 13 |
| 25 | Zhou et al. — DIN | 1706.06978 | KDD | 2018 | 2000+ |
| 26 | Zhou et al. — DIEN | 1809.03672 | AAAI | 2019 | 1000+ |
| 27 | Lian et al. — xDeepFM | 1803.05170 | KDD | 2018 | 1000+ |
| 28 | Song et al. — AutoInt | 1810.11921 | CIKM | 2019 | 500+ |
| 29 | Naumov et al. — DLRM (Meta) | 1906.00091 | — | 2019 | 500+ |
| 30 | Cheng et al. — Wide & Deep | 1606.07792 | RecSys | 2016 | 4000+ |
| 31 | McMahan et al. — Ad Click Prediction (FTRL) | — | KDD | 2013 | 2000+ |
| 32 | Zhang et al. — Optimal RTB for Display Advertising | — | KDD | 2014 | 500+ |