File size: 14,409 Bytes
a40d07a
9f3c561
a40d07a
 
9f3c561
 
 
a40d07a
9f3c561
a40d07a
9f3c561
 
 
 
a40d07a
 
9f3c561
 
 
a40d07a
9f3c561
 
 
 
a40d07a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9f3c561
a40d07a
 
 
 
 
 
 
 
9f3c561
 
 
a40d07a
 
 
 
 
 
9f3c561
 
 
a40d07a
 
 
9f3c561
a40d07a
9f3c561
 
 
 
a40d07a
9f3c561
 
 
a40d07a
 
9f3c561
 
a40d07a
9f3c561
a40d07a
 
 
 
 
9f3c561
a40d07a
9f3c561
a40d07a
 
 
 
 
 
 
9f3c561
 
 
 
 
a40d07a
9f3c561
 
 
 
a40d07a
9f3c561
 
a40d07a
 
 
 
 
 
 
 
 
 
9f3c561
a40d07a
9f3c561
a40d07a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9f3c561
a40d07a
9f3c561
a40d07a
 
 
 
 
 
 
 
 
 
9f3c561
 
 
a40d07a
9f3c561
a40d07a
9f3c561
 
 
a40d07a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9f3c561
a40d07a
9f3c561
 
 
a40d07a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9f3c561
a40d07a
9f3c561
 
 
a40d07a
 
 
 
9f3c561
 
 
 
 
a40d07a
9f3c561
a40d07a
 
 
 
9f3c561
a40d07a
9f3c561
a40d07a
 
 
 
9f3c561
a40d07a
9f3c561
 
 
a40d07a
9f3c561
a40d07a
 
 
 
 
 
 
9f3c561
 
 
a40d07a
9f3c561
 
a40d07a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
9f3c561
 
a40d07a
 
9f3c561
 
a40d07a
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
# Bidding Algorithms Benchmark β€” Research Resources

> First-Price Auction Focus | Generated: 2026-05-05
> Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark

---

## 1. Bidding Algorithms for First-Price Auctions

### 1.1 DualOGD β€” Lagrangian Dual + Online Gradient Descent ⭐ PRIMARY

| Property | Detail |
|----------|--------|
| **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
| **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong |
| **Venue** | 2023 |
| **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
| **HF Papers** | https://huggingface.co/papers/2304.13477 |
| **Algorithm** | DualOGD β€” Lagrangian dual multiplier updated by online error gradient descent |
| **Auction Type** | **First-price** |
| **Constraints** | Budget cap: total spend ≀ ρT |
| **Regret Bound** | Γ•(√T) for both full-information and one-sided feedback |
| **Key Formula** | Ξ»_{t+1} = Proj_{Ξ»>0}(Ξ»_t βˆ’ Ρ·(ρ βˆ’ cΜƒ_t(b_t))) |
| **Bid Rule** | b_t = argmax_b (rΜƒ_t(v_t, b) βˆ’ Ξ»_tΒ·cΜƒ_t(b)) |
| **Prediction Models** | CTR predictor (v_t) + empirical CDF of competing bids (G̃_t) |
| **Code Pattern (Wang 2023, Algorithm 1)** |
```python
# Initialization
λ = 0.0; Ρ = 1.0 / sqrt(T); ρ = B / T

for t in range(T):
    v_t = pCTR(features_t) * click_value  # from CTR model
    
    # Estimate reward and cost from historical competing bids
    r_tilde = lambda b: mean([(v_t - b) for d in d_history if b >= d])
    c_tilde = lambda b: mean([b for d in d_history if b >= d])
    
    # Bid: maximize cost-adjusted reward
    b_t = argmax_b (r_tilde(v_t, b) - Ξ» * c_tilde(b))
    
    # Observe maximum competing bid d_t (full feedback)
    won = (b_t >= d_t); cost = b_t if won else 0
    
    # Online gradient descent on dual multiplier
    λ = max(0, λ - Ρ * (ρ - c_tilde(b_t)))
```

### 1.2 TwoSidedDual β€” Budget Cap + Spend Floor

| Property | Detail |
|----------|--------|
| **Base** | Extension of Wang et al. (2023) |
| **Constraints** | Total spend ≀ B (cap) AND spend β‰₯ kΒ·B (floor) |
| **Dual Variables** | ΞΌ for cap, Ξ½ for floor |
| **Updates** |
| ΞΌ_{t+1} = Proj_{ΞΌβ‰₯0}(ΞΌ_t βˆ’ η₁·(ρ βˆ’ cΜƒ_t(b_t))) | cap penalty |
| Ξ½_{t+1} = Proj_{Ξ½β‰₯0}(Ξ½_t βˆ’ Ξ·β‚‚Β·(cΜƒ_t(b_t) βˆ’ kρ)) | floor incentive |
| **Bid Rule** | b_t = argmax_b (rΜƒ_t(v_t, b) βˆ’ (ΞΌ_t βˆ’ Ξ½_t)Β·cΜƒ_t(b)) |
| **Key Insight** | When ΞΌ > Ξ½: bidding is restrained (ahead on spend). When Ξ½ > ΞΌ: bidding is encouraged (behind on spend floor). |

### 1.3 Adversarial Bidding β€” Non-Stationary Environments

| Property | Detail |
|----------|--------|
| **Paper** | "Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity" |
| **arXiv** | [2505.02796](https://arxiv.org/abs/2505.02796) |
| **Algorithm** | Adaptive dual OGD with change-point detection |
| **Key Insight** | When distribution shifts, resets dual multiplier and restarts learning |

### 1.4 Contextual First-Price (2026)

| Property | Detail |
|----------|--------|
| **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" |
| **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) |
| **Algorithm** | Dual OGD + quantile-based contextual censored regression |
| **Key Innovation** | Extends Wang to contextual (feature-based) auctions |

### 1.5 Joint Value Estimation + Bidding

| Property | Detail |
|----------|--------|
| **Paper** | "Joint Value Estimation and Bidding in Repeated First-Price Auctions" |
| **arXiv** | [2502.17292](https://arxiv.org/abs/2502.17292) |
| **Key Insight** | Simultaneously learn CTR and bidding strategy β€” no separate CTR model training phase |

### 1.6 RLB β€” Reinforcement Learning (Baseline)

| Property | Detail |
|----------|--------|
| **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
| **Authors** | Han Cai et al., WSDM 2017 |
| **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
| **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
| **Algorithm** | MDP + Dynamic Programming + Neural value function |
| **State** | (remaining auctions, remaining budget, features) |
| **Action** | bid price a ∈ [0, budget] |
| **Prediction Models** | CTR ΞΈ(x) + market price distribution m(Ξ΄, x) |

### 1.7 Static Baselines

| Algorithm | Bid Rule | Notes |
|-----------|----------|-------|
| **Linear** | b = base_bid Γ— (pCTR / avg_pCTR) | Simple proportional |
| **Threshold** | b = fixed_bid if pCTR > Ο„ else 0 | Binary decision |
| **ValueShading** | b = v_t / (1 + Ξ») | From second-price literature, adapted |

### Algorithm Comparison Matrix

| Algorithm | Adaptive? | Market Price Model? | Two-Sided? | Provable Regret? | Complexity |
|-----------|-----------|-------------------|------------|-----------------|------------|
| **DualOGD** | βœ… Online | Empirical CDF | ❌ (cap only) | βœ… Γ•(√T) | Medium |
| **TwoSidedDual** | βœ… Online | Empirical CDF | βœ… (cap+floor) | ❌ (heuristic) | Medium |
| **RLB** | βœ… DP | Neural dist. model | ❌ | ❌ | High |
| **Linear** | ❌ | None | ❌ | ❌ | Minimal |
| **Threshold** | ❌ | None | ❌ | ❌ | Minimal |

---

## 2. CTR Prediction Models

### 2.1 FinalMLP ⭐ RECOMMENDED

| Property | Detail |
|----------|--------|
| **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
| **Authors** | Kelong Mao et al., AAAI 2023 |
| **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
| **Criteo AUC** | **0.8149** |
| **Architecture** | Two independent MLP towers + feature gating (soft selection) + bilinear fusion |
| **Why Best for RTB** | Pure feed-forward MLP β€” <1ms inference, no attention/RNN overhead |
| **Library** | FuxiCTR (`pip install fuxictr`) or DeepCTR-Torch |

```python
# FinalMLP architecture
# Stream 1: MLP(features * gate_weights) β†’ feature selection
# Stream 2: MLP(features * (1-gate_weights)) β†’ complementary view
# Fusion: Bilinear(stream1_output, stream2_output) β†’ sigmoid
```

### 2.2 DeepFM β€” Simple Baseline

| Property | Detail |
|----------|--------|
| **Paper** | "DeepFM: A Factorization-Machine based Neural Network" |
| **Criteo AUC** | 0.8138 |
| **Architecture** | Shared embedding β†’ FM (2nd-order) + DNN β†’ sum β†’ sigmoid |

### 2.3 DCNv2 β€” Industry Standard

| Property | Detail |
|----------|--------|
| **Paper** | "DCN V2: Improved Deep & Cross Network" (WWW 2021) |
| **arXiv** | [2008.13535](https://arxiv.org/abs/2008.13535) |
| **Criteo AUC** | 0.8142-0.8144 |
| **Architecture** | CrossNetV2 (low-rank) + DNN in parallel |

### 2.4 BARS Meta-Finding ⚠️ IMPORTANT

| Property | Detail |
|----------|--------|
| **Paper** | "BARS-CTR: Open Benchmarking" [2009.05794] |
| **Finding** | After 7,000+ experiments: **differences between SOTA CTR models are ≀0.1-0.3% AUC** |
| **Implication** | Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering |

### CTR Model Comparison for RTB

| Model | AUC (Criteo) | Inference Speed | RTB Latency OK? |
|-------|-------------|-----------------|-----------------|
| **FinalMLP** | 0.8149 | ⭐⭐⭐⭐⭐ | βœ… Yes |
| DCNv2 | 0.8142 | ⭐⭐⭐⭐ | βœ… Yes |
| DeepFM | 0.8138 | ⭐⭐⭐⭐ | βœ… Yes |
| GDCN | 0.8161* | ⭐⭐⭐⭐ | βœ… Yes |
| DIN | β€” | ⭐⭐ | ❌ No |
| DIEN | β€” | ⭐ | ❌ No |

*Own data split, not directly comparable.

---

## 3. Clearing Price / Win Probability Prediction

### 3.1 Empirical CDF (Non-Parametric) ⭐ BASELINE

| Property | Detail |
|----------|--------|
| **Source** | Wang et al. (2023), Algorithm 1, Section 3.1 |
| **Method** | Maintain array of observed competing bids d_s |
| **Win Probability** | P(win|b) = GΜƒ_t(b) = (1/(t-1))βˆ‘_{s=1}^{t-1} πŸ™{b β‰₯ d_s} |
| **Expected Cost** | E[cost|win,b] = (1/GΜƒ_t(b)) Β· mean({d_s : d_s ≀ b}) |
| **Expected Reward** | r̃_t(v,b) = (v - b) · G̃_t(b) |
| **Expected Cost (dual)** | c̃_t(b) = b · G̃_t(b) |
| **Pros** | No training, theoretically sound, adapts online |
| **Cons** | No context, cold-start issue, requires full feedback |

```python
class EmpiricalCDF:
    def __init__(self):
        self.competing_bids = []
    
    def update(self, d_t):
        """d_t = maximum competing bid (observed under full feedback)"""
        self.competing_bids.append(d_t)
    
    def win_prob(self, b):
        if not self.competing_bids:
            return 0.5
        return np.mean([1.0 if b >= d else 0.0 for d in self.competing_bids])
    
    def expected_cost(self, b):
        wins = [d for d in self.competing_bids if b >= d]
        if not wins:
            return b
        return np.mean(wins)
```

### 3.2 TorchSurv β€” Deep Censored Learning

| Property | Detail |
|----------|--------|
| **Library** | TorchSurv (Novartis, 200β˜…) |
| **Paper** | [2404.10761](https://arxiv.org/abs/2404.10761) |
| **GitHub** | https://github.com/Novartis/torchsurv |
| **Install** | `pip install torchsurv` |
| **Method** | Neural network with Cox PH or Weibull AFT loss |
| **Censoring** | Win = uncensored (exact price), Loss = right-censored (price > bid) |
| **Output** | Survival function S(b|x) = P(market_price > b | features) |
| **Win Prob** | P(win|b,x) = 1 βˆ’ S(b|x) |

```python
from torchsurv.loss import cox

# log_hazard = model(features)  # shape (batch,)
# event = 1 if won (uncensored), 0 if lost (censored)
# time = market_price if won, bid if lost
loss = cox.neg_partial_log_likelihood(log_hazard, event, time)
```

### 3.3 Win Probability NN (Simplest ML)

| Property | Detail |
|----------|--------|
| **Method** | Binary classifier: P(win | bid_price, features) |
| **Pros** | Dead simple, BCELoss |
| **Cons** | Ignores price magnitude info when winning |
| **Architecture** | features βŠ• bid_price β†’ MLP β†’ sigmoid |

---

## 4. Datasets

### Verified on HuggingFace Hub

| Dataset | HF Path | Rows | Features | Label | Status |
|---------|---------|------|----------|-------|--------|
| **Criteo_x4** | `reczoo/Criteo_x4` | 45.8M | 13 dense + 26 cat | `Label` | βœ… Ready |
| Avazu_x4 | `reczoo/Avazu_x4` | 40.4M | 22 fields | `click` | βœ… Ready |

### RTB-Specific (External Only)

| Dataset | Description | Availability |
|---------|-------------|-------------|
| iPinYou | 19.5M impressions, 9 campaigns, market prices included | data.computational-advertising.org |
| YOYI | ~400M bid log records | Various mirrors |

**Key Gap**: No first-price auction bid logs on HF Hub. Criteo/Avazu have click labels but no bid/market price columns. We use **synthetic market price generation** conditioned on Criteo features for evaluation.

---

## 5. Codebases & Libraries

| Library | URL | Use |
|---------|-----|-----|
| FuxiCTR | github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven |
| DeepCTR-Torch | github.com/shenweichen/DeepCTR-Torch | 20+ CTR models |
| TorchSurv | github.com/Novartis/torchsurv | Survival analysis for clearing price |
| BARS | github.com/openbenchmark/BARS | Standardized CTR benchmark |
| rlb-dp | github.com/han-cai/rlb-dp | RL for RTB (baseline reference) |

---

## 6. Recommended Architecture

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              FIRST-PRICE BIDDING ENGINE               β”‚
β”‚                                                       β”‚
β”‚  Dual OGD: Ξ»_{t+1} = max(0, Ξ»_t - Ρ·(ρ - cΜƒ_t(b_t))) β”‚
β”‚  Two-Sided: ΞΌ (cap) + Ξ½ (floor) dual variables       β”‚
β”‚                                                       β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                PREDICTION MODELS                       β”‚
β”‚                                                       β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚ CTR: FinalMLP   β”‚  β”‚ Win Prob: Empirical CDF   β”‚   β”‚
β”‚  β”‚ v_t = pCTR Γ— V  β”‚  β”‚ GΜƒ_t(b) = frac of d_s ≀ b β”‚   β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜   β”‚
β”‚                                                       β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”‚
β”‚  β”‚ Optional: TorchSurv for contextual win prob  β”‚    β”‚
β”‚  β”‚ P(win|b,x) = 1 - S(b|x)                      β”‚    β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                   DATASETS                            β”‚
β”‚  Criteo_x4 (CTR training) + synthetic market prices  β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

---

## Paper Index

| # | Paper | arXiv | Focus |
|---|-------|-------|-------|
| 1 | Wang et al. β€” Learning to Bid in Repeated FPA with Budgets | 2304.13477 | ⭐ Primary algorithm |
| 2 | β€” Adaptive Bidding under Non-Stationarity | 2505.02796 | Distribution shift |
| 3 | β€” Contextual First-Price (Quantile) | 2603.07207 | Contextual extension |
| 4 | β€” Joint Value Estimation and Bidding | 2502.17292 | Simultaneous CTR + bidding |
| 5 | β€” No-Regret in Repeated FPA with Budgets | 2205.14572 | General framework |
| 6 | Cai et al. β€” RLB | 1701.02490 | RL baseline |
| 7 | β€” Leveraging Hints: Adaptive Bidding | 2211.06358 | Hints/forecasts |
| 8 | Mao et al. β€” FinalMLP | 2304.00902 | CTR model |
| 9 | Wang et al. β€” DCN V2 | 2008.13535 | CTR model |
| 10 | Guo et al. β€” DeepFM | β€” | CTR model |
| 11 | Zhu et al. β€” BARS-CTR | 2009.05794 | CTR benchmark |
| 12 | Wu et al. β€” Censored Price Prediction | β€” | Clearing price |
| 13 | β€” TorchSurv | 2404.10761 | Survival analysis library |