File size: 14,409 Bytes
a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a 9f3c561 a40d07a | 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 | # Bidding Algorithms Benchmark β Research Resources
> First-Price Auction Focus | Generated: 2026-05-05
> Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark
---
## 1. Bidding Algorithms for First-Price Auctions
### 1.1 DualOGD β Lagrangian Dual + Online Gradient Descent β PRIMARY
| Property | Detail |
|----------|--------|
| **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
| **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong |
| **Venue** | 2023 |
| **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
| **HF Papers** | https://huggingface.co/papers/2304.13477 |
| **Algorithm** | DualOGD β Lagrangian dual multiplier updated by online error gradient descent |
| **Auction Type** | **First-price** |
| **Constraints** | Budget cap: total spend β€ ΟT |
| **Regret Bound** | Γ(βT) for both full-information and one-sided feedback |
| **Key Formula** | Ξ»_{t+1} = Proj_{Ξ»>0}(Ξ»_t β Ρ·(Ο β cΜ_t(b_t))) |
| **Bid Rule** | b_t = argmax_b (rΜ_t(v_t, b) β Ξ»_tΒ·cΜ_t(b)) |
| **Prediction Models** | CTR predictor (v_t) + empirical CDF of competing bids (GΜ_t) |
| **Code Pattern (Wang 2023, Algorithm 1)** |
```python
# Initialization
Ξ» = 0.0; Ξ΅ = 1.0 / sqrt(T); Ο = B / T
for t in range(T):
v_t = pCTR(features_t) * click_value # from CTR model
# Estimate reward and cost from historical competing bids
r_tilde = lambda b: mean([(v_t - b) for d in d_history if b >= d])
c_tilde = lambda b: mean([b for d in d_history if b >= d])
# Bid: maximize cost-adjusted reward
b_t = argmax_b (r_tilde(v_t, b) - Ξ» * c_tilde(b))
# Observe maximum competing bid d_t (full feedback)
won = (b_t >= d_t); cost = b_t if won else 0
# Online gradient descent on dual multiplier
Ξ» = max(0, Ξ» - Ξ΅ * (Ο - c_tilde(b_t)))
```
### 1.2 TwoSidedDual β Budget Cap + Spend Floor
| Property | Detail |
|----------|--------|
| **Base** | Extension of Wang et al. (2023) |
| **Constraints** | Total spend β€ B (cap) AND spend β₯ kΒ·B (floor) |
| **Dual Variables** | ΞΌ for cap, Ξ½ for floor |
| **Updates** |
| ΞΌ_{t+1} = Proj_{ΞΌβ₯0}(ΞΌ_t β Ξ·βΒ·(Ο β cΜ_t(b_t))) | cap penalty |
| Ξ½_{t+1} = Proj_{Ξ½β₯0}(Ξ½_t β Ξ·βΒ·(cΜ_t(b_t) β kΟ)) | floor incentive |
| **Bid Rule** | b_t = argmax_b (rΜ_t(v_t, b) β (ΞΌ_t β Ξ½_t)Β·cΜ_t(b)) |
| **Key Insight** | When ΞΌ > Ξ½: bidding is restrained (ahead on spend). When Ξ½ > ΞΌ: bidding is encouraged (behind on spend floor). |
### 1.3 Adversarial Bidding β Non-Stationary Environments
| Property | Detail |
|----------|--------|
| **Paper** | "Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity" |
| **arXiv** | [2505.02796](https://arxiv.org/abs/2505.02796) |
| **Algorithm** | Adaptive dual OGD with change-point detection |
| **Key Insight** | When distribution shifts, resets dual multiplier and restarts learning |
### 1.4 Contextual First-Price (2026)
| Property | Detail |
|----------|--------|
| **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" |
| **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) |
| **Algorithm** | Dual OGD + quantile-based contextual censored regression |
| **Key Innovation** | Extends Wang to contextual (feature-based) auctions |
### 1.5 Joint Value Estimation + Bidding
| Property | Detail |
|----------|--------|
| **Paper** | "Joint Value Estimation and Bidding in Repeated First-Price Auctions" |
| **arXiv** | [2502.17292](https://arxiv.org/abs/2502.17292) |
| **Key Insight** | Simultaneously learn CTR and bidding strategy β no separate CTR model training phase |
### 1.6 RLB β Reinforcement Learning (Baseline)
| Property | Detail |
|----------|--------|
| **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
| **Authors** | Han Cai et al., WSDM 2017 |
| **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
| **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
| **Algorithm** | MDP + Dynamic Programming + Neural value function |
| **State** | (remaining auctions, remaining budget, features) |
| **Action** | bid price a β [0, budget] |
| **Prediction Models** | CTR ΞΈ(x) + market price distribution m(Ξ΄, x) |
### 1.7 Static Baselines
| Algorithm | Bid Rule | Notes |
|-----------|----------|-------|
| **Linear** | b = base_bid Γ (pCTR / avg_pCTR) | Simple proportional |
| **Threshold** | b = fixed_bid if pCTR > Ο else 0 | Binary decision |
| **ValueShading** | b = v_t / (1 + Ξ») | From second-price literature, adapted |
### Algorithm Comparison Matrix
| Algorithm | Adaptive? | Market Price Model? | Two-Sided? | Provable Regret? | Complexity |
|-----------|-----------|-------------------|------------|-----------------|------------|
| **DualOGD** | β
Online | Empirical CDF | β (cap only) | β
Γ(βT) | Medium |
| **TwoSidedDual** | β
Online | Empirical CDF | β
(cap+floor) | β (heuristic) | Medium |
| **RLB** | β
DP | Neural dist. model | β | β | High |
| **Linear** | β | None | β | β | Minimal |
| **Threshold** | β | None | β | β | Minimal |
---
## 2. CTR Prediction Models
### 2.1 FinalMLP β RECOMMENDED
| Property | Detail |
|----------|--------|
| **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
| **Authors** | Kelong Mao et al., AAAI 2023 |
| **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
| **Criteo AUC** | **0.8149** |
| **Architecture** | Two independent MLP towers + feature gating (soft selection) + bilinear fusion |
| **Why Best for RTB** | Pure feed-forward MLP β <1ms inference, no attention/RNN overhead |
| **Library** | FuxiCTR (`pip install fuxictr`) or DeepCTR-Torch |
```python
# FinalMLP architecture
# Stream 1: MLP(features * gate_weights) β feature selection
# Stream 2: MLP(features * (1-gate_weights)) β complementary view
# Fusion: Bilinear(stream1_output, stream2_output) β sigmoid
```
### 2.2 DeepFM β Simple Baseline
| Property | Detail |
|----------|--------|
| **Paper** | "DeepFM: A Factorization-Machine based Neural Network" |
| **Criteo AUC** | 0.8138 |
| **Architecture** | Shared embedding β FM (2nd-order) + DNN β sum β sigmoid |
### 2.3 DCNv2 β Industry Standard
| Property | Detail |
|----------|--------|
| **Paper** | "DCN V2: Improved Deep & Cross Network" (WWW 2021) |
| **arXiv** | [2008.13535](https://arxiv.org/abs/2008.13535) |
| **Criteo AUC** | 0.8142-0.8144 |
| **Architecture** | CrossNetV2 (low-rank) + DNN in parallel |
### 2.4 BARS Meta-Finding β οΈ IMPORTANT
| Property | Detail |
|----------|--------|
| **Paper** | "BARS-CTR: Open Benchmarking" [2009.05794] |
| **Finding** | After 7,000+ experiments: **differences between SOTA CTR models are β€0.1-0.3% AUC** |
| **Implication** | Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering |
### CTR Model Comparison for RTB
| Model | AUC (Criteo) | Inference Speed | RTB Latency OK? |
|-------|-------------|-----------------|-----------------|
| **FinalMLP** | 0.8149 | βββββ | β
Yes |
| DCNv2 | 0.8142 | ββββ | β
Yes |
| DeepFM | 0.8138 | ββββ | β
Yes |
| GDCN | 0.8161* | ββββ | β
Yes |
| DIN | β | ββ | β No |
| DIEN | β | β | β No |
*Own data split, not directly comparable.
---
## 3. Clearing Price / Win Probability Prediction
### 3.1 Empirical CDF (Non-Parametric) β BASELINE
| Property | Detail |
|----------|--------|
| **Source** | Wang et al. (2023), Algorithm 1, Section 3.1 |
| **Method** | Maintain array of observed competing bids d_s |
| **Win Probability** | P(win|b) = GΜ_t(b) = (1/(t-1))β_{s=1}^{t-1} π{b β₯ d_s} |
| **Expected Cost** | E[cost|win,b] = (1/GΜ_t(b)) Β· mean({d_s : d_s β€ b}) |
| **Expected Reward** | rΜ_t(v,b) = (v - b) Β· GΜ_t(b) |
| **Expected Cost (dual)** | cΜ_t(b) = b Β· GΜ_t(b) |
| **Pros** | No training, theoretically sound, adapts online |
| **Cons** | No context, cold-start issue, requires full feedback |
```python
class EmpiricalCDF:
def __init__(self):
self.competing_bids = []
def update(self, d_t):
"""d_t = maximum competing bid (observed under full feedback)"""
self.competing_bids.append(d_t)
def win_prob(self, b):
if not self.competing_bids:
return 0.5
return np.mean([1.0 if b >= d else 0.0 for d in self.competing_bids])
def expected_cost(self, b):
wins = [d for d in self.competing_bids if b >= d]
if not wins:
return b
return np.mean(wins)
```
### 3.2 TorchSurv β Deep Censored Learning
| Property | Detail |
|----------|--------|
| **Library** | TorchSurv (Novartis, 200β
) |
| **Paper** | [2404.10761](https://arxiv.org/abs/2404.10761) |
| **GitHub** | https://github.com/Novartis/torchsurv |
| **Install** | `pip install torchsurv` |
| **Method** | Neural network with Cox PH or Weibull AFT loss |
| **Censoring** | Win = uncensored (exact price), Loss = right-censored (price > bid) |
| **Output** | Survival function S(b|x) = P(market_price > b | features) |
| **Win Prob** | P(win|b,x) = 1 β S(b|x) |
```python
from torchsurv.loss import cox
# log_hazard = model(features) # shape (batch,)
# event = 1 if won (uncensored), 0 if lost (censored)
# time = market_price if won, bid if lost
loss = cox.neg_partial_log_likelihood(log_hazard, event, time)
```
### 3.3 Win Probability NN (Simplest ML)
| Property | Detail |
|----------|--------|
| **Method** | Binary classifier: P(win | bid_price, features) |
| **Pros** | Dead simple, BCELoss |
| **Cons** | Ignores price magnitude info when winning |
| **Architecture** | features β bid_price β MLP β sigmoid |
---
## 4. Datasets
### Verified on HuggingFace Hub
| Dataset | HF Path | Rows | Features | Label | Status |
|---------|---------|------|----------|-------|--------|
| **Criteo_x4** | `reczoo/Criteo_x4` | 45.8M | 13 dense + 26 cat | `Label` | β
Ready |
| Avazu_x4 | `reczoo/Avazu_x4` | 40.4M | 22 fields | `click` | β
Ready |
### RTB-Specific (External Only)
| Dataset | Description | Availability |
|---------|-------------|-------------|
| iPinYou | 19.5M impressions, 9 campaigns, market prices included | data.computational-advertising.org |
| YOYI | ~400M bid log records | Various mirrors |
**Key Gap**: No first-price auction bid logs on HF Hub. Criteo/Avazu have click labels but no bid/market price columns. We use **synthetic market price generation** conditioned on Criteo features for evaluation.
---
## 5. Codebases & Libraries
| Library | URL | Use |
|---------|-----|-----|
| FuxiCTR | github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven |
| DeepCTR-Torch | github.com/shenweichen/DeepCTR-Torch | 20+ CTR models |
| TorchSurv | github.com/Novartis/torchsurv | Survival analysis for clearing price |
| BARS | github.com/openbenchmark/BARS | Standardized CTR benchmark |
| rlb-dp | github.com/han-cai/rlb-dp | RL for RTB (baseline reference) |
---
## 6. Recommended Architecture
```
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β FIRST-PRICE BIDDING ENGINE β
β β
β Dual OGD: Ξ»_{t+1} = max(0, Ξ»_t - Ρ·(Ο - cΜ_t(b_t))) β
β Two-Sided: ΞΌ (cap) + Ξ½ (floor) dual variables β
β β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β PREDICTION MODELS β
β β
β βββββββββββββββββββ ββββββββββββββββββββββββββββ β
β β CTR: FinalMLP β β Win Prob: Empirical CDF β β
β β v_t = pCTR Γ V β β GΜ_t(b) = frac of d_s β€ b β β
β βββββββββββββββββββ ββββββββββββββββββββββββββββ β
β β
β ββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Optional: TorchSurv for contextual win prob β β
β β P(win|b,x) = 1 - S(b|x) β β
β ββββββββββββββββββββββββββββββββββββββββββββββββ β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β DATASETS β
β Criteo_x4 (CTR training) + synthetic market prices β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
```
---
## Paper Index
| # | Paper | arXiv | Focus |
|---|-------|-------|-------|
| 1 | Wang et al. β Learning to Bid in Repeated FPA with Budgets | 2304.13477 | β Primary algorithm |
| 2 | β Adaptive Bidding under Non-Stationarity | 2505.02796 | Distribution shift |
| 3 | β Contextual First-Price (Quantile) | 2603.07207 | Contextual extension |
| 4 | β Joint Value Estimation and Bidding | 2502.17292 | Simultaneous CTR + bidding |
| 5 | β No-Regret in Repeated FPA with Budgets | 2205.14572 | General framework |
| 6 | Cai et al. β RLB | 1701.02490 | RL baseline |
| 7 | β Leveraging Hints: Adaptive Bidding | 2211.06358 | Hints/forecasts |
| 8 | Mao et al. β FinalMLP | 2304.00902 | CTR model |
| 9 | Wang et al. β DCN V2 | 2008.13535 | CTR model |
| 10 | Guo et al. β DeepFM | β | CTR model |
| 11 | Zhu et al. β BARS-CTR | 2009.05794 | CTR benchmark |
| 12 | Wu et al. β Censored Price Prediction | β | Clearing price |
| 13 | β TorchSurv | 2404.10761 | Survival analysis library |
|