Bidding Algorithms Benchmark β Research Resources
First-Price Auction Focus | Generated: 2026-05-05
Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark
1. Bidding Algorithms for First-Price Auctions
1.1 DualOGD β Lagrangian Dual + Online Gradient Descent β PRIMARY
| Property |
Detail |
| Paper |
"Learning to Bid in Repeated First-Price Auctions with Budgets" |
| Authors |
Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong |
| Venue |
2023 |
| arXiv |
2304.13477 |
| HF Papers |
https://huggingface.co/papers/2304.13477 |
| Algorithm |
DualOGD β Lagrangian dual multiplier updated by online error gradient descent |
| Auction Type |
First-price |
| Constraints |
Budget cap: total spend β€ ΟT |
| Regret Bound |
Γ(βT) for both full-information and one-sided feedback |
| Key Formula |
Ξ»_{t+1} = Proj_{Ξ»>0}(Ξ»_t β Ρ·(Ο β cΜ_t(b_t))) |
| Bid Rule |
b_t = argmax_b (rΜ_t(v_t, b) β Ξ»_tΒ·cΜ_t(b)) |
| Prediction Models |
CTR predictor (v_t) + empirical CDF of competing bids (GΜ_t) |
| Code Pattern (Wang 2023, Algorithm 1) |
|
Ξ» = 0.0; Ξ΅ = 1.0 / sqrt(T); Ο = B / T
for t in range(T):
v_t = pCTR(features_t) * click_value
r_tilde = lambda b: mean([(v_t - b) for d in d_history if b >= d])
c_tilde = lambda b: mean([b for d in d_history if b >= d])
b_t = argmax_b (r_tilde(v_t, b) - Ξ» * c_tilde(b))
won = (b_t >= d_t); cost = b_t if won else 0
Ξ» = max(0, Ξ» - Ξ΅ * (Ο - c_tilde(b_t)))
1.2 TwoSidedDual β Budget Cap + Spend Floor
| Property |
Detail |
| Base |
Extension of Wang et al. (2023) |
| Constraints |
Total spend β€ B (cap) AND spend β₯ kΒ·B (floor) |
| Dual Variables |
ΞΌ for cap, Ξ½ for floor |
| Updates |
|
| ΞΌ_{t+1} = Proj_{ΞΌβ₯0}(ΞΌ_t β Ξ·βΒ·(Ο β cΜ_t(b_t))) |
cap penalty |
| Ξ½_{t+1} = Proj_{Ξ½β₯0}(Ξ½_t β Ξ·βΒ·(cΜ_t(b_t) β kΟ)) |
floor incentive |
| Bid Rule |
b_t = argmax_b (rΜ_t(v_t, b) β (ΞΌ_t β Ξ½_t)Β·cΜ_t(b)) |
| Key Insight |
When ΞΌ > Ξ½: bidding is restrained (ahead on spend). When Ξ½ > ΞΌ: bidding is encouraged (behind on spend floor). |
1.3 Adversarial Bidding β Non-Stationary Environments
| Property |
Detail |
| Paper |
"Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity" |
| arXiv |
2505.02796 |
| Algorithm |
Adaptive dual OGD with change-point detection |
| Key Insight |
When distribution shifts, resets dual multiplier and restarts learning |
1.4 Contextual First-Price (2026)
| Property |
Detail |
| Paper |
"Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" |
| arXiv |
2603.07207 |
| Algorithm |
Dual OGD + quantile-based contextual censored regression |
| Key Innovation |
Extends Wang to contextual (feature-based) auctions |
1.5 Joint Value Estimation + Bidding
| Property |
Detail |
| Paper |
"Joint Value Estimation and Bidding in Repeated First-Price Auctions" |
| arXiv |
2502.17292 |
| Key Insight |
Simultaneously learn CTR and bidding strategy β no separate CTR model training phase |
1.6 RLB β Reinforcement Learning (Baseline)
| Property |
Detail |
| Paper |
"Real-Time Bidding by Reinforcement Learning in Display Advertising" |
| Authors |
Han Cai et al., WSDM 2017 |
| arXiv |
1701.02490 |
| GitHub |
https://github.com/han-cai/rlb-dp (188 stars) |
| Algorithm |
MDP + Dynamic Programming + Neural value function |
| State |
(remaining auctions, remaining budget, features) |
| Action |
bid price a β [0, budget] |
| Prediction Models |
CTR ΞΈ(x) + market price distribution m(Ξ΄, x) |
1.7 Static Baselines
| Algorithm |
Bid Rule |
Notes |
| Linear |
b = base_bid Γ (pCTR / avg_pCTR) |
Simple proportional |
| Threshold |
b = fixed_bid if pCTR > Ο else 0 |
Binary decision |
| ValueShading |
b = v_t / (1 + Ξ») |
From second-price literature, adapted |
Algorithm Comparison Matrix
| Algorithm |
Adaptive? |
Market Price Model? |
Two-Sided? |
Provable Regret? |
Complexity |
| DualOGD |
β
Online |
Empirical CDF |
β (cap only) |
β
Γ(βT) |
Medium |
| TwoSidedDual |
β
Online |
Empirical CDF |
β
(cap+floor) |
β (heuristic) |
Medium |
| RLB |
β
DP |
Neural dist. model |
β |
β |
High |
| Linear |
β |
None |
β |
β |
Minimal |
| Threshold |
β |
None |
β |
β |
Minimal |
2. CTR Prediction Models
2.1 FinalMLP β RECOMMENDED
| Property |
Detail |
| Paper |
"FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
| Authors |
Kelong Mao et al., AAAI 2023 |
| arXiv |
2304.00902 |
| Criteo AUC |
0.8149 |
| Architecture |
Two independent MLP towers + feature gating (soft selection) + bilinear fusion |
| Why Best for RTB |
Pure feed-forward MLP β <1ms inference, no attention/RNN overhead |
| Library |
FuxiCTR (pip install fuxictr) or DeepCTR-Torch |
2.2 DeepFM β Simple Baseline
| Property |
Detail |
| Paper |
"DeepFM: A Factorization-Machine based Neural Network" |
| Criteo AUC |
0.8138 |
| Architecture |
Shared embedding β FM (2nd-order) + DNN β sum β sigmoid |
2.3 DCNv2 β Industry Standard
| Property |
Detail |
| Paper |
"DCN V2: Improved Deep & Cross Network" (WWW 2021) |
| arXiv |
2008.13535 |
| Criteo AUC |
0.8142-0.8144 |
| Architecture |
CrossNetV2 (low-rank) + DNN in parallel |
2.4 BARS Meta-Finding β οΈ IMPORTANT
| Property |
Detail |
| Paper |
"BARS-CTR: Open Benchmarking" [2009.05794] |
| Finding |
After 7,000+ experiments: differences between SOTA CTR models are β€0.1-0.3% AUC |
| Implication |
Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering |
CTR Model Comparison for RTB
| Model |
AUC (Criteo) |
Inference Speed |
RTB Latency OK? |
| FinalMLP |
0.8149 |
βββββ |
β
Yes |
| DCNv2 |
0.8142 |
ββββ |
β
Yes |
| DeepFM |
0.8138 |
ββββ |
β
Yes |
| GDCN |
0.8161* |
ββββ |
β
Yes |
| DIN |
β |
ββ |
β No |
| DIEN |
β |
β |
β No |
*Own data split, not directly comparable.
3. Clearing Price / Win Probability Prediction
3.1 Empirical CDF (Non-Parametric) β BASELINE
| Property |
Detail |
| Source |
Wang et al. (2023), Algorithm 1, Section 3.1 |
| Method |
Maintain array of observed competing bids d_s |
| Win Probability |
P(win |
| Expected Cost |
E[cost |
| Expected Reward |
rΜ_t(v,b) = (v - b) Β· GΜ_t(b) |
| Expected Cost (dual) |
cΜ_t(b) = b Β· GΜ_t(b) |
| Pros |
No training, theoretically sound, adapts online |
| Cons |
No context, cold-start issue, requires full feedback |
class EmpiricalCDF:
def __init__(self):
self.competing_bids = []
def update(self, d_t):
"""d_t = maximum competing bid (observed under full feedback)"""
self.competing_bids.append(d_t)
def win_prob(self, b):
if not self.competing_bids:
return 0.5
return np.mean([1.0 if b >= d else 0.0 for d in self.competing_bids])
def expected_cost(self, b):
wins = [d for d in self.competing_bids if b >= d]
if not wins:
return b
return np.mean(wins)
3.2 TorchSurv β Deep Censored Learning
| Property |
Detail |
| Library |
TorchSurv (Novartis, 200β
) |
| Paper |
2404.10761 |
| GitHub |
https://github.com/Novartis/torchsurv |
| Install |
pip install torchsurv |
| Method |
Neural network with Cox PH or Weibull AFT loss |
| Censoring |
Win = uncensored (exact price), Loss = right-censored (price > bid) |
| Output |
Survival function S(b |
| Win Prob |
P(win |
from torchsurv.loss import cox
loss = cox.neg_partial_log_likelihood(log_hazard, event, time)
3.3 Win Probability NN (Simplest ML)
| Property |
Detail |
| Method |
Binary classifier: P(win |
| Pros |
Dead simple, BCELoss |
| Cons |
Ignores price magnitude info when winning |
| Architecture |
features β bid_price β MLP β sigmoid |
4. Datasets
Verified on HuggingFace Hub
| Dataset |
HF Path |
Rows |
Features |
Label |
Status |
| Criteo_x4 |
reczoo/Criteo_x4 |
45.8M |
13 dense + 26 cat |
Label |
β
Ready |
| Avazu_x4 |
reczoo/Avazu_x4 |
40.4M |
22 fields |
click |
β
Ready |
RTB-Specific (External Only)
| Dataset |
Description |
Availability |
| iPinYou |
19.5M impressions, 9 campaigns, market prices included |
data.computational-advertising.org |
| YOYI |
~400M bid log records |
Various mirrors |
Key Gap: No first-price auction bid logs on HF Hub. Criteo/Avazu have click labels but no bid/market price columns. We use synthetic market price generation conditioned on Criteo features for evaluation.
5. Codebases & Libraries
| Library |
URL |
Use |
| FuxiCTR |
github.com/reczoo/FuxiCTR |
40+ CTR models, config-driven |
| DeepCTR-Torch |
github.com/shenweichen/DeepCTR-Torch |
20+ CTR models |
| TorchSurv |
github.com/Novartis/torchsurv |
Survival analysis for clearing price |
| BARS |
github.com/openbenchmark/BARS |
Standardized CTR benchmark |
| rlb-dp |
github.com/han-cai/rlb-dp |
RL for RTB (baseline reference) |
6. Recommended Architecture
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β FIRST-PRICE BIDDING ENGINE β
β β
β Dual OGD: Ξ»_{t+1} = max(0, Ξ»_t - Ρ·(Ο - cΜ_t(b_t))) β
β Two-Sided: ΞΌ (cap) + Ξ½ (floor) dual variables β
β β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β PREDICTION MODELS β
β β
β βββββββββββββββββββ ββββββββββββββββββββββββββββ β
β β CTR: FinalMLP β β Win Prob: Empirical CDF β β
β β v_t = pCTR Γ V β β GΜ_t(b) = frac of d_s β€ b β β
β βββββββββββββββββββ ββββββββββββββββββββββββββββ β
β β
β ββββββββββββββββββββββββββββββββββββββββββββββββ β
β β Optional: TorchSurv for contextual win prob β β
β β P(win|b,x) = 1 - S(b|x) β β
β ββββββββββββββββββββββββββββββββββββββββββββββββ β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β DATASETS β
β Criteo_x4 (CTR training) + synthetic market prices β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Paper Index
| # |
Paper |
arXiv |
Focus |
| 1 |
Wang et al. β Learning to Bid in Repeated FPA with Budgets |
2304.13477 |
β Primary algorithm |
| 2 |
β Adaptive Bidding under Non-Stationarity |
2505.02796 |
Distribution shift |
| 3 |
β Contextual First-Price (Quantile) |
2603.07207 |
Contextual extension |
| 4 |
β Joint Value Estimation and Bidding |
2502.17292 |
Simultaneous CTR + bidding |
| 5 |
β No-Regret in Repeated FPA with Budgets |
2205.14572 |
General framework |
| 6 |
Cai et al. β RLB |
1701.02490 |
RL baseline |
| 7 |
β Leveraging Hints: Adaptive Bidding |
2211.06358 |
Hints/forecasts |
| 8 |
Mao et al. β FinalMLP |
2304.00902 |
CTR model |
| 9 |
Wang et al. β DCN V2 |
2008.13535 |
CTR model |
| 10 |
Guo et al. β DeepFM |
β |
CTR model |
| 11 |
Zhu et al. β BARS-CTR |
2009.05794 |
CTR benchmark |
| 12 |
Wu et al. β Censored Price Prediction |
β |
Clearing price |
| 13 |
β TorchSurv |
2404.10761 |
Survival analysis library |