# Bidding Algorithms Benchmark — Research Resources > First-Price Auction Focus | Generated: 2026-05-05 > Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark --- ## 1. Bidding Algorithms for First-Price Auctions ### 1.1 DualOGD — Lagrangian Dual + Online Gradient Descent ⭐ PRIMARY | Property | Detail | |----------|--------| | **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" | | **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong | | **Venue** | 2023 | | **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) | | **HF Papers** | https://huggingface.co/papers/2304.13477 | | **Algorithm** | DualOGD — Lagrangian dual multiplier updated by online error gradient descent | | **Auction Type** | **First-price** | | **Constraints** | Budget cap: total spend ≤ ρT | | **Regret Bound** | Õ(√T) for both full-information and one-sided feedback | | **Key Formula** | λ_{t+1} = Proj_{λ>0}(λ_t − ε·(ρ − c̃_t(b_t))) | | **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b)) | | **Prediction Models** | CTR predictor (v_t) + empirical CDF of competing bids (G̃_t) | | **Code Pattern (Wang 2023, Algorithm 1)** | ```python # Initialization λ = 0.0; ε = 1.0 / sqrt(T); ρ = B / T for t in range(T): v_t = pCTR(features_t) * click_value # from CTR model # Estimate reward and cost from historical competing bids r_tilde = lambda b: mean([(v_t - b) for d in d_history if b >= d]) c_tilde = lambda b: mean([b for d in d_history if b >= d]) # Bid: maximize cost-adjusted reward b_t = argmax_b (r_tilde(v_t, b) - λ * c_tilde(b)) # Observe maximum competing bid d_t (full feedback) won = (b_t >= d_t); cost = b_t if won else 0 # Online gradient descent on dual multiplier λ = max(0, λ - ε * (ρ - c_tilde(b_t))) ``` ### 1.2 TwoSidedDual — Budget Cap + Spend Floor | Property | Detail | |----------|--------| | **Base** | Extension of Wang et al. (2023) | | **Constraints** | Total spend ≤ B (cap) AND spend ≥ k·B (floor) | | **Dual Variables** | μ for cap, ν for floor | | **Updates** | | μ_{t+1} = Proj_{μ≥0}(μ_t − η₁·(ρ − c̃_t(b_t))) | cap penalty | | ν_{t+1} = Proj_{ν≥0}(ν_t − η₂·(c̃_t(b_t) − kρ)) | floor incentive | | **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − (μ_t − ν_t)·c̃_t(b)) | | **Key Insight** | When μ > ν: bidding is restrained (ahead on spend). When ν > μ: bidding is encouraged (behind on spend floor). | ### 1.3 Adversarial Bidding — Non-Stationary Environments | Property | Detail | |----------|--------| | **Paper** | "Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity" | | **arXiv** | [2505.02796](https://arxiv.org/abs/2505.02796) | | **Algorithm** | Adaptive dual OGD with change-point detection | | **Key Insight** | When distribution shifts, resets dual multiplier and restarts learning | ### 1.4 Contextual First-Price (2026) | Property | Detail | |----------|--------| | **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" | | **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) | | **Algorithm** | Dual OGD + quantile-based contextual censored regression | | **Key Innovation** | Extends Wang to contextual (feature-based) auctions | ### 1.5 Joint Value Estimation + Bidding | Property | Detail | |----------|--------| | **Paper** | "Joint Value Estimation and Bidding in Repeated First-Price Auctions" | | **arXiv** | [2502.17292](https://arxiv.org/abs/2502.17292) | | **Key Insight** | Simultaneously learn CTR and bidding strategy — no separate CTR model training phase | ### 1.6 RLB — Reinforcement Learning (Baseline) | Property | Detail | |----------|--------| | **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" | | **Authors** | Han Cai et al., WSDM 2017 | | **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) | | **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) | | **Algorithm** | MDP + Dynamic Programming + Neural value function | | **State** | (remaining auctions, remaining budget, features) | | **Action** | bid price a ∈ [0, budget] | | **Prediction Models** | CTR θ(x) + market price distribution m(δ, x) | ### 1.7 Static Baselines | Algorithm | Bid Rule | Notes | |-----------|----------|-------| | **Linear** | b = base_bid × (pCTR / avg_pCTR) | Simple proportional | | **Threshold** | b = fixed_bid if pCTR > τ else 0 | Binary decision | | **ValueShading** | b = v_t / (1 + λ) | From second-price literature, adapted | ### Algorithm Comparison Matrix | Algorithm | Adaptive? | Market Price Model? | Two-Sided? | Provable Regret? | Complexity | |-----------|-----------|-------------------|------------|-----------------|------------| | **DualOGD** | ✅ Online | Empirical CDF | ❌ (cap only) | ✅ Õ(√T) | Medium | | **TwoSidedDual** | ✅ Online | Empirical CDF | ✅ (cap+floor) | ❌ (heuristic) | Medium | | **RLB** | ✅ DP | Neural dist. model | ❌ | ❌ | High | | **Linear** | ❌ | None | ❌ | ❌ | Minimal | | **Threshold** | ❌ | None | ❌ | ❌ | Minimal | --- ## 2. CTR Prediction Models ### 2.1 FinalMLP ⭐ RECOMMENDED | Property | Detail | |----------|--------| | **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" | | **Authors** | Kelong Mao et al., AAAI 2023 | | **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) | | **Criteo AUC** | **0.8149** | | **Architecture** | Two independent MLP towers + feature gating (soft selection) + bilinear fusion | | **Why Best for RTB** | Pure feed-forward MLP — <1ms inference, no attention/RNN overhead | | **Library** | FuxiCTR (`pip install fuxictr`) or DeepCTR-Torch | ```python # FinalMLP architecture # Stream 1: MLP(features * gate_weights) → feature selection # Stream 2: MLP(features * (1-gate_weights)) → complementary view # Fusion: Bilinear(stream1_output, stream2_output) → sigmoid ``` ### 2.2 DeepFM — Simple Baseline | Property | Detail | |----------|--------| | **Paper** | "DeepFM: A Factorization-Machine based Neural Network" | | **Criteo AUC** | 0.8138 | | **Architecture** | Shared embedding → FM (2nd-order) + DNN → sum → sigmoid | ### 2.3 DCNv2 — Industry Standard | Property | Detail | |----------|--------| | **Paper** | "DCN V2: Improved Deep & Cross Network" (WWW 2021) | | **arXiv** | [2008.13535](https://arxiv.org/abs/2008.13535) | | **Criteo AUC** | 0.8142-0.8144 | | **Architecture** | CrossNetV2 (low-rank) + DNN in parallel | ### 2.4 BARS Meta-Finding ⚠️ IMPORTANT | Property | Detail | |----------|--------| | **Paper** | "BARS-CTR: Open Benchmarking" [2009.05794] | | **Finding** | After 7,000+ experiments: **differences between SOTA CTR models are ≤0.1-0.3% AUC** | | **Implication** | Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering | ### CTR Model Comparison for RTB | Model | AUC (Criteo) | Inference Speed | RTB Latency OK? | |-------|-------------|-----------------|-----------------| | **FinalMLP** | 0.8149 | ⭐⭐⭐⭐⭐ | ✅ Yes | | DCNv2 | 0.8142 | ⭐⭐⭐⭐ | ✅ Yes | | DeepFM | 0.8138 | ⭐⭐⭐⭐ | ✅ Yes | | GDCN | 0.8161* | ⭐⭐⭐⭐ | ✅ Yes | | DIN | — | ⭐⭐ | ❌ No | | DIEN | — | ⭐ | ❌ No | *Own data split, not directly comparable. --- ## 3. Clearing Price / Win Probability Prediction ### 3.1 Empirical CDF (Non-Parametric) ⭐ BASELINE | Property | Detail | |----------|--------| | **Source** | Wang et al. (2023), Algorithm 1, Section 3.1 | | **Method** | Maintain array of observed competing bids d_s | | **Win Probability** | P(win|b) = G̃_t(b) = (1/(t-1))∑_{s=1}^{t-1} 𝟙{b ≥ d_s} | | **Expected Cost** | E[cost|win,b] = (1/G̃_t(b)) · mean({d_s : d_s ≤ b}) | | **Expected Reward** | r̃_t(v,b) = (v - b) · G̃_t(b) | | **Expected Cost (dual)** | c̃_t(b) = b · G̃_t(b) | | **Pros** | No training, theoretically sound, adapts online | | **Cons** | No context, cold-start issue, requires full feedback | ```python class EmpiricalCDF: def __init__(self): self.competing_bids = [] def update(self, d_t): """d_t = maximum competing bid (observed under full feedback)""" self.competing_bids.append(d_t) def win_prob(self, b): if not self.competing_bids: return 0.5 return np.mean([1.0 if b >= d else 0.0 for d in self.competing_bids]) def expected_cost(self, b): wins = [d for d in self.competing_bids if b >= d] if not wins: return b return np.mean(wins) ``` ### 3.2 TorchSurv — Deep Censored Learning | Property | Detail | |----------|--------| | **Library** | TorchSurv (Novartis, 200★) | | **Paper** | [2404.10761](https://arxiv.org/abs/2404.10761) | | **GitHub** | https://github.com/Novartis/torchsurv | | **Install** | `pip install torchsurv` | | **Method** | Neural network with Cox PH or Weibull AFT loss | | **Censoring** | Win = uncensored (exact price), Loss = right-censored (price > bid) | | **Output** | Survival function S(b|x) = P(market_price > b | features) | | **Win Prob** | P(win|b,x) = 1 − S(b|x) | ```python from torchsurv.loss import cox # log_hazard = model(features) # shape (batch,) # event = 1 if won (uncensored), 0 if lost (censored) # time = market_price if won, bid if lost loss = cox.neg_partial_log_likelihood(log_hazard, event, time) ``` ### 3.3 Win Probability NN (Simplest ML) | Property | Detail | |----------|--------| | **Method** | Binary classifier: P(win | bid_price, features) | | **Pros** | Dead simple, BCELoss | | **Cons** | Ignores price magnitude info when winning | | **Architecture** | features ⊕ bid_price → MLP → sigmoid | --- ## 4. Datasets ### Verified on HuggingFace Hub | Dataset | HF Path | Rows | Features | Label | Status | |---------|---------|------|----------|-------|--------| | **Criteo_x4** | `reczoo/Criteo_x4` | 45.8M | 13 dense + 26 cat | `Label` | ✅ Ready | | Avazu_x4 | `reczoo/Avazu_x4` | 40.4M | 22 fields | `click` | ✅ Ready | ### RTB-Specific (External Only) | Dataset | Description | Availability | |---------|-------------|-------------| | iPinYou | 19.5M impressions, 9 campaigns, market prices included | data.computational-advertising.org | | YOYI | ~400M bid log records | Various mirrors | **Key Gap**: No first-price auction bid logs on HF Hub. Criteo/Avazu have click labels but no bid/market price columns. We use **synthetic market price generation** conditioned on Criteo features for evaluation. --- ## 5. Codebases & Libraries | Library | URL | Use | |---------|-----|-----| | FuxiCTR | github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven | | DeepCTR-Torch | github.com/shenweichen/DeepCTR-Torch | 20+ CTR models | | TorchSurv | github.com/Novartis/torchsurv | Survival analysis for clearing price | | BARS | github.com/openbenchmark/BARS | Standardized CTR benchmark | | rlb-dp | github.com/han-cai/rlb-dp | RL for RTB (baseline reference) | --- ## 6. Recommended Architecture ``` ┌──────────────────────────────────────────────────────┐ │ FIRST-PRICE BIDDING ENGINE │ │ │ │ Dual OGD: λ_{t+1} = max(0, λ_t - ε·(ρ - c̃_t(b_t))) │ │ Two-Sided: μ (cap) + ν (floor) dual variables │ │ │ ├──────────────────────────────────────────────────────┤ │ PREDICTION MODELS │ │ │ │ ┌─────────────────┐ ┌──────────────────────────┐ │ │ │ CTR: FinalMLP │ │ Win Prob: Empirical CDF │ │ │ │ v_t = pCTR × V │ │ G̃_t(b) = frac of d_s ≤ b │ │ │ └─────────────────┘ └──────────────────────────┘ │ │ │ │ ┌──────────────────────────────────────────────┐ │ │ │ Optional: TorchSurv for contextual win prob │ │ │ │ P(win|b,x) = 1 - S(b|x) │ │ │ └──────────────────────────────────────────────┘ │ ├──────────────────────────────────────────────────────┤ │ DATASETS │ │ Criteo_x4 (CTR training) + synthetic market prices │ └──────────────────────────────────────────────────────┘ ``` --- ## Paper Index | # | Paper | arXiv | Focus | |---|-------|-------|-------| | 1 | Wang et al. — Learning to Bid in Repeated FPA with Budgets | 2304.13477 | ⭐ Primary algorithm | | 2 | — Adaptive Bidding under Non-Stationarity | 2505.02796 | Distribution shift | | 3 | — Contextual First-Price (Quantile) | 2603.07207 | Contextual extension | | 4 | — Joint Value Estimation and Bidding | 2502.17292 | Simultaneous CTR + bidding | | 5 | — No-Regret in Repeated FPA with Budgets | 2205.14572 | General framework | | 6 | Cai et al. — RLB | 1701.02490 | RL baseline | | 7 | — Leveraging Hints: Adaptive Bidding | 2211.06358 | Hints/forecasts | | 8 | Mao et al. — FinalMLP | 2304.00902 | CTR model | | 9 | Wang et al. — DCN V2 | 2008.13535 | CTR model | | 10 | Guo et al. — DeepFM | — | CTR model | | 11 | Zhu et al. — BARS-CTR | 2009.05794 | CTR benchmark | | 12 | Wu et al. — Censored Price Prediction | — | Clearing price | | 13 | — TorchSurv | 2404.10761 | Survival analysis library |