| # Bidding Algorithms Benchmark β Research Resources |
|
|
| > First-Price Auction Focus | Generated: 2026-05-05 |
| > Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark |
|
|
| --- |
|
|
| ## 1. Bidding Algorithms for First-Price Auctions |
|
|
| ### 1.1 DualOGD β Lagrangian Dual + Online Gradient Descent β PRIMARY |
|
|
| | Property | Detail | |
| |----------|--------| |
| | **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" | |
| | **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong | |
| | **Venue** | 2023 | |
| | **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) | |
| | **HF Papers** | https://huggingface.co/papers/2304.13477 | |
| | **Algorithm** | DualOGD β Lagrangian dual multiplier updated by online error gradient descent | |
| | **Auction Type** | **First-price** | |
| | **Constraints** | Budget cap: total spend β€ ΟT | |
| | **Regret Bound** | Γ(βT) for both full-information and one-sided feedback | |
| | **Key Formula** | Ξ»_{t+1} = Proj_{Ξ»>0}(Ξ»_t β Ρ·(Ο β cΜ_t(b_t))) | |
| | **Bid Rule** | b_t = argmax_b (rΜ_t(v_t, b) β Ξ»_tΒ·cΜ_t(b)) | |
| | **Prediction Models** | CTR predictor (v_t) + empirical CDF of competing bids (GΜ_t) | |
| | **Code Pattern (Wang 2023, Algorithm 1)** | |
| ```python |
| # Initialization |
| Ξ» = 0.0; Ξ΅ = 1.0 / sqrt(T); Ο = B / T |
| |
| for t in range(T): |
| v_t = pCTR(features_t) * click_value # from CTR model |
| |
| # Estimate reward and cost from historical competing bids |
| r_tilde = lambda b: mean([(v_t - b) for d in d_history if b >= d]) |
| c_tilde = lambda b: mean([b for d in d_history if b >= d]) |
| |
| # Bid: maximize cost-adjusted reward |
| b_t = argmax_b (r_tilde(v_t, b) - Ξ» * c_tilde(b)) |
| |
| # Observe maximum competing bid d_t (full feedback) |
| won = (b_t >= d_t); cost = b_t if won else 0 |
| |
| # Online gradient descent on dual multiplier |
| Ξ» = max(0, Ξ» - Ξ΅ * (Ο - c_tilde(b_t))) |
| ``` |
| |
| ### 1.2 TwoSidedDual β Budget Cap + Spend Floor |
|
|
| | Property | Detail | |
| |----------|--------| |
| | **Base** | Extension of Wang et al. (2023) | |
| | **Constraints** | Total spend β€ B (cap) AND spend β₯ kΒ·B (floor) | |
| | **Dual Variables** | ΞΌ for cap, Ξ½ for floor | |
| | **Updates** | |
| | ΞΌ_{t+1} = Proj_{ΞΌβ₯0}(ΞΌ_t β Ξ·βΒ·(Ο β cΜ_t(b_t))) | cap penalty | |
| | Ξ½_{t+1} = Proj_{Ξ½β₯0}(Ξ½_t β Ξ·βΒ·(cΜ_t(b_t) β kΟ)) | floor incentive | |
| | **Bid Rule** | b_t = argmax_b (rΜ_t(v_t, b) β (ΞΌ_t β Ξ½_t)Β·cΜ_t(b)) | |
| | **Key Insight** | When ΞΌ > Ξ½: bidding is restrained (ahead on spend). When Ξ½ > ΞΌ: bidding is encouraged (behind on spend floor). | |
| |
| ### 1.3 Adversarial Bidding β Non-Stationary Environments |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Paper** | "Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity" | |
| | **arXiv** | [2505.02796](https://arxiv.org/abs/2505.02796) | |
| | **Algorithm** | Adaptive dual OGD with change-point detection | |
| | **Key Insight** | When distribution shifts, resets dual multiplier and restarts learning | |
| |
| ### 1.4 Contextual First-Price (2026) |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" | |
| | **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) | |
| | **Algorithm** | Dual OGD + quantile-based contextual censored regression | |
| | **Key Innovation** | Extends Wang to contextual (feature-based) auctions | |
| |
| ### 1.5 Joint Value Estimation + Bidding |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Paper** | "Joint Value Estimation and Bidding in Repeated First-Price Auctions" | |
| | **arXiv** | [2502.17292](https://arxiv.org/abs/2502.17292) | |
| | **Key Insight** | Simultaneously learn CTR and bidding strategy β no separate CTR model training phase | |
| |
| ### 1.6 RLB β Reinforcement Learning (Baseline) |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" | |
| | **Authors** | Han Cai et al., WSDM 2017 | |
| | **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) | |
| | **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) | |
| | **Algorithm** | MDP + Dynamic Programming + Neural value function | |
| | **State** | (remaining auctions, remaining budget, features) | |
| | **Action** | bid price a β [0, budget] | |
| | **Prediction Models** | CTR ΞΈ(x) + market price distribution m(Ξ΄, x) | |
| |
| ### 1.7 Static Baselines |
| |
| | Algorithm | Bid Rule | Notes | |
| |-----------|----------|-------| |
| | **Linear** | b = base_bid Γ (pCTR / avg_pCTR) | Simple proportional | |
| | **Threshold** | b = fixed_bid if pCTR > Ο else 0 | Binary decision | |
| | **ValueShading** | b = v_t / (1 + Ξ») | From second-price literature, adapted | |
| |
| ### Algorithm Comparison Matrix |
| |
| | Algorithm | Adaptive? | Market Price Model? | Two-Sided? | Provable Regret? | Complexity | |
| |-----------|-----------|-------------------|------------|-----------------|------------| |
| | **DualOGD** | β
Online | Empirical CDF | β (cap only) | β
Γ(βT) | Medium | |
| | **TwoSidedDual** | β
Online | Empirical CDF | β
(cap+floor) | β (heuristic) | Medium | |
| | **RLB** | β
DP | Neural dist. model | β | β | High | |
| | **Linear** | β | None | β | β | Minimal | |
| | **Threshold** | β | None | β | β | Minimal | |
| |
| --- |
| |
| ## 2. CTR Prediction Models |
| |
| ### 2.1 FinalMLP β RECOMMENDED |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" | |
| | **Authors** | Kelong Mao et al., AAAI 2023 | |
| | **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) | |
| | **Criteo AUC** | **0.8149** | |
| | **Architecture** | Two independent MLP towers + feature gating (soft selection) + bilinear fusion | |
| | **Why Best for RTB** | Pure feed-forward MLP β <1ms inference, no attention/RNN overhead | |
| | **Library** | FuxiCTR (`pip install fuxictr`) or DeepCTR-Torch | |
| |
| ```python |
| # FinalMLP architecture |
| # Stream 1: MLP(features * gate_weights) β feature selection |
| # Stream 2: MLP(features * (1-gate_weights)) β complementary view |
| # Fusion: Bilinear(stream1_output, stream2_output) β sigmoid |
| ``` |
| |
| ### 2.2 DeepFM β Simple Baseline |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Paper** | "DeepFM: A Factorization-Machine based Neural Network" | |
| | **Criteo AUC** | 0.8138 | |
| | **Architecture** | Shared embedding β FM (2nd-order) + DNN β sum β sigmoid | |
| |
| ### 2.3 DCNv2 β Industry Standard |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Paper** | "DCN V2: Improved Deep & Cross Network" (WWW 2021) | |
| | **arXiv** | [2008.13535](https://arxiv.org/abs/2008.13535) | |
| | **Criteo AUC** | 0.8142-0.8144 | |
| | **Architecture** | CrossNetV2 (low-rank) + DNN in parallel | |
| |
| ### 2.4 BARS Meta-Finding β οΈ IMPORTANT |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Paper** | "BARS-CTR: Open Benchmarking" [2009.05794] | |
| | **Finding** | After 7,000+ experiments: **differences between SOTA CTR models are β€0.1-0.3% AUC** | |
| | **Implication** | Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering | |
| |
| ### CTR Model Comparison for RTB |
| |
| | Model | AUC (Criteo) | Inference Speed | RTB Latency OK? | |
| |-------|-------------|-----------------|-----------------| |
| | **FinalMLP** | 0.8149 | βββββ | β
Yes | |
| | DCNv2 | 0.8142 | ββββ | β
Yes | |
| | DeepFM | 0.8138 | ββββ | β
Yes | |
| | GDCN | 0.8161* | ββββ | β
Yes | |
| | DIN | β | ββ | β No | |
| | DIEN | β | β | β No | |
| |
| *Own data split, not directly comparable. |
| |
| --- |
| |
| ## 3. Clearing Price / Win Probability Prediction |
| |
| ### 3.1 Empirical CDF (Non-Parametric) β BASELINE |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Source** | Wang et al. (2023), Algorithm 1, Section 3.1 | |
| | **Method** | Maintain array of observed competing bids d_s | |
| | **Win Probability** | P(win|b) = GΜ_t(b) = (1/(t-1))β_{s=1}^{t-1} π{b β₯ d_s} | |
| | **Expected Cost** | E[cost|win,b] = (1/GΜ_t(b)) Β· mean({d_s : d_s β€ b}) | |
| | **Expected Reward** | rΜ_t(v,b) = (v - b) Β· GΜ_t(b) | |
| | **Expected Cost (dual)** | cΜ_t(b) = b Β· GΜ_t(b) | |
| | **Pros** | No training, theoretically sound, adapts online | |
| | **Cons** | No context, cold-start issue, requires full feedback | |
|
|
| ```python |
| class EmpiricalCDF: |
| def __init__(self): |
| self.competing_bids = [] |
| |
| def update(self, d_t): |
| """d_t = maximum competing bid (observed under full feedback)""" |
| self.competing_bids.append(d_t) |
| |
| def win_prob(self, b): |
| if not self.competing_bids: |
| return 0.5 |
| return np.mean([1.0 if b >= d else 0.0 for d in self.competing_bids]) |
| |
| def expected_cost(self, b): |
| wins = [d for d in self.competing_bids if b >= d] |
| if not wins: |
| return b |
| return np.mean(wins) |
| ``` |
|
|
| ### 3.2 TorchSurv β Deep Censored Learning |
|
|
| | Property | Detail | |
| |----------|--------| |
| | **Library** | TorchSurv (Novartis, 200β
) | |
| | **Paper** | [2404.10761](https://arxiv.org/abs/2404.10761) | |
| | **GitHub** | https://github.com/Novartis/torchsurv | |
| | **Install** | `pip install torchsurv` | |
| | **Method** | Neural network with Cox PH or Weibull AFT loss | |
| | **Censoring** | Win = uncensored (exact price), Loss = right-censored (price > bid) | |
| | **Output** | Survival function S(b|x) = P(market_price > b | features) | |
| | **Win Prob** | P(win|b,x) = 1 β S(b|x) | |
| |
| ```python |
| from torchsurv.loss import cox |
| |
| # log_hazard = model(features) # shape (batch,) |
| # event = 1 if won (uncensored), 0 if lost (censored) |
| # time = market_price if won, bid if lost |
| loss = cox.neg_partial_log_likelihood(log_hazard, event, time) |
| ``` |
| |
| ### 3.3 Win Probability NN (Simplest ML) |
| |
| | Property | Detail | |
| |----------|--------| |
| | **Method** | Binary classifier: P(win | bid_price, features) | |
| | **Pros** | Dead simple, BCELoss | |
| | **Cons** | Ignores price magnitude info when winning | |
| | **Architecture** | features β bid_price β MLP β sigmoid | |
| |
| --- |
| |
| ## 4. Datasets |
| |
| ### Verified on HuggingFace Hub |
| |
| | Dataset | HF Path | Rows | Features | Label | Status | |
| |---------|---------|------|----------|-------|--------| |
| | **Criteo_x4** | `reczoo/Criteo_x4` | 45.8M | 13 dense + 26 cat | `Label` | β
Ready | |
| | Avazu_x4 | `reczoo/Avazu_x4` | 40.4M | 22 fields | `click` | β
Ready | |
|
|
| ### RTB-Specific (External Only) |
|
|
| | Dataset | Description | Availability | |
| |---------|-------------|-------------| |
| | iPinYou | 19.5M impressions, 9 campaigns, market prices included | data.computational-advertising.org | |
| | YOYI | ~400M bid log records | Various mirrors | |
|
|
| **Key Gap**: No first-price auction bid logs on HF Hub. Criteo/Avazu have click labels but no bid/market price columns. We use **synthetic market price generation** conditioned on Criteo features for evaluation. |
|
|
| --- |
|
|
| ## 5. Codebases & Libraries |
|
|
| | Library | URL | Use | |
| |---------|-----|-----| |
| | FuxiCTR | github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven | |
| | DeepCTR-Torch | github.com/shenweichen/DeepCTR-Torch | 20+ CTR models | |
| | TorchSurv | github.com/Novartis/torchsurv | Survival analysis for clearing price | |
| | BARS | github.com/openbenchmark/BARS | Standardized CTR benchmark | |
| | rlb-dp | github.com/han-cai/rlb-dp | RL for RTB (baseline reference) | |
|
|
| --- |
|
|
| ## 6. Recommended Architecture |
|
|
| ``` |
| ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ |
| β FIRST-PRICE BIDDING ENGINE β |
| β β |
| β Dual OGD: Ξ»_{t+1} = max(0, Ξ»_t - Ρ·(Ο - cΜ_t(b_t))) β |
| β Two-Sided: ΞΌ (cap) + Ξ½ (floor) dual variables β |
| β β |
| ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€ |
| β PREDICTION MODELS β |
| β β |
| β βββββββββββββββββββ ββββββββββββββββββββββββββββ β |
| β β CTR: FinalMLP β β Win Prob: Empirical CDF β β |
| β β v_t = pCTR Γ V β β GΜ_t(b) = frac of d_s β€ b β β |
| β βββββββββββββββββββ ββββββββββββββββββββββββββββ β |
| β β |
| β ββββββββββββββββββββββββββββββββββββββββββββββββ β |
| β β Optional: TorchSurv for contextual win prob β β |
| β β P(win|b,x) = 1 - S(b|x) β β |
| β ββββββββββββββββββββββββββββββββββββββββββββββββ β |
| ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€ |
| β DATASETS β |
| β Criteo_x4 (CTR training) + synthetic market prices β |
| ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ |
| ``` |
|
|
| --- |
|
|
| ## Paper Index |
|
|
| | # | Paper | arXiv | Focus | |
| |---|-------|-------|-------| |
| | 1 | Wang et al. β Learning to Bid in Repeated FPA with Budgets | 2304.13477 | β Primary algorithm | |
| | 2 | β Adaptive Bidding under Non-Stationarity | 2505.02796 | Distribution shift | |
| | 3 | β Contextual First-Price (Quantile) | 2603.07207 | Contextual extension | |
| | 4 | β Joint Value Estimation and Bidding | 2502.17292 | Simultaneous CTR + bidding | |
| | 5 | β No-Regret in Repeated FPA with Budgets | 2205.14572 | General framework | |
| | 6 | Cai et al. β RLB | 1701.02490 | RL baseline | |
| | 7 | β Leveraging Hints: Adaptive Bidding | 2211.06358 | Hints/forecasts | |
| | 8 | Mao et al. β FinalMLP | 2304.00902 | CTR model | |
| | 9 | Wang et al. β DCN V2 | 2008.13535 | CTR model | |
| | 10 | Guo et al. β DeepFM | β | CTR model | |
| | 11 | Zhu et al. β BARS-CTR | 2009.05794 | CTR benchmark | |
| | 12 | Wu et al. β Censored Price Prediction | β | Clearing price | |
| | 13 | β TorchSurv | 2404.10761 | Survival analysis library | |
|
|