bidding_algorithms_benchmark / RESEARCH_RESOURCES.md

Upload RESEARCH_RESOURCES.md

a40d07a verified 3 days ago

preview code

raw

history blame contribute delete

14.4 kB

Bidding Algorithms Benchmark — Research Resources

First-Price Auction Focus | Generated: 2026-05-05 Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark

1. Bidding Algorithms for First-Price Auctions

1.1 DualOGD — Lagrangian Dual + Online Gradient Descent ⭐ PRIMARY

Property	Detail
Paper	"Learning to Bid in Repeated First-Price Auctions with Budgets"
Authors	Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong
Venue	2023
arXiv	2304.13477
HF Papers	https://huggingface.co/papers/2304.13477
Algorithm	DualOGD — Lagrangian dual multiplier updated by online error gradient descent
Auction Type	First-price
Constraints	Budget cap: total spend ≤ ρT
Regret Bound	Õ(√T) for both full-information and one-sided feedback
Key Formula	λ_{t+1} = Proj_{λ>0}(λ_t − ε·(ρ − c̃_t(b_t)))
Bid Rule	b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b))
Prediction Models	CTR predictor (v_t) + empirical CDF of competing bids (G̃_t)
Code Pattern (Wang 2023, Algorithm 1)

# Initialization
λ = 0.0; ε = 1.0 / sqrt(T); ρ = B / T

for t in range(T):
    v_t = pCTR(features_t) * click_value  # from CTR model
    
    # Estimate reward and cost from historical competing bids
    r_tilde = lambda b: mean([(v_t - b) for d in d_history if b >= d])
    c_tilde = lambda b: mean([b for d in d_history if b >= d])
    
    # Bid: maximize cost-adjusted reward
    b_t = argmax_b (r_tilde(v_t, b) - λ * c_tilde(b))
    
    # Observe maximum competing bid d_t (full feedback)
    won = (b_t >= d_t); cost = b_t if won else 0
    
    # Online gradient descent on dual multiplier
    λ = max(0, λ - ε * (ρ - c_tilde(b_t)))

1.2 TwoSidedDual — Budget Cap + Spend Floor

Property	Detail
Base	Extension of Wang et al. (2023)
Constraints	Total spend ≤ B (cap) AND spend ≥ k·B (floor)
Dual Variables	μ for cap, ν for floor
Updates
μ_{t+1} = Proj_{μ≥0}(μ_t − η₁·(ρ − c̃_t(b_t)))	cap penalty
ν_{t+1} = Proj_{ν≥0}(ν_t − η₂·(c̃_t(b_t) − kρ))	floor incentive
Bid Rule	b_t = argmax_b (r̃_t(v_t, b) − (μ_t − ν_t)·c̃_t(b))
Key Insight	When μ > ν: bidding is restrained (ahead on spend). When ν > μ: bidding is encouraged (behind on spend floor).

1.3 Adversarial Bidding — Non-Stationary Environments

Property	Detail
Paper	"Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity"
arXiv	2505.02796
Algorithm	Adaptive dual OGD with change-point detection
Key Insight	When distribution shifts, resets dual multiplier and restarts learning

1.4 Contextual First-Price (2026)

Property	Detail
Paper	"Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback"
arXiv	2603.07207
Algorithm	Dual OGD + quantile-based contextual censored regression
Key Innovation	Extends Wang to contextual (feature-based) auctions

1.5 Joint Value Estimation + Bidding

Property	Detail
Paper	"Joint Value Estimation and Bidding in Repeated First-Price Auctions"
arXiv	2502.17292
Key Insight	Simultaneously learn CTR and bidding strategy — no separate CTR model training phase

1.6 RLB — Reinforcement Learning (Baseline)

Property	Detail
Paper	"Real-Time Bidding by Reinforcement Learning in Display Advertising"
Authors	Han Cai et al., WSDM 2017
arXiv	1701.02490
GitHub	https://github.com/han-cai/rlb-dp (188 stars)
Algorithm	MDP + Dynamic Programming + Neural value function
State	(remaining auctions, remaining budget, features)
Action	bid price a ∈ [0, budget]
Prediction Models	CTR θ(x) + market price distribution m(δ, x)

1.7 Static Baselines

Algorithm	Bid Rule	Notes
Linear	b = base_bid × (pCTR / avg_pCTR)	Simple proportional
Threshold	b = fixed_bid if pCTR > τ else 0	Binary decision
ValueShading	b = v_t / (1 + λ)	From second-price literature, adapted

Algorithm Comparison Matrix

Algorithm	Adaptive?	Market Price Model?	Two-Sided?	Provable Regret?	Complexity
DualOGD	✅ Online	Empirical CDF	❌ (cap only)	✅ Õ(√T)	Medium
TwoSidedDual	✅ Online	Empirical CDF	✅ (cap+floor)	❌ (heuristic)	Medium
RLB	✅ DP	Neural dist. model	❌	❌	High
Linear	❌	None	❌	❌	Minimal
Threshold	❌	None	❌	❌	Minimal

2. CTR Prediction Models

2.1 FinalMLP ⭐ RECOMMENDED

Property	Detail
Paper	"FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction"
Authors	Kelong Mao et al., AAAI 2023
arXiv	2304.00902
Criteo AUC	0.8149
Architecture	Two independent MLP towers + feature gating (soft selection) + bilinear fusion
Why Best for RTB	Pure feed-forward MLP — <1ms inference, no attention/RNN overhead
Library	FuxiCTR (`pip install fuxictr`) or DeepCTR-Torch

# FinalMLP architecture
# Stream 1: MLP(features * gate_weights) → feature selection
# Stream 2: MLP(features * (1-gate_weights)) → complementary view
# Fusion: Bilinear(stream1_output, stream2_output) → sigmoid

2.2 DeepFM — Simple Baseline

Property	Detail
Paper	"DeepFM: A Factorization-Machine based Neural Network"
Criteo AUC	0.8138
Architecture	Shared embedding → FM (2nd-order) + DNN → sum → sigmoid

2.3 DCNv2 — Industry Standard

Property	Detail
Paper	"DCN V2: Improved Deep & Cross Network" (WWW 2021)
arXiv	2008.13535
Criteo AUC	0.8142-0.8144
Architecture	CrossNetV2 (low-rank) + DNN in parallel

2.4 BARS Meta-Finding ⚠️ IMPORTANT

Property	Detail
Paper	"BARS-CTR: Open Benchmarking" [2009.05794]
Finding	After 7,000+ experiments: differences between SOTA CTR models are ≤0.1-0.3% AUC
Implication	Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering

CTR Model Comparison for RTB

Model	AUC (Criteo)	Inference Speed	RTB Latency OK?
FinalMLP	0.8149	⭐⭐⭐⭐⭐	✅ Yes
DCNv2	0.8142	⭐⭐⭐⭐	✅ Yes
DeepFM	0.8138	⭐⭐⭐⭐	✅ Yes
GDCN	0.8161*	⭐⭐⭐⭐	✅ Yes
DIN	—	⭐⭐	❌ No
DIEN	—	⭐	❌ No

*Own data split, not directly comparable.

3. Clearing Price / Win Probability Prediction

3.1 Empirical CDF (Non-Parametric) ⭐ BASELINE

Property	Detail
Source	Wang et al. (2023), Algorithm 1, Section 3.1
Method	Maintain array of observed competing bids d_s
Win Probability	P(win
Expected Cost	E[cost
Expected Reward	r̃_t(v,b) = (v - b) · G̃_t(b)
Expected Cost (dual)	c̃_t(b) = b · G̃_t(b)
Pros	No training, theoretically sound, adapts online
Cons	No context, cold-start issue, requires full feedback

class EmpiricalCDF:
    def __init__(self):
        self.competing_bids = []
    
    def update(self, d_t):
        """d_t = maximum competing bid (observed under full feedback)"""
        self.competing_bids.append(d_t)
    
    def win_prob(self, b):
        if not self.competing_bids:
            return 0.5
        return np.mean([1.0 if b >= d else 0.0 for d in self.competing_bids])
    
    def expected_cost(self, b):
        wins = [d for d in self.competing_bids if b >= d]
        if not wins:
            return b
        return np.mean(wins)

3.2 TorchSurv — Deep Censored Learning

Property	Detail
Library	TorchSurv (Novartis, 200★)
Paper	2404.10761
GitHub	https://github.com/Novartis/torchsurv
Install	`pip install torchsurv`
Method	Neural network with Cox PH or Weibull AFT loss
Censoring	Win = uncensored (exact price), Loss = right-censored (price > bid)
Output	Survival function S(b
Win Prob	P(win

from torchsurv.loss import cox

# log_hazard = model(features)  # shape (batch,)
# event = 1 if won (uncensored), 0 if lost (censored)
# time = market_price if won, bid if lost
loss = cox.neg_partial_log_likelihood(log_hazard, event, time)

3.3 Win Probability NN (Simplest ML)

Property	Detail
Method	Binary classifier: P(win
Pros	Dead simple, BCELoss
Cons	Ignores price magnitude info when winning
Architecture	features ⊕ bid_price → MLP → sigmoid

4. Datasets

Verified on HuggingFace Hub

Dataset	HF Path	Rows	Features	Label	Status
Criteo_x4	`reczoo/Criteo_x4`	45.8M	13 dense + 26 cat	`Label`	✅ Ready
Avazu_x4	`reczoo/Avazu_x4`	40.4M	22 fields	`click`	✅ Ready

RTB-Specific (External Only)

Dataset	Description	Availability
iPinYou	19.5M impressions, 9 campaigns, market prices included	data.computational-advertising.org
YOYI	~400M bid log records	Various mirrors

Key Gap: No first-price auction bid logs on HF Hub. Criteo/Avazu have click labels but no bid/market price columns. We use synthetic market price generation conditioned on Criteo features for evaluation.

5. Codebases & Libraries

Library	URL	Use
FuxiCTR	github.com/reczoo/FuxiCTR	40+ CTR models, config-driven
DeepCTR-Torch	github.com/shenweichen/DeepCTR-Torch	20+ CTR models
TorchSurv	github.com/Novartis/torchsurv	Survival analysis for clearing price
BARS	github.com/openbenchmark/BARS	Standardized CTR benchmark
rlb-dp	github.com/han-cai/rlb-dp	RL for RTB (baseline reference)

6. Recommended Architecture

┌──────────────────────────────────────────────────────┐
│              FIRST-PRICE BIDDING ENGINE               │
│                                                       │
│  Dual OGD: λ_{t+1} = max(0, λ_t - ε·(ρ - c̃_t(b_t))) │
│  Two-Sided: μ (cap) + ν (floor) dual variables       │
│                                                       │
├──────────────────────────────────────────────────────┤
│                PREDICTION MODELS                       │
│                                                       │
│  ┌─────────────────┐  ┌──────────────────────────┐   │
│  │ CTR: FinalMLP   │  │ Win Prob: Empirical CDF   │   │
│  │ v_t = pCTR × V  │  │ G̃_t(b) = frac of d_s ≤ b │   │
│  └─────────────────┘  └──────────────────────────┘   │
│                                                       │
│  ┌──────────────────────────────────────────────┐    │
│  │ Optional: TorchSurv for contextual win prob  │    │
│  │ P(win|b,x) = 1 - S(b|x)                      │    │
│  └──────────────────────────────────────────────┘    │
├──────────────────────────────────────────────────────┤
│                   DATASETS                            │
│  Criteo_x4 (CTR training) + synthetic market prices  │
└──────────────────────────────────────────────────────┘

Paper Index

#	Paper	arXiv	Focus
1	Wang et al. — Learning to Bid in Repeated FPA with Budgets	2304.13477	⭐ Primary algorithm
2	— Adaptive Bidding under Non-Stationarity	2505.02796	Distribution shift
3	— Contextual First-Price (Quantile)	2603.07207	Contextual extension
4	— Joint Value Estimation and Bidding	2502.17292	Simultaneous CTR + bidding
5	— No-Regret in Repeated FPA with Budgets	2205.14572	General framework
6	Cai et al. — RLB	1701.02490	RL baseline
7	— Leveraging Hints: Adaptive Bidding	2211.06358	Hints/forecasts
8	Mao et al. — FinalMLP	2304.00902	CTR model
9	Wang et al. — DCN V2	2008.13535	CTR model
10	Guo et al. — DeepFM	—	CTR model
11	Zhu et al. — BARS-CTR	2009.05794	CTR benchmark
12	Wu et al. — Censored Price Prediction	—	Clearing price
13	— TorchSurv	2404.10761	Survival analysis library