# Bidding Algorithms Benchmark — Research Resources

> First-Price Auction Focus | Generated: 2026-05-05
> Repo: https://huggingface.co/hamverbot/bidding_algorithms_benchmark

---

## 1. Bidding Algorithms for First-Price Auctions

### 1.1 DualOGD — Lagrangian Dual + Online Gradient Descent ⭐ PRIMARY

| Property | Detail |
|----------|--------|
| **Paper** | "Learning to Bid in Repeated First-Price Auctions with Budgets" |
| **Authors** | Qian Wang, Zongjun Yang, Xiaotie Deng, Yuqing Kong |
| **Venue** | 2023 |
| **arXiv** | [2304.13477](https://arxiv.org/abs/2304.13477) |
| **HF Papers** | https://huggingface.co/papers/2304.13477 |
| **Algorithm** | DualOGD — Lagrangian dual multiplier updated by online error gradient descent |
| **Auction Type** | **First-price** |
| **Constraints** | Budget cap: total spend ≤ ρT |
| **Regret Bound** | Õ(√T) for both full-information and one-sided feedback |
| **Key Formula** | λ_{t+1} = Proj_{λ>0}(λ_t − ε·(ρ − c̃_t(b_t))) |
| **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − λ_t·c̃_t(b)) |
| **Prediction Models** | CTR predictor (v_t) + empirical CDF of competing bids (G̃_t) |
| **Code Pattern (Wang 2023, Algorithm 1)** |
```python
# Initialization
λ = 0.0; ε = 1.0 / sqrt(T); ρ = B / T

for t in range(T):
    v_t = pCTR(features_t) * click_value  # from CTR model
    
    # Estimate reward and cost from historical competing bids
    r_tilde = lambda b: mean([(v_t - b) for d in d_history if b >= d])
    c_tilde = lambda b: mean([b for d in d_history if b >= d])
    
    # Bid: maximize cost-adjusted reward
    b_t = argmax_b (r_tilde(v_t, b) - λ * c_tilde(b))
    
    # Observe maximum competing bid d_t (full feedback)
    won = (b_t >= d_t); cost = b_t if won else 0
    
    # Online gradient descent on dual multiplier
    λ = max(0, λ - ε * (ρ - c_tilde(b_t)))
```

### 1.2 TwoSidedDual — Budget Cap + Spend Floor

| Property | Detail |
|----------|--------|
| **Base** | Extension of Wang et al. (2023) |
| **Constraints** | Total spend ≤ B (cap) AND spend ≥ k·B (floor) |
| **Dual Variables** | μ for cap, ν for floor |
| **Updates** |
| μ_{t+1} = Proj_{μ≥0}(μ_t − η₁·(ρ − c̃_t(b_t))) | cap penalty |
| ν_{t+1} = Proj_{ν≥0}(ν_t − η₂·(c̃_t(b_t) − kρ)) | floor incentive |
| **Bid Rule** | b_t = argmax_b (r̃_t(v_t, b) − (μ_t − ν_t)·c̃_t(b)) |
| **Key Insight** | When μ > ν: bidding is restrained (ahead on spend). When ν > μ: bidding is encouraged (behind on spend floor). |

### 1.3 Adversarial Bidding — Non-Stationary Environments

| Property | Detail |
|----------|--------|
| **Paper** | "Adaptive Bidding Policies for First-Price Auctions with Budget Constraints under Non-stationarity" |
| **arXiv** | [2505.02796](https://arxiv.org/abs/2505.02796) |
| **Algorithm** | Adaptive dual OGD with change-point detection |
| **Key Insight** | When distribution shifts, resets dual multiplier and restarts learning |

### 1.4 Contextual First-Price (2026)

| Property | Detail |
|----------|--------|
| **Paper** | "Online Bidding for Contextual First-Price Auctions with Budgets under One-Sided Information Feedback" |
| **arXiv** | [2603.07207](https://arxiv.org/abs/2603.07207) |
| **Algorithm** | Dual OGD + quantile-based contextual censored regression |
| **Key Innovation** | Extends Wang to contextual (feature-based) auctions |

### 1.5 Joint Value Estimation + Bidding

| Property | Detail |
|----------|--------|
| **Paper** | "Joint Value Estimation and Bidding in Repeated First-Price Auctions" |
| **arXiv** | [2502.17292](https://arxiv.org/abs/2502.17292) |
| **Key Insight** | Simultaneously learn CTR and bidding strategy — no separate CTR model training phase |

### 1.6 RLB — Reinforcement Learning (Baseline)

| Property | Detail |
|----------|--------|
| **Paper** | "Real-Time Bidding by Reinforcement Learning in Display Advertising" |
| **Authors** | Han Cai et al., WSDM 2017 |
| **arXiv** | [1701.02490](https://arxiv.org/abs/1701.02490) |
| **GitHub** | https://github.com/han-cai/rlb-dp (188 stars) |
| **Algorithm** | MDP + Dynamic Programming + Neural value function |
| **State** | (remaining auctions, remaining budget, features) |
| **Action** | bid price a ∈ [0, budget] |
| **Prediction Models** | CTR θ(x) + market price distribution m(δ, x) |

### 1.7 Static Baselines

| Algorithm | Bid Rule | Notes |
|-----------|----------|-------|
| **Linear** | b = base_bid × (pCTR / avg_pCTR) | Simple proportional |
| **Threshold** | b = fixed_bid if pCTR > τ else 0 | Binary decision |
| **ValueShading** | b = v_t / (1 + λ) | From second-price literature, adapted |

### Algorithm Comparison Matrix

| Algorithm | Adaptive? | Market Price Model? | Two-Sided? | Provable Regret? | Complexity |
|-----------|-----------|-------------------|------------|-----------------|------------|
| **DualOGD** | ✅ Online | Empirical CDF | ❌ (cap only) | ✅ Õ(√T) | Medium |
| **TwoSidedDual** | ✅ Online | Empirical CDF | ✅ (cap+floor) | ❌ (heuristic) | Medium |
| **RLB** | ✅ DP | Neural dist. model | ❌ | ❌ | High |
| **Linear** | ❌ | None | ❌ | ❌ | Minimal |
| **Threshold** | ❌ | None | ❌ | ❌ | Minimal |

---

## 2. CTR Prediction Models

### 2.1 FinalMLP ⭐ RECOMMENDED

| Property | Detail |
|----------|--------|
| **Paper** | "FinalMLP: An Enhanced Two-Stream MLP Model for CTR Prediction" |
| **Authors** | Kelong Mao et al., AAAI 2023 |
| **arXiv** | [2304.00902](https://arxiv.org/abs/2304.00902) |
| **Criteo AUC** | **0.8149** |
| **Architecture** | Two independent MLP towers + feature gating (soft selection) + bilinear fusion |
| **Why Best for RTB** | Pure feed-forward MLP — <1ms inference, no attention/RNN overhead |
| **Library** | FuxiCTR (`pip install fuxictr`) or DeepCTR-Torch |

```python
# FinalMLP architecture
# Stream 1: MLP(features * gate_weights) → feature selection
# Stream 2: MLP(features * (1-gate_weights)) → complementary view
# Fusion: Bilinear(stream1_output, stream2_output) → sigmoid
```

### 2.2 DeepFM — Simple Baseline

| Property | Detail |
|----------|--------|
| **Paper** | "DeepFM: A Factorization-Machine based Neural Network" |
| **Criteo AUC** | 0.8138 |
| **Architecture** | Shared embedding → FM (2nd-order) + DNN → sum → sigmoid |

### 2.3 DCNv2 — Industry Standard

| Property | Detail |
|----------|--------|
| **Paper** | "DCN V2: Improved Deep & Cross Network" (WWW 2021) |
| **arXiv** | [2008.13535](https://arxiv.org/abs/2008.13535) |
| **Criteo AUC** | 0.8142-0.8144 |
| **Architecture** | CrossNetV2 (low-rank) + DNN in parallel |

### 2.4 BARS Meta-Finding ⚠️ IMPORTANT

| Property | Detail |
|----------|--------|
| **Paper** | "BARS-CTR: Open Benchmarking" [2009.05794] |
| **Finding** | After 7,000+ experiments: **differences between SOTA CTR models are ≤0.1-0.3% AUC** |
| **Implication** | Architecture choice matters less than data preprocessing, hyperparameter tuning, and feature engineering |

### CTR Model Comparison for RTB

| Model | AUC (Criteo) | Inference Speed | RTB Latency OK? |
|-------|-------------|-----------------|-----------------|
| **FinalMLP** | 0.8149 | ⭐⭐⭐⭐⭐ | ✅ Yes |
| DCNv2 | 0.8142 | ⭐⭐⭐⭐ | ✅ Yes |
| DeepFM | 0.8138 | ⭐⭐⭐⭐ | ✅ Yes |
| GDCN | 0.8161* | ⭐⭐⭐⭐ | ✅ Yes |
| DIN | — | ⭐⭐ | ❌ No |
| DIEN | — | ⭐ | ❌ No |

*Own data split, not directly comparable.

---

## 3. Clearing Price / Win Probability Prediction

### 3.1 Empirical CDF (Non-Parametric) ⭐ BASELINE

| Property | Detail |
|----------|--------|
| **Source** | Wang et al. (2023), Algorithm 1, Section 3.1 |
| **Method** | Maintain array of observed competing bids d_s |
| **Win Probability** | P(win|b) = G̃_t(b) = (1/(t-1))∑_{s=1}^{t-1} 𝟙{b ≥ d_s} |
| **Expected Cost** | E[cost|win,b] = (1/G̃_t(b)) · mean({d_s : d_s ≤ b}) |
| **Expected Reward** | r̃_t(v,b) = (v - b) · G̃_t(b) |
| **Expected Cost (dual)** | c̃_t(b) = b · G̃_t(b) |
| **Pros** | No training, theoretically sound, adapts online |
| **Cons** | No context, cold-start issue, requires full feedback |

```python
class EmpiricalCDF:
    def __init__(self):
        self.competing_bids = []
    
    def update(self, d_t):
        """d_t = maximum competing bid (observed under full feedback)"""
        self.competing_bids.append(d_t)
    
    def win_prob(self, b):
        if not self.competing_bids:
            return 0.5
        return np.mean([1.0 if b >= d else 0.0 for d in self.competing_bids])
    
    def expected_cost(self, b):
        wins = [d for d in self.competing_bids if b >= d]
        if not wins:
            return b
        return np.mean(wins)
```

### 3.2 TorchSurv — Deep Censored Learning

| Property | Detail |
|----------|--------|
| **Library** | TorchSurv (Novartis, 200★) |
| **Paper** | [2404.10761](https://arxiv.org/abs/2404.10761) |
| **GitHub** | https://github.com/Novartis/torchsurv |
| **Install** | `pip install torchsurv` |
| **Method** | Neural network with Cox PH or Weibull AFT loss |
| **Censoring** | Win = uncensored (exact price), Loss = right-censored (price > bid) |
| **Output** | Survival function S(b|x) = P(market_price > b | features) |
| **Win Prob** | P(win|b,x) = 1 − S(b|x) |

```python
from torchsurv.loss import cox

# log_hazard = model(features)  # shape (batch,)
# event = 1 if won (uncensored), 0 if lost (censored)
# time = market_price if won, bid if lost
loss = cox.neg_partial_log_likelihood(log_hazard, event, time)
```

### 3.3 Win Probability NN (Simplest ML)

| Property | Detail |
|----------|--------|
| **Method** | Binary classifier: P(win | bid_price, features) |
| **Pros** | Dead simple, BCELoss |
| **Cons** | Ignores price magnitude info when winning |
| **Architecture** | features ⊕ bid_price → MLP → sigmoid |

---

## 4. Datasets

### Verified on HuggingFace Hub

| Dataset | HF Path | Rows | Features | Label | Status |
|---------|---------|------|----------|-------|--------|
| **Criteo_x4** | `reczoo/Criteo_x4` | 45.8M | 13 dense + 26 cat | `Label` | ✅ Ready |
| Avazu_x4 | `reczoo/Avazu_x4` | 40.4M | 22 fields | `click` | ✅ Ready |

### RTB-Specific (External Only)

| Dataset | Description | Availability |
|---------|-------------|-------------|
| iPinYou | 19.5M impressions, 9 campaigns, market prices included | data.computational-advertising.org |
| YOYI | ~400M bid log records | Various mirrors |

**Key Gap**: No first-price auction bid logs on HF Hub. Criteo/Avazu have click labels but no bid/market price columns. We use **synthetic market price generation** conditioned on Criteo features for evaluation.

---

## 5. Codebases & Libraries

| Library | URL | Use |
|---------|-----|-----|
| FuxiCTR | github.com/reczoo/FuxiCTR | 40+ CTR models, config-driven |
| DeepCTR-Torch | github.com/shenweichen/DeepCTR-Torch | 20+ CTR models |
| TorchSurv | github.com/Novartis/torchsurv | Survival analysis for clearing price |
| BARS | github.com/openbenchmark/BARS | Standardized CTR benchmark |
| rlb-dp | github.com/han-cai/rlb-dp | RL for RTB (baseline reference) |

---

## 6. Recommended Architecture

```
┌──────────────────────────────────────────────────────┐
│              FIRST-PRICE BIDDING ENGINE               │
│                                                       │
│  Dual OGD: λ_{t+1} = max(0, λ_t - ε·(ρ - c̃_t(b_t))) │
│  Two-Sided: μ (cap) + ν (floor) dual variables       │
│                                                       │
├──────────────────────────────────────────────────────┤
│                PREDICTION MODELS                       │
│                                                       │
│  ┌─────────────────┐  ┌──────────────────────────┐   │
│  │ CTR: FinalMLP   │  │ Win Prob: Empirical CDF   │   │
│  │ v_t = pCTR × V  │  │ G̃_t(b) = frac of d_s ≤ b │   │
│  └─────────────────┘  └──────────────────────────┘   │
│                                                       │
│  ┌──────────────────────────────────────────────┐    │
│  │ Optional: TorchSurv for contextual win prob  │    │
│  │ P(win|b,x) = 1 - S(b|x)                      │    │
│  └──────────────────────────────────────────────┘    │
├──────────────────────────────────────────────────────┤
│                   DATASETS                            │
│  Criteo_x4 (CTR training) + synthetic market prices  │
└──────────────────────────────────────────────────────┘
```

---

## Paper Index

| # | Paper | arXiv | Focus |
|---|-------|-------|-------|
| 1 | Wang et al. — Learning to Bid in Repeated FPA with Budgets | 2304.13477 | ⭐ Primary algorithm |
| 2 | — Adaptive Bidding under Non-Stationarity | 2505.02796 | Distribution shift |
| 3 | — Contextual First-Price (Quantile) | 2603.07207 | Contextual extension |
| 4 | — Joint Value Estimation and Bidding | 2502.17292 | Simultaneous CTR + bidding |
| 5 | — No-Regret in Repeated FPA with Budgets | 2205.14572 | General framework |
| 6 | Cai et al. — RLB | 1701.02490 | RL baseline |
| 7 | — Leveraging Hints: Adaptive Bidding | 2211.06358 | Hints/forecasts |
| 8 | Mao et al. — FinalMLP | 2304.00902 | CTR model |
| 9 | Wang et al. — DCN V2 | 2008.13535 | CTR model |
| 10 | Guo et al. — DeepFM | — | CTR model |
| 11 | Zhu et al. — BARS-CTR | 2009.05794 | CTR benchmark |
| 12 | Wu et al. — Censored Price Prediction | — | Clearing price |
| 13 | — TorchSurv | 2404.10761 | Survival analysis library |