
ARIA: Closed-Loop Reliability Control for Autoregressive Decoding

Runtime reliability middleware for LLM inference.

ARIA is a lightweight, training-free control system that hooks into any HuggingFace Transformers model. It observes hidden states and logit distributions, detects anomalous behavior via calibrated statistical observers, and applies minimal corrective control inputs through proportional feedback, all via standard PyTorch forward hooks.
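To make the mechanism concrete, here is a minimal sketch of the hook pattern (illustrative only: `make_reliability_hook`, the no-op controller, and the Llama/Qwen-style `model.model.layers` path are assumptions, not the aria_llm API):

```python
import torch

def make_reliability_hook(controller):
    """Observe a block's hidden state and add a corrective input in place."""
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        u = controller(hidden)  # control input; zeros when nothing is anomalous
        corrected = hidden + u
        return (corrected,) + output[1:] if isinstance(output, tuple) else corrected
    return hook

def no_op_controller(hidden):
    return torch.zeros_like(hidden)  # stand-in for ARIA's observer/controller stack

# Intervene at the model's midpoint block, matching the ℓ/2 choice described below.
mid = model.model.layers[len(model.model.layers) // 2]
handle = mid.register_forward_hook(make_reliability_hook(no_op_controller))
# ... model.generate(...) now runs with the control loop attached ...
handle.remove()  # detach cleanly; the model is unmodified afterwards
```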

v0.4: What Changed (and Why)

v0.3 benchmark results were honest but damning: ARIA cut GSM8K accuracy by 5 points (90% → 85%) on Qwen3-8B-AWQ. The loop detector triggered on 65.6% of normal reasoning steps, and the trajectory diverger perturbed correct chains into incorrect ones.

v0.4 fixes every identified issue:

| Problem (v0.3) | Root Cause | Fix (v0.4) |
|---|---|---|
| Loop detector over-fires on reasoning | Entropy variance collapse ≠ content repetition | Content-aware detection: token trigram repetition ratio |
| Trigger threshold too low | severity > 0.5 caught normal variation | Raised to 0.7 across all observers |
| No ablation evidence | Couldn't prove each observer matters | Built-in ablation mode: disable any observer independently |
| No stability evidence | No proof ARIA preserves output distribution | Perplexity tracking: log P(top-1) with/without corrections |
| "Heuristic soup" criticism | No unifying framework | Control-theoretic framing: observer → controller → plant |
| No orthogonality evidence | Failure modes might be correlated | Signal vector logging + PCA correlation matrix |
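One of those fixes, perplexity tracking, reduces to comparing the mean log-probability of the greedy token across matched runs with and without corrections. A minimal sketch, assuming per-step logits are logged:

```python
import torch.nn.functional as F

def mean_top1_logprob(logits_per_step):
    """Mean log P(top-1) over a generation; a drop vs. the uncorrected run
    indicates the corrections are distorting the output distribution."""
    vals = [F.log_softmax(step, dim=-1).max().item() for step in logits_per_step]
    return sum(vals) / len(vals)
```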

Framing: Proportional Feedback Control

Following A-LQR (arxiv:2604.19018), we model the LLM as a dynamical system:

Plant:      z_{k+1} = φ_k(z_k) + u_k        (transformer dynamics + control input)
Observer:   s_k = [σ_compound, σ_drift, σ_loop, σ_median]  (4-output state estimator)
Controller: u_k = -K · max(σ) · v_k         (proportional feedback)

Where:

  • z_k ∈ ℝ^d = hidden state at layer k (intervention at layer ℓ/2)
  • φ_k = frozen transformer block
  • s_k = observer output (calibrated severity per failure mode)
  • K = auto-tuned proportional gain (from calibration variance)
  • v_k = correction direction (EMA/anchor/orthogonal depending on failure mode)

Honest difference from A-LQR: We use proportional control (P-controller) instead of LQR because we don't compute per-layer Jacobians (too expensive for middleware). A-LQR is optimal for single objectives; ARIA trades optimality for multi-objective coverage with zero setup cost.
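In code, the control law is a few lines. A minimal sketch with illustrative names (the steering and divergence controllers below additionally rescale by the hidden-state norm):

```python
import torch

def proportional_control(hidden, severities, direction, K, trigger=0.7):
    """u_k = -K · max(σ) · v_k, applied only past the calibrated trigger."""
    sigma = max(severities)              # worst of the four observer severities
    if sigma <= trigger:
        return torch.zeros_like(hidden)  # below threshold: leave the plant alone
    v = direction / direction.norm()     # unit correction direction v_k
    return -K * sigma * v
```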

Observers (Detectors)

| Observer | Signal | Calibration | Trigger | Paper Basis |
|---|---|---|---|---|
| Compound Error | JSD(p_t, p_{t-1}) + H_norm(p_t) | mean + 2.5σ | severity > 0.7 | arxiv:2602.02863 |
| Semantic Drift | 1 - cos(h_t, h_0) | mean + 2.5σ | severity > 0.7 | CAST (arxiv:2409.05907) |
| Logic Loop | Trigram repetition ratio (v0.4) | mean + 2.5σ | severity > 0.7 | Content-aware, not entropy-based |
| Median Trap | top-1 prob + inverse top-K entropy + TTR | mean + 2.5σ | severity > 0.7, 2/3 agree | ITI |
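To make the v0.4 loop signal concrete, a minimal sketch assuming a sliding window over recent token ids (the window size is an illustrative choice):

```python
def trigram_repetition_ratio(token_ids, window=64):
    """Fraction of repeated token trigrams in the recent window:
    ~0 on normal reasoning (mostly unique trigrams), ->1 on a degenerate loop."""
    recent = token_ids[-window:]
    trigrams = [tuple(recent[i:i + 3]) for i in range(len(recent) - 2)]
    if not trigrams:
        return 0.0
    return 1.0 - len(set(trigrams)) / len(trigrams)
```

Because this fires only on literal content repetition, low-entropy but non-repetitive reasoning no longer trips the observer, which was the dominant v0.3 false-positive mode.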

Controllers (Correctors)

| Controller | Control Law | When |
|---|---|---|
| Steering | u = K · σ · (EMA - h) / ‖EMA - h‖ · ‖h‖ | Compound error detected |
| Goal Anchor | u = K · σ · (h_0 - h) | Semantic drift detected |
| Divergence | u = K · σ · v_⊥ · ‖h‖ (Gram-Schmidt orthogonal) | Logic loop detected |
| Logit Temp | logits /= (1 + 0.15·K·σ), top-3 suppressed | Median trap detected |

Budget: max 1 correction per step. Highest severity wins.
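Under the hood this is a one-line arbitration. A minimal sketch with illustrative names:

```python
def select_correction(severities, controllers, trigger=0.7):
    """severities: {observer name -> σ}. Returns at most one control input."""
    name, sigma = max(severities.items(), key=lambda kv: kv[1])
    if sigma <= trigger:
        return None  # budget preserved: no correction this step
    return name, controllers[name](sigma)  # e.g. ("loop", divergence input)
```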

Benchmark Results

Qwen3-8B-AWQ on a T4 (v0.3 results; v0.4 results pending)

| Config | GSM8K (20) | Code (10) | Corrections | Loop triggers |
|---|---|---|---|---|
| Baseline | 90.0% | 100.0% | 0 | — |
| ARIA v0.3 | 85.0% ❌ | 100.0% | 6,724 | 336/512 (65.6%) |
| ARIA v0.4 | TBD | TBD | TBD | TBD |

v0.4's trigram-based loop detector should dramatically reduce false positives.

Ablation Study (built into v0.4 script)

The script automatically runs six ablation configs (a sketch of the sweep follows the list):

  1. Full ARIA (all observers)
  2. No compound error observer
  3. No semantic drift observer
  4. No logic loop observer
  5. No median trap observer
  6. Observe-only (no corrections)
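A hypothetical sketch of that sweep; the `disable` keyword and `run_benchmark` harness are assumptions for illustration, not documented aria_llm arguments:

```python
# Sweep: full stack, then each observer knocked out independently.
for disabled in ([], ["compound"], ["drift"], ["loop"], ["median"]):
    aria = ARIA.attach(model, tokenizer, auto=True, disable=disabled)  # hypothetical kwarg
    run_benchmark(model)  # hypothetical eval harness (e.g. the GSM8K subset)
    aria.detach()
# The sixth config, observe-only, logs severities but never applies u_k.
```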

What This Is (Honestly)

ARIA is runtime reliability middleware. Not SOTA on any single task. Not a replacement for better training. Not magic.

It's the LLM equivalent of TCP checksums or PID controllers: imperfect components + a correction layer = better compound reliability. The math is P_s = ∏(R_base + ΔR_i) instead of P_s = R^n.
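A worked example with made-up numbers shows why small per-step gains compound:

```python
R_base, dR, n = 0.98, 0.005, 50     # per-step reliability, per-step gain, chain length
baseline = R_base ** n              # P_s = R^n            ≈ 0.364
corrected = (R_base + dR) ** n      # P_s = ∏(R_base + ΔR) ≈ 0.470
print(f"baseline {baseline:.3f} vs corrected {corrected:.3f}")
```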

Current status (April 2026)

| Aspect | Rating | Evidence |
|---|---|---|
| Concept novelty | 8/10 | No other paper combines 4-mode detection + budget-limited correction |
| Current evidence | 5.5/10 → TBD | v0.3 hurt GSM8K; v0.4 fixes identified; awaiting Colab results |
| As systems paper | 7.5/10 | Control-theoretic framing + ablations would strengthen |
| "SOTA" claim | No | A-LQR beats us on single objectives |

What would make it strong

  1. ✅ Control-theoretic framing (v0.4)
  2. ✅ Ablation study (v0.4)
  3. ✅ Perplexity preservation measurement (v0.4)
  4. ✅ PCA orthogonality analysis (v0.4)
  5. ⬜ Full benchmark suite (GSM8K-1319, MATH-500, HumanEval, TruthfulQA)
  6. ⬜ Head-to-head vs ITI, CAA, A-LQR
  7. ⬜ Formal Lyapunov stability proof

Install

pip install torch transformers
git clone https://huggingface.co/SofiTesfay2010/aria-llm
cd aria-llm
pip install -e .

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
from aria_llm import ARIA, ARIAConfig

# Any HF causal LM works; the repo id here matches the model benchmarked above.
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-8B-AWQ", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B-AWQ")
input_ids = tokenizer("Janet has 3 apples and buys 5 more. How many now?", return_tensors="pt").input_ids.to(model.device)

aria = ARIA.attach(model, tokenizer, cs=20, sk=2.5, auto=True, verbose=True)
output = model.generate(input_ids, max_new_tokens=500)
print(aria.report_text())  # per-observer trigger counts and applied corrections
aria.detach()              # removes all hooks, restoring the vanilla model

License

Apache 2.0
