arxiv:2605.19008

Learn-by-Wire Training Control Governance: Bounded Autonomous Training Under Stress for Stability and Efficiency

Published on May 18 · Submitted by Anis Radianis on May 21
Authors:

Abstract

AI-generated summary: Learn-by-Wire Guard (LBW-Guard) enhances language model training stability and efficiency by providing bounded autonomous control over optimizer execution without altering the underlying training objective.

Modern language-model training is increasingly exposed to instability, degraded runs, and wasted compute, especially under aggressive learning-rate, scale, and runtime-stress conditions. This paper introduces Learn-by-Wire Guard (LBW-Guard), a bounded autonomous training-control governance layer that operates above AdamW. Rather than replacing the optimizer update rule, LBW-Guard observes training telemetry, interprets instability-sensitive regimes, and applies bounded control to optimizer execution while preserving fixed training objectives. We evaluate LBW-Guard in a Qwen2.5-centered stress-and-robustness suite using WikiText-103, with Qwen2.5-7B as the empirical anchor, model-size comparisons against Qwen2.5-3B and Qwen2.5-14B, learning-rate stress tests, gradient-clipping baselines, and a no-LoRA TinyLlama-1B full-parameter sanity check. In the 7B reference setting, LBW-Guard reduces final perplexity from 13.21 to 10.74, an 18.7% improvement, while reducing end-to-end time from 392.54s to 357.02s, a 1.10x speedup. Under stronger learning-rate stress, AdamW degrades to 1885.24 final perplexity at LR=3e-3 and 659.76 at LR=1e-3, whereas LBW-Guard remains trainable at 11.57 and 10.33, respectively. Gradient-clipping baselines do not reproduce this effect. These results support a scoped systems conclusion that stability-sensitive LLM training can benefit from a governance plane above the optimizer. LBW-Guard provides evidence that bounded runtime control can preserve productive compute under stress while remaining distinct from optimizer replacement and local gradient suppression.
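The headline percentages in the abstract follow directly from the raw figures it quotes. The short check below recomputes them; the function and variable names are illustrative and do not come from the paper or its package.

```python
# Figures quoted in the abstract for the Qwen2.5-7B reference setting.
adamw_ppl, guard_ppl = 13.21, 10.74            # final perplexity
adamw_time_s, guard_time_s = 392.54, 357.02    # end-to-end time in seconds


def relative_reduction(baseline: float, value: float) -> float:
    """Fraction by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline


def speedup(baseline_time: float, new_time: float) -> float:
    """How many times faster the new run is than the baseline run."""
    return baseline_time / new_time


ppl_reduction_pct = 100 * relative_reduction(adamw_ppl, guard_ppl)
time_speedup = speedup(adamw_time_s, guard_time_s)

print(f"perplexity reduction: {ppl_reduction_pct:.1f}%")  # 18.7%
print(f"end-to-end speedup:   {time_speedup:.2f}x")       # 1.10x
```

Both results match the abstract: (13.21 - 10.74) / 13.21 ≈ 0.187 and 392.54 / 357.02 ≈ 1.10.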

Community

Paper author Paper submitter

LBW-Guard is a bounded training-control governance layer that sits above AdamW. It observes training telemetry and applies corrective control when destabilizing regimes form, which makes it distinct from gradient clipping. It is benchmarked on Qwen2.5-3B/7B/14B and TinyLlama-1B. In the 7B reference setting it lowers final perplexity by 18.7% with a 1.10x end-to-end speedup, and at LR=3e-3 it remains trainable (final perplexity 11.57) where AdamW degrades to 1885.24. A PyPI package and a live HF Space are available.
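This page does not spell out LBW-Guard's control law, so the following is only a toy illustration of the general idea of a bounded controller that watches telemetry and sits above the optimizer. Everything in it is an assumption made for illustration: the class name `BoundedLRGovernor`, the loss-spike heuristic, and every threshold are hypothetical and are not taken from the paper or its PyPI package.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class BoundedLRGovernor:
    """Hypothetical sketch, NOT the LBW-Guard algorithm.

    Watches the loss stream and scales the learning rate within
    [min_scale * base_lr, base_lr]. The optimizer update rule and the
    training objective are left untouched.
    """
    base_lr: float
    min_scale: float = 0.1    # lower bound on the LR multiplier
    spike_ratio: float = 1.5  # loss / EMA ratio treated as unstable
    backoff: float = 0.5      # multiplier applied on instability
    recovery: float = 1.1     # multiplier applied on healthy steps
    ema_decay: float = 0.9
    scale: float = 1.0
    loss_ema: Optional[float] = None

    def step(self, loss: float) -> float:
        """Consume one loss value and return the LR for the next update."""
        unstable = not math.isfinite(loss) or (
            self.loss_ema is not None
            and loss > self.spike_ratio * self.loss_ema
        )
        if unstable:
            # Back off, but never below the configured floor.
            self.scale = max(self.min_scale, self.scale * self.backoff)
        else:
            # Recover gradually, never above the base learning rate.
            self.scale = min(1.0, self.scale * self.recovery)
            # Only healthy steps feed the running loss estimate.
            self.loss_ema = loss if self.loss_ema is None else (
                self.ema_decay * self.loss_ema + (1 - self.ema_decay) * loss
            )
        return self.base_lr * self.scale


governor = BoundedLRGovernor(base_lr=1e-3)
losses = [2.0, 1.9, 1.8, 9.0, float("nan"), 1.8, 1.7]
lrs = [governor.step(loss) for loss in losses]
print(lrs)
```

In a PyTorch loop, the returned value could be written into each `optimizer.param_groups[i]["lr"]` before `optimizer.step()`. The contrast with gradient clipping in this sketch is that clipping rescales one step's gradient by its norm, whereas this controller reacts to the history of the loss and adjusts how the optimizer is run.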


