arxiv:2605.19008

Learn-by-Wire Training Control Governance: Bounded Autonomous Training Under Stress for Stability and Efficiency

Published on May 18

Submitted by Anis Radianis on May 21

Qluon

Authors:

Anis Radianis

Abstract

Learn-by-Wire Guard (LBW-Guard) enhances language model training stability and efficiency by providing bounded autonomous control over optimizer execution without altering the underlying training objective.

AI-generated summary

Modern language-model training is increasingly exposed to instability, degraded runs, and wasted compute, especially under aggressive learning-rate, scale, and runtime-stress conditions. This paper introduces Learn-by-Wire Guard (LBW-Guard), a bounded autonomous training-control governance layer that operates above AdamW. Rather than replacing the optimizer update rule, LBW-Guard observes training telemetry, interprets instability-sensitive regimes, and applies bounded control to optimizer execution while preserving fixed training objectives. We evaluate LBW-Guard in a Qwen2.5-centered stress-and-robustness suite using WikiText-103, with Qwen2.5-7B as the empirical anchor, model-size comparisons against Qwen2.5-3B and Qwen2.5-14B, learning-rate stress tests, gradient-clipping baselines, and a no-LoRA TinyLlama-1B full-parameter sanity check. In the 7B reference setting, LBW-Guard reduces final perplexity from 13.21 to 10.74, an 18.7% improvement, while reducing end-to-end time from 392.54s to 357.02s, a 1.10x speedup. Under stronger learning-rate stress, AdamW degrades to 1885.24 final perplexity at LR=3e-3 and 659.76 at LR=1e-3, whereas LBW-Guard remains trainable at 11.57 and 10.33, respectively. Gradient-clipping baselines do not reproduce this effect. These results support a scoped systems conclusion that stability-sensitive LLM training can benefit from a governance plane above the optimizer. LBW-Guard provides evidence that bounded runtime control can preserve productive compute under stress while remaining distinct from optimizer replacement and local gradient suppression.
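The headline percentages for the 7B reference setting follow directly from the raw figures quoted in the abstract. As a quick sanity check, the short sketch below recomputes them; the variable names are illustrative and not taken from the paper.

```python
# Figures quoted in the abstract for the Qwen2.5-7B reference setting.
ppl_adamw, ppl_guard = 13.21, 10.74            # final perplexity
time_adamw_s, time_guard_s = 392.54, 357.02    # end-to-end time in seconds

# Relative perplexity reduction, as a percentage of the AdamW baseline.
ppl_reduction_pct = (ppl_adamw - ppl_guard) / ppl_adamw * 100

# Speedup is the ratio of baseline time to LBW-Guard time.
speedup = time_adamw_s / time_guard_s

print(f"perplexity reduction: {ppl_reduction_pct:.1f}%")  # 18.7%
print(f"speedup: {speedup:.2f}x")                         # 1.10x
```

Both values match the abstract: 2.47 / 13.21 is about 18.7%, and 392.54 / 357.02 is about 1.10.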

Community

aradianis

Paper author · Paper submitter · about 5 hours ago

LBW-Guard is a bounded training-control governance layer above AdamW. It observes training telemetry and applies corrective control when destabilizing regimes form, which is distinct from gradient clipping. Benchmarked on Qwen2.5-3B/7B/14B and TinyLlama-1B: an 18.7% perplexity reduction and a 1.10x speedup in the 7B reference setting, and stable training at LR=3e-3, where AdamW degrades to a final perplexity of 1885.24. A PyPI package and a live HF Space are available.
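To make the idea of "bounded control above the optimizer" concrete, here is a minimal toy sketch in plain Python. It is not LBW-Guard's control law, which this page does not specify. It assumes, purely for illustration, that the telemetry is the training loss and that the control action is a clamped multiplier on the learning rate; the class name `BoundedLRGovernor` and all of its parameters are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class BoundedLRGovernor:
    """Toy governance layer: scales a base learning rate within fixed bounds.

    Hypothetical illustration only; not LBW-Guard's actual control law.
    """
    base_lr: float
    min_scale: float = 0.1    # lower bound on the LR multiplier
    spike_ratio: float = 1.5  # loss / windowed mean that counts as a spike
    backoff: float = 0.5      # multiplier applied when a spike is seen
    recovery: float = 1.1     # multiplier applied on a calm step
    window: int = 5           # number of recent losses kept as telemetry
    scale: float = 1.0        # current LR multiplier, always in [min_scale, 1]
    history: deque = field(init=False)

    def __post_init__(self):
        self.history = deque(maxlen=self.window)

    def observe(self, loss: float) -> float:
        """Ingest one loss value and return the learning rate for the next step."""
        if len(self.history) == self.window:
            mean = sum(self.history) / self.window
            if loss > self.spike_ratio * mean:
                self.scale *= self.backoff   # back off on a loss spike
            else:
                self.scale *= self.recovery  # recover slowly when calm
            # The control is bounded: never above the base LR, never below the floor.
            self.scale = min(1.0, max(self.min_scale, self.scale))
        self.history.append(loss)
        return self.base_lr * self.scale


governor = BoundedLRGovernor(base_lr=3e-3)
losses = [2.0, 2.0, 2.0, 2.0, 2.0, 6.0, 2.1, 2.0]  # synthetic losses with one spike
lrs = [governor.observe(x) for x in losses]
print(lrs)
```

In this toy, the loss function and the optimizer's update rule are never touched; only a bounded multiplier on the learning rate changes. That mirrors the separation the comment describes between a governance layer and the optimizer itself, although the real system may observe different telemetry and act on different controls.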


