# FLAS · Gemma-2-2B-IT
Steer Gemma toward any concept you can describe in words. "Talk like a pirate." "Respond as a noir detective." "Always reference places in Minnesota." "Frame everything as a musical performance." "Speak in programming terms." "Use mathematical notation." Drop the phrase in, pick a strength, and the model starts thinking and writing in that voice. No fine-tuning, no per-concept training, no contrastive data.
Hardware requirement: any GPU with 6 GB or more of VRAM. End-to-end interactive inference (base model + FLAS modules) peaks at ~5 GB VRAM.
This is the natural-language activation-steering checkpoint for google/gemma-2-2b-it, trained with FLAS (Flow-based Activation Steering). Where prior work like Golden Gate Claude had to lock in a single behavior in advance, FLAS learns a single concept-conditioned velocity field $v_\theta(h, t, c)$. At inference you hand it any natural-language concept $c$ and it produces the right intervention on the fly. The same checkpoint handles thousands of unseen concepts.
- Paper: https://arxiv.org/abs/2605.05892
- Code: https://github.com/flas-ai/FLAS
## How it works
FLAS learns a concept-conditioned velocity field $v_\theta(h, t, c)$ that transports an unsteered activation $h$ to a steered activation $h'$ by integrating a flow ODE:

$$\frac{dh_t}{dt} = v_\theta(h_t, t, c), \qquad h_0 = h, \qquad h' = h_T.$$
The flow time $T$ serves as a continuous steering-strength parameter; sampling $T \sim \mathrm{Uniform}[T_{\min}, T_{\max}]$ during training enables zero-shot strength control at inference. FLAS is the first learned steering method to consistently outperform in-context prompting on AxBench.
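Because strength control is just "how far along the flow to integrate," the mechanics reduce to a short numerical integration loop. A minimal sketch with simple Euler steps, using a toy constant velocity field in place of the learned $v_\theta$ (the field, dimensions, and step count here are illustrative, not taken from the checkpoint):

```python
import numpy as np

def steer(h, velocity, T, n_steps=8):
    """Transport activation h along dh/dt = velocity(h, t) from
    t=0 to t=T with Euler steps. T acts as steering strength:
    a larger T carries h further along the flow."""
    dt = T / n_steps
    h_t = h.copy()
    for i in range(n_steps):
        h_t = h_t + dt * velocity(h_t, i * dt)
    return h_t

# Toy stand-in for v_theta(h, t, c): a constant pull toward a
# fixed "concept direction" (illustration only).
concept_dir = np.array([1.0, 0.0])
v = lambda h, t: concept_dir

h = np.zeros(2)
weak = steer(h, v, T=0.3)    # mild steering  -> [0.3, 0.0]
strong = steer(h, v, T=1.0)  # strong steering -> [1.0, 0.0]
```

With a constant field the Euler solution is exact, so the same unsteered activation lands at different points along the flow purely as a function of $T$, which is the zero-shot strength control described above.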
## Files
| File | Description |
|---|---|
| `flas-gemma-2-2b-it.safetensors` | Flow function weights (97.6 M params, ~187 MB). |
| `config.json` | Architecture/training config consumed by the FLAS loader (`model_id`, `layer`, `num_blocks`, `n_steps`). |
The frozen concept encoder is not stored: at load time it shares the embedding and first two decoder layers with the base model in VRAM (no duplicate copies).
## Usage
These weights are consumed by the FLAS reference implementation. See the codebase for installation, loader, and the chat CLI: https://github.com/flas-ai/FLAS.
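For intuition on how such an intervention plugs into a transformer, here is a sketch of the hook mechanics: a forward hook on one layer replaces the unsteered hidden state with its flow-integrated counterpart. This is not the FLAS loader's actual API; the velocity field is a placeholder and `nn.Identity` stands in for a decoder layer.

```python
import torch
import torch.nn as nn

def make_steering_hook(velocity, T, n_steps=4):
    """Build a forward hook that Euler-integrates the flow ODE on
    the layer's output; the hook's return value replaces it."""
    def hook(module, inputs, output):
        h = output
        dt = T / n_steps
        for i in range(n_steps):
            h = h + dt * velocity(h, i * dt)
        return h
    return hook

layer = nn.Identity()                 # stand-in for a decoder layer
shift = torch.tensor([0.0, 1.0])      # toy "concept direction"
layer.register_forward_hook(make_steering_hook(lambda h, t: shift, T=0.5))

h_in = torch.zeros(2)
h_out = layer(h_in)                   # steered activation, [0.0, 0.5]
```

The real implementation instead loads the learned flow weights from the safetensors file and conditions the field on the encoded concept text; see the repository linked above for the supported entry points.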
## Citation
```bibtex
@article{flas2026,
  title={Beyond Steering Vector: Flow-based Activation Steering for Inference-Time Intervention},
  author={Zehao Jin and Ruixuan Deng and Junran Wang and Xinjie Shen and Chao Zhang},
  year={2026},
  eprint={2605.05892},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2605.05892},
}
```