Initial release: FLAS Gemma-2-9B-IT checkpoint
- README.md +57 -0
- config.json +10 -0
- flas-gemma-2-9b-it.safetensors +3 -0
README.md
ADDED
@@ -0,0 +1,57 @@
---
license: apache-2.0
base_model: google/gemma-2-9b-it
library_name: flas
tags:
- activation-steering
- flow-matching
- gemma-2
---

# FLAS — Gemma-2-9B-IT

**Steer Gemma toward any concept you can describe in words.** "Talk like a pirate." "Respond as a noir detective." "Always reference places in Minnesota." "Frame everything as a musical performance." "Speak in programming terms." "Use mathematical notation." Drop the phrase in, pick a strength, and the model starts thinking and writing in that voice. No fine-tuning, no per-concept training, no contrastive data.

This is the natural-language activation-steering checkpoint for `google/gemma-2-9b-it`, trained with **FLAS (Flow-based Activation Steering)**. Where prior work like [*Golden Gate Claude*](https://www.anthropic.com/news/golden-gate-claude) had to lock in a single behavior in advance, FLAS learns one concept-conditioned velocity field $v_\theta(h, t, c)$. At inference you hand it any natural-language concept $c$ and it produces the right intervention on the fly. The same checkpoint handles thousands of unseen concepts.

- 📄 Paper: <https://arxiv.org/abs/2605.05892>
- 💻 Code: <https://github.com/flas-ai/FLAS>

## How it works

FLAS learns a concept-conditioned velocity field $v_\theta(h, t, c)$ that transports an unsteered activation $h$ to a steered activation $h'$ by integrating a flow ODE:

$$h' = \varphi_T(h) = h + \int_0^T v_\theta\!\bigl(\varphi_t(h),\, t,\, c\bigr)\, dt$$

The flow time $T$ serves as a continuous steering-strength parameter; sampling $T \sim \mathrm{Uniform}[T_{\min}, T_{\max}]$ during training enables zero-shot strength control at inference. FLAS is the first learned steering method to consistently outperform in-context prompting on AxBench.

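In practice the integral is approximated with a few fixed Euler steps (this checkpoint uses `n_steps: 3`, see `config.json`). A minimal sketch of that loop, where `velocity_field` and `concept_emb` are hypothetical stand-ins for the flow function and the encoded concept text, not the actual FLAS API:

```python
import torch

def steer_activation(h: torch.Tensor, concept_emb: torch.Tensor,
                     velocity_field, T: float = 1.0, n_steps: int = 3) -> torch.Tensor:
    """Euler-integrate v_theta(h, t, c) from t=0 to t=T, so T acts as a
    continuous steering-strength knob (larger T = stronger steering)."""
    dt = T / n_steps
    t = 0.0
    for _ in range(n_steps):
        h = h + dt * velocity_field(h, t, concept_emb)  # one Euler step along the flow
        t += dt
    return h
```
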
## Files

| File | Description |
|---|---|
| `flas-gemma-2-9b-it.safetensors` | Flow function weights (255.6 M params, fp32, ~975 MB). |
| `config.json` | Architecture/training config consumed by the FLAS loader (`model_id`, `layer`, `num_blocks`, `n_steps`). |

The frozen concept encoder is **not** stored in this repository; it is rebuilt from the base model's first two layers at load time.

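Both files can be inspected directly. A quick sanity check (the key layout inside the state dict is the FLAS loader's concern and is not assumed here):

```python
import json
from safetensors.torch import load_file

# Read the loader config shipped alongside the weights.
with open("config.json") as f:
    cfg = json.load(f)
print(cfg["model_id"], cfg["layer"], cfg["n_steps"])  # google/gemma-2-9b-it 20 3

# Count parameters in the flow-function checkpoint.
state = load_file("flas-gemma-2-9b-it.safetensors")
n_params = sum(t.numel() for t in state.values())
print(f"{n_params / 1e6:.1f} M params")  # ~255.6 M fp32 -> ~975 MB on disk
```
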
## Hardware

End-to-end inference (Gemma-2-9B-IT bf16 + FlowFunction fp32 + ConceptEncoder fp32) uses about **24 GB peak VRAM** for 128-token generation: ~17.2 GB base model, ~1.0 GB flow function, ~4.9 GB concept encoder, with the remainder going to activations. A 24 GB GPU (RTX 3090 / 4090, A10G, L4) is the practical minimum.

## Usage

These weights are consumed by the FLAS reference implementation. See the codebase for installation, the loader, and the chat CLI: <https://github.com/flas-ai/FLAS>.

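For orientation, the intervention amounts to a forward hook at the configured layer (`layer: 20`) that swaps the residual-stream activation for the flow output. A hedged sketch, with `flow` and `concept_emb` as hypothetical stand-ins for objects the FLAS loader would provide; use the reference loader in practice:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-2-9b-it")
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it", torch_dtype=torch.bfloat16, device_map="auto"
)

def make_hook(flow, concept_emb, T):
    def hook(module, inputs, output):
        h = output[0]  # hidden states leaving the hooked decoder layer
        h_steered = flow(h.float(), concept_emb, T).to(h.dtype)  # integrate the flow ODE
        return (h_steered,) + output[1:]  # returned tuple replaces the layer output
    return hook

# handle = model.model.layers[20].register_forward_hook(make_hook(flow, concept_emb, T=1.0))
# ...generate as usual, then handle.remove() to unsteer.
```
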
## Citation

```bibtex
@article{flas2026,
  title={Beyond Steering Vector: Flow-based Activation Steering for Inference-Time Intervention},
  author={Zehao Jin and Ruixuan Deng and Junran Wang and Xinjie Shen and Chao Zhang},
  year={2026},
  eprint={2605.05892},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2605.05892}
}
```
config.json
ADDED
@@ -0,0 +1,10 @@
{
  "model_id": "google/gemma-2-9b-it",
  "layer": 20,
  "num_blocks": 1,
  "n_steps": 3,
  "freeze_concept_enc": true,
  "disable_cross_attn": false,
  "disable_self_attn": false,
  "disable_mlp": false
}
flas-gemma-2-9b-it.safetensors
ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:4750716c16c931ef97f1a48c9abf889093c81ee1ed7721f4c779b5498f720f71
size 1022261880