Improve model card for FAAST-Qwen2.5-3B-Instruct

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +34 -11
README.md CHANGED
@@ -1,12 +1,5 @@
1
  ---
2
- license: apache-2.0
3
  base_model: Qwen/Qwen2.5-3B-Instruct
4
- tags:
5
- - qwen2.5
6
- - test-time-learning
7
- - fast-weights
8
- - adaptation
9
- - multilingual
10
  datasets:
11
  - OpenWebText2
12
  - IWSLT2017
@@ -14,25 +7,55 @@ language:
14
  - en
15
  - de
16
  - fr
17
  ---
18
 
19
  # FAAST-Qwen2.5-3B-Instruct
20
 
21
- `faast-Qwen2.5-3B-Instruct` is an extension of `Qwen2.5-3B-Instruct` equipped with the FAAST module. The original Qwen2.5-3B-Instruct parameters are frozen, while only the FAAST readout projections are trained.
22
 
23
- The model is designed for efficient test-time learning through fast weights, enabling adaptation without backpropagation (gradient descent).
24
 
25
  ## Model Description
26
 
27
  FAAST augments Qwen2.5-3B-Instruct with fast-weight adaptation modules that support supervised learning during inference. During FAAST pretraining, all backbone LLM parameters remain frozen, and only lightweight FAAST readout projections are optimized.
28
 
29
  This design enables:
30
-
31
  - Test-time learning without backpropagation
32
  - Efficient adaptation with low memory overhead
33
  - Fast adaptation to downstream tasks
34
  - Improved few-shot and full-data performance
35

36
  ## Training Details
37
 
38
  - **Base model:** Qwen2.5-3B-Instruct
@@ -82,7 +105,7 @@ BLEU scores on IWSLT2017. Bold scores indicate statistical significance at `p <
82
 
83
  ## Citation
84
 
85
- If you use this model, please cite the corresponding [FAAST paper](https://arxiv.org/pdf/2605.04651) or [project](https://github.com/baoguangsheng/faast).
86
 
87
  ```bibtex
88
  @article{bao2026faast,
 
1
  ---
2
  base_model: Qwen/Qwen2.5-3B-Instruct
3
  datasets:
4
  - OpenWebText2
5
  - IWSLT2017
 
7
  - en
8
  - de
9
  - fr
10
+ license: apache-2.0
11
+ library_name: transformers
12
+ pipeline_tag: text-generation
13
+ tags:
14
+ - qwen2.5
15
+ - test-time-learning
16
+ - fast-weights
17
+ - adaptation
18
+ - multilingual
19
  ---
20
 
21
  # FAAST-Qwen2.5-3B-Instruct
22
 
23
+ `faast-Qwen2.5-3B-Instruct` is an extension of `Qwen2.5-3B-Instruct` equipped with the FAAST module as presented in [FAAST: Forward-Only Associative Learning via Closed-Form Fast Weights for Test-Time Supervised Adaptation](https://huggingface.co/papers/2605.04651).
24
 
25
+ The original Qwen2.5-3B-Instruct parameters are frozen, while only the FAAST readout projections are trained. The model is designed for efficient test-time learning through fast weights, enabling adaptation without backpropagation (gradient descent).
26
+
27
+ The official implementation is available at [https://github.com/baoguangsheng/faast](https://github.com/baoguangsheng/faast).
28
 
29
  ## Model Description
30
 
31
  FAAST augments Qwen2.5-3B-Instruct with fast-weight adaptation modules that support supervised learning during inference. During FAAST pretraining, all backbone LLM parameters remain frozen, and only lightweight FAAST readout projections are optimized.
32
 
33
  This design enables:
 
34
  - Test-time learning without backpropagation
35
  - Efficient adaptation with low memory overhead
36
  - Fast adaptation to downstream tasks
37
  - Improved few-shot and full-data performance
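
As a rough illustration of what learning without backpropagation can look like, the sketch below fits a linear associative memory in closed form using ridge regression. It is a generic NumPy example and not the FAAST implementation: the function name `fit_fast_weights`, the toy dimensions, and the `ridge` value are all assumptions made for illustration.

```python
import numpy as np


def fit_fast_weights(keys, values, ridge=1e-4):
    """Solve W = (K^T K + ridge * I)^-1 K^T V in one step, with no gradient updates.

    Hypothetical helper for illustration only; not part of the FAAST API.
    """
    dim = keys.shape[1]
    gram = keys.T @ keys + ridge * np.eye(dim)
    return np.linalg.solve(gram, keys.T @ values)


# Toy "labeled examples": keys paired with values produced by a hidden linear map.
rng = np.random.default_rng(0)
true_map = rng.normal(size=(8, 4))
keys = rng.normal(size=(32, 8))
values = keys @ true_map

# "Learning" is a single linear solve over the examples.
fast_weights = fit_fast_weights(keys, values)

# Use the fitted weights on an unseen query and compare with the hidden map.
query = rng.normal(size=(1, 8))
prediction = query @ fast_weights
error = float(np.abs(prediction - query @ true_map).max())
```

The point of the sketch is that the weights come from one closed-form solve over the labeled examples rather than from iterative gradient steps, which is the general idea behind adapting with fast weights at inference time.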
38
 
39
+ ## Usage
40
+
41
+ These models can learn at test time. Below is a sample snippet showing how to use the model with `transformers` (requires `trust_remote_code=True`):
42
+
43
+ ```python
44
+ from transformers import AutoTokenizer, AutoModelForCausalLM
45
+
46
+ model_path = "gshbao/faast-Qwen2.5-3B-Instruct"
47
+ tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
48
+ model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
49
+
50
+ # Labeled examples for adaptation
51
+ fewshot_samples = ['sample 1', 'sample 2', ...]
52
+ inputs = tokenizer(fewshot_samples, return_tensors="pt", padding=True)
53
+
54
+ model.reset_projection() # clear existing fast weights
55
+ model.learn(**inputs) # learn new fast weights from labeled examples
56
+ model.generate(...) # do the task using the learned fast weights
57
+ ```
58
+
59
  ## Training Details
60
 
61
  - **Base model:** Qwen2.5-3B-Instruct
 
105
 
106
  ## Citation
107
 
108
+ If you use this model, please cite the corresponding [FAAST paper](https://huggingface.co/papers/2605.04651) or [project](https://github.com/baoguangsheng/faast).
109
 
110
  ```bibtex
111
  @article{bao2026faast,