liucl26 committed on
Commit c2849e7 · verified · 1 Parent(s): 6cc810f

Update README.md

Files changed (1)
  1. README.md +9 -7
README.md CHANGED
@@ -8,7 +8,9 @@ tags:
  - hrm
  - hierarchical-reasoning
  - prefix-lm
- - base-model
  ---

  ![HRM-Text banner](banner.jpg)
@@ -21,20 +23,20 @@ tags:

  # HRM-Text-1B

- A 1B-parameter base language model built on the **Hierarchical Reasoning Model (HRM)** architecture, trained from scratch on a curated text corpus by Sapient Intelligence.

- HRM is a dual-timescale recurrent architecture: two Transformer modules (H = high-level / slow, L = low-level / fast) iterate over the same input embeddings for `H_cycles × L_cycles` steps, with additive state injection (`z_L + z_H`). This gives effectively unbounded compute depth at bounded parameter count.

  ## Disclaimer

- This is a **base** model. It is pre-trained on a PrefixLM objective with condition prefix tokens and has **not** been instruction-tuned, RLHF'd, or otherwise post-trained. For any serious downstream use we recommend post-training (SFT and/or RL) on task-specific data; the base checkpoint is meant as a starting point, not a finished assistant.

- Practical guidance for prompting the raw base model:

  - **NLP tasks (classification, extraction, structured output, short-form QA)**: use the `direct` condition with 2–8 few-shot in-context examples. `direct` + few-shot is the strongest zero-extra-training setup we have measured; pure zero-shot is noticeably weaker.
  - **Reasoning / math / open-ended generation**: use the **composite condition** `synth,cot`. This is *one* composite prefix, not two alternatives: at tokenization time the comma-separated tags are mapped to their prefix tokens and concatenated, in order, into a single prefix block. So `synth,cot` produces the two-token prefix `<|quad_end|><|object_ref_end|>` (synth first, then cot), wrapped in the usual `<|im_start|>` … `<|im_end|>` envelope. Under this composite the model exhibits some chain-of-thought / instruct-like behavior, enough to answer many zero-shot math and reasoning prompts in a step-by-step style, but quality is uneven and below an instruction-tuned model of comparable size. Treat this "instruct" ability as a side effect of the pre-training mix, not a guaranteed capability.

- The four single tags and their prefix tokens (for reference; you can compose any subset, comma-separated, in the order you want them emitted):

  - `direct` → `<|object_ref_start|>`: direct answer, no CoT
  - `cot` → `<|object_ref_end|>`: chain-of-thought
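
As a rough illustration of the composition rule described in this hunk, the sketch below maps a comma-separated condition string to its concatenated prefix tokens. It is a sketch under stated assumptions, not the tokenizer's actual code: `CONDITION_TOKENS` and `condition_prefix` are hypothetical names, only the three tag-to-token pairs stated in the text are included, and the `<|im_start|>` … `<|im_end|>` envelope is left out because its exact layout is not given here.

```python
# Hypothetical lookup table: only the three tag-to-token pairs that the
# README text states explicitly. The card mentions a fourth tag that is
# not visible in this excerpt, so it is deliberately not guessed here.
CONDITION_TOKENS = {
    "direct": "<|object_ref_start|>",
    "cot": "<|object_ref_end|>",
    "synth": "<|quad_end|>",
}


def condition_prefix(condition: str) -> str:
    """Concatenate prefix tokens for a comma-separated condition, in order."""
    # Split on commas and ignore surrounding whitespace or empty entries.
    tags = [tag.strip() for tag in condition.split(",") if tag.strip()]
    unknown = [tag for tag in tags if tag not in CONDITION_TOKENS]
    if unknown:
        raise KeyError(f"unknown condition tag(s): {unknown}")
    # Order matters: tokens are emitted in the order the tags were written.
    return "".join(CONDITION_TOKENS[tag] for tag in tags)


if __name__ == "__main__":
    print(condition_prefix("synth,cot"))  # <|quad_end|><|object_ref_end|>
```

Reversing the tag order reverses the emitted tokens, which is why `synth,cot` and `cot,synth` should be treated as different prefixes.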
@@ -43,7 +45,7 @@ The four single tags and their prefix tokens (for reference; you can compose any

  ## Requirements

- The `hrm_text` model class has been merged into Transformers `main`. The PyPI release containing it may still be in flight; until then, install Transformers directly from the upstream `main` branch:

  ```bash
  pip install --upgrade "git+https://github.com/huggingface/transformers.git@main"
 
  - hrm
  - hierarchical-reasoning
  - prefix-lm
+ - pre-alignment
+ - non-chat
+ - non-instruction-tuned
  ---

  ![HRM-Text banner](banner.jpg)
 

  # HRM-Text-1B

+ A 1B-parameter language model checkpoint built on the **Hierarchical Reasoning Model (HRM)** architecture, trained from scratch on a curated text corpus by Sapient Intelligence.

+ HRM is a dual-timescale recurrent architecture: two Transformer modules (H = high-level / slow, L = low-level / fast) iterate over the same input embeddings for `H_cycles × (L_cycles + 1)` steps, with additive state injection (`z_L + z_H`). This gives effectively unbounded compute depth at bounded parameter count.
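
To make the step count concrete, here is a toy sketch of one way the nested loop could be arranged. It assumes that each H cycle runs `L_cycles` fast updates followed by one slow update, which is one reading that yields `H_cycles × (L_cycles + 1)` module applications. The scalar states, the `l_step` / `h_step` callables, the zero initialisation, and the exact placement of the `z_L + z_H` injection are illustrative assumptions, not the model's actual implementation.

```python
from typing import Callable, Tuple


def hrm_toy_forward(
    x: float,
    h_cycles: int,
    l_cycles: int,
    l_step: Callable[[float, float], float],
    h_step: Callable[[float], float],
) -> Tuple[float, int]:
    """Run a toy dual-timescale recurrence and count module applications."""
    z_h = 0.0  # slow (high-level) state; zero init is an assumption
    z_l = 0.0  # fast (low-level) state; zero init is an assumption
    steps = 0
    for _ in range(h_cycles):
        for _ in range(l_cycles):
            # Fast module: sees the input plus the additively combined states.
            z_l = l_step(z_l + z_h, x)
            steps += 1
        # Slow module: updates once per H cycle from the combined state.
        z_h = h_step(z_l + z_h)
        steps += 1
    return z_h, steps


if __name__ == "__main__":
    # Stand-in update rules; a real model would apply Transformer blocks.
    _, n = hrm_toy_forward(
        x=1.0,
        h_cycles=2,
        l_cycles=3,
        l_step=lambda state, inp: 0.5 * state + inp,
        h_step=lambda state: 0.5 * state,
    )
    print(n)  # 2 * (3 + 1) = 8
```

The same two callables are reused on every iteration, so raising either cycle count adds compute steps without adding parameters.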

  ## Disclaimer

+ This is a **pre-alignment** model checkpoint, not a chat or instruction-following assistant. It is pre-trained on a PrefixLM objective with condition prefix tokens and has **not** been multi-turn dialogue tuned, long-context adapted, instruction-tuned, RLHF-trained, or otherwise aligned for assistant-style use. If you want to use HRM-Text like a chat model, you should perform further alignment, such as SFT and/or RL, on task-specific data. This checkpoint is meant as a starting point, not a finished assistant.

+ Practical guidance for prompting the raw checkpoint:

  - **NLP tasks (classification, extraction, structured output, short-form QA)**: use the `direct` condition with 2–8 few-shot in-context examples. `direct` + few-shot is the strongest zero-extra-training setup we have measured; pure zero-shot is noticeably weaker.
  - **Reasoning / math / open-ended generation**: use the **composite condition** `synth,cot`. This is *one* composite prefix, not two alternatives: at tokenization time the comma-separated tags are mapped to their prefix tokens and concatenated, in order, into a single prefix block. So `synth,cot` produces the two-token prefix `<|quad_end|><|object_ref_end|>` (synth first, then cot), wrapped in the usual `<|im_start|>` … `<|im_end|>` envelope. Under this composite the model exhibits some chain-of-thought / instruct-like behavior, enough to answer many zero-shot math and reasoning prompts in a step-by-step style, but quality is uneven and below an instruction-tuned model of comparable size. Treat this "instruct" ability as a side effect of the pre-training mix, not a guaranteed capability.

+ The four single condition tags and their assigned tokenizer special tokens (token names are legacy implementation details; you can compose any subset, comma-separated, in the order you want them emitted):

  - `direct` → `<|object_ref_start|>`: direct answer, no CoT
  - `cot` → `<|object_ref_end|>`: chain-of-thought
 

  ## Requirements

+ Use a Transformers build that includes the `hrm_text` model class. If your installed release does not include it yet, install Transformers directly from the upstream `main` branch:

  ```bash
  pip install --upgrade "git+https://github.com/huggingface/transformers.git@main"