# HF Template

Populate this folder on the training machine with a working HF model snapshot
(Qwen3 + Summary Attention variant) **before** running
`examples/pretrain/convert/convert_muse_to_hf.sh`.

## Expected contents

| File | Purpose |
|---|---|
| `config.json` | HF config with `summary_*` fields matching your trained model |
| `generation_config.json` | Default generation settings |
| `tokenizer.json` / `tokenizer_config.json` / `special_tokens_map.json` | Tokenizer |
| `vocab.json` / `merges.txt` | Tokenizer vocab (if applicable) |
| `modeling_qwen3*.py` | HF-compatible modeling code with SA support |
| `summary_context.py` | Helper module imported by the modeling code |

Only the **weights** come from the Muse DCP. The files listed above are copied
verbatim into `<OUTPUT_DIR>/<STEP>/hf/` by the convert script.
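
Because a missing template file would otherwise only surface after conversion, a quick pre-flight check of the folder can help. The following is a minimal sketch and not part of the repository: `check_hf_template` is a hypothetical helper, and it assumes that all three tokenizer files are required while `vocab.json` / `merges.txt` are optional.

```bash
# Hypothetical pre-flight helper (not shipped with the convert script).
# Runs in a subshell so its variables do not leak into the caller.
check_hf_template() (
    dir="$1"
    missing=0

    # Files the table lists without an "if applicable" note (assumed required).
    for f in config.json generation_config.json tokenizer.json \
             tokenizer_config.json special_tokens_map.json summary_context.py; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f" >&2
            missing=1
        fi
    done

    # At least one modeling_qwen3*.py file must be present.
    found=0
    for m in "$dir"/modeling_qwen3*.py; do
        if [ -f "$m" ]; then
            found=1
        fi
    done
    if [ "$found" -eq 0 ]; then
        echo "missing: modeling_qwen3*.py" >&2
        missing=1
    fi

    # config.json should contain at least one key starting with summary_.
    if [ -f "$dir/config.json" ] && ! grep -q '"summary_' "$dir/config.json"; then
        echo "config.json has no summary_* fields" >&2
        missing=1
    fi

    # Exit status of the function: 0 if nothing was flagged, 1 otherwise.
    [ "$missing" -eq 0 ]
)
```

Calling `check_hf_template examples/pretrain/hf_template` returns a non-zero status and prints one line per problem to stderr. It only checks that the files exist, not that the `summary_*` values match the trained model.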

## Usage

```bash
bash examples/pretrain/convert/convert_muse_to_hf.sh \
    /path/to/muse_outputs/1b6_sa_hybrid_8k \
    global_step5000 \
    examples/pretrain/hf_template
```
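
After conversion, the verbatim copy can be spot-checked by comparing the template against the output folder byte for byte. This is a sketch under stated assumptions: `verify_hf_copy` is a hypothetical helper, only `*.json`, `*.txt`, and `*.py` files are compared (the file types in the table), and the output path must be supplied by the caller.

```bash
# Hypothetical post-conversion check (not part of the convert script).
# Usage: verify_hf_copy <template_dir> <output_hf_dir>
verify_hf_copy() (
    template="$1"
    out="$2"
    status=0

    # Compare only the file types listed in the table (assumption).
    for src in "$template"/*.json "$template"/*.txt "$template"/*.py; do
        # Skip patterns that matched nothing.
        [ -f "$src" ] || continue
        name=$(basename "$src")
        # cmp -s exits non-zero if the copy differs or does not exist.
        if ! cmp -s "$src" "$out/$name" 2>/dev/null; then
            echo "differs or missing: $name" >&2
            status=1
        fi
    done

    # Exit status of the function: 0 if every compared file is identical.
    [ "$status" -eq 0 ]
)
```

If the first two arguments of the command above correspond to `<OUTPUT_DIR>` and `<STEP>`, the second argument here would be `/path/to/muse_outputs/1b6_sa_hybrid_8k/global_step5000/hf`. That mapping is an assumption, so check it against the script.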