# HF Template
Populate this folder on the training machine with a working HF model snapshot
(Qwen3 + Summary Attention variant) **before** running
`examples/pretrain/convert/convert_muse_to_hf.sh`.
## Expected contents
| File | Purpose |
|---|---|
| `config.json` | HF config with `summary_*` fields matching your trained model |
| `generation_config.json` | Default generation settings |
| `tokenizer.json` / `tokenizer_config.json` / `special_tokens_map.json` | Tokenizer |
| `vocab.json` / `merges.txt` | Tokenizer vocab (if applicable) |
| `modeling_qwen3*.py` | HF-compatible modeling code with SA support |
| `summary_context.py` | Helper module imported by the modeling code |
Only the **weights** come from the Muse DCP — everything else above is copied
verbatim into `<OUTPUT_DIR>/<STEP>/hf/` by the convert script.
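
Because the convert script copies these files as-is, a quick check that the folder is populated can catch a missing file before conversion starts. The sketch below is illustrative only: `check_hf_template` is a hypothetical helper, not part of the repository, and it assumes every file the table lists without an "if applicable" note is required.

```bash
# Sketch of a pre-flight check for the template folder.
# check_hf_template is a hypothetical helper, not shipped with the repo.
check_hf_template() (
  dir="$1"
  missing=0

  # Assumption: files listed in the table without "if applicable" are required.
  for f in config.json generation_config.json tokenizer.json \
           tokenizer_config.json special_tokens_map.json summary_context.py; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done

  # The modeling file is given as a glob, so accept any single match.
  set -- "$dir"/modeling_qwen3*.py
  if [ ! -f "$1" ]; then
    echo "missing: modeling_qwen3*.py" >&2
    missing=1
  fi

  # Exit status 0 means all checked files are present, 1 means something is missing.
  exit "$missing"
)
```

Running `check_hf_template examples/pretrain/hf_template` before the convert script prints each missing file to stderr and returns a non-zero status if anything is absent. It does not validate file contents, such as whether the `summary_*` fields in `config.json` match the trained model.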
## Usage
```bash
bash examples/pretrain/convert/convert_muse_to_hf.sh \
/path/to/muse_outputs/1b6_sa_hybrid_8k \
global_step5000 \
examples/pretrain/hf_template
```