HF Template
Populate this folder on the training machine with a working HF model snapshot
(Qwen3 + Summary Attention variant) before running
examples/pretrain/convert/convert_muse_to_hf.sh.
Expected contents
| File | Purpose |
|---|---|
| `config.json` | HF config with `summary_*` fields matching your trained model |
| `generation_config.json` | Default generation settings |
| `tokenizer.json` / `tokenizer_config.json` / `special_tokens_map.json` | Tokenizer |
| `vocab.json` / `merges.txt` | Tokenizer vocab (if applicable) |
| `modeling_qwen3*.py` | HF-compatible modeling code with SA support |
| `summary_context.py` | Helper module imported by the modeling code |
Only the weights come from the Muse DCP. Every other file listed above is copied
verbatim into `<OUTPUT_DIR>/<STEP>/hf/` by the convert script.
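Because the convert script copies these files as-is, a missing file in the template only shows up later, when the converted model is loaded. A minimal pre-flight sketch is shown below. The function name `check_hf_template` is hypothetical and not part of this repository, and the choice of which files count as required is an assumption drawn from the table above (`vocab.json` and `merges.txt` are skipped because they are optional).

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with the repository.
# Returns 0 if the template folder holds the files listed in the table, 1 otherwise.
check_hf_template() {
  local dir="$1" missing=0 f

  # Files named explicitly in the table (assumed required).
  for f in config.json generation_config.json tokenizer.json \
           tokenizer_config.json special_tokens_map.json summary_context.py; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $f" >&2
      missing=1
    fi
  done

  # The modeling code is listed only as a glob, so accept any match.
  if ! compgen -G "$dir/modeling_qwen3*.py" > /dev/null; then
    echo "missing: modeling_qwen3*.py" >&2
    missing=1
  fi

  return "$missing"
}
```

After sourcing the function, `check_hf_template examples/pretrain/hf_template` prints each missing file to stderr and returns a non-zero status if anything is absent.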
Usage
bash examples/pretrain/convert/convert_muse_to_hf.sh \
/path/to/muse_outputs/1b6_sa_hybrid_8k \
global_step5000 \
examples/pretrain/hf_template
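After a conversion run, the "copied verbatim" behaviour can be spot-checked by comparing the template against the output folder byte for byte. The sketch below is an assumption-laden illustration: `verify_hf_copy` is a hypothetical name, and it assumes every regular file at the top level of the template is copied (if the convert script skips some files, such as a README, those will be reported as missing).

```bash
#!/usr/bin/env bash
# Hypothetical helper, not shipped with the repository.
# Compares each top-level file in the template with its copy in the output folder.
verify_hf_copy() {
  local template="$1" out="$2" status=0 f name

  for f in "$template"/*; do
    [ -f "$f" ] || continue            # skip subdirectories and unmatched globs
    name="$(basename "$f")"
    # cmp -s is silent and returns non-zero if the files differ or the copy is absent.
    if ! cmp -s "$f" "$out/$name"; then
      echo "differs or missing: $name" >&2
      status=1
    fi
  done

  return "$status"
}
```

For the command above, the output folder would be `/path/to/muse_outputs/1b6_sa_hybrid_8k/global_step5000/hf`, following the `<OUTPUT_DIR>/<STEP>/hf/` layout, so the call would be `verify_hf_copy examples/pretrain/hf_template /path/to/muse_outputs/1b6_sa_hybrid_8k/global_step5000/hf`.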