Spaces:
Sleeping
Sleeping
| title: Qwen2.5-7B-Instruct Borg Merge v1 | |
| emoji: 🤖 | |
| colorFrom: indigo | |
| colorTo: purple | |
| sdk: gradio | |
| sdk_version: 5.12.0 | |
| python_version: "3.12" | |
| app_file: app.py | |
| hardware: zero-a10g | |
| pinned: false | |
| license: apache-2.0 | |
| models: | |
| - Optitransfer/Qwen2.5-7B-Instruct-borg-merge-v1 | |
| short_description: "Chat with a 9-model cross-family merged checkpoint" | |
| # Qwen2.5-7B-Instruct -- Borg Merge v1 | |
| Chat with a model created by merging **9 models from 4 architecture families** (Llama, Phi, NeoX, OPT) into a single checkpoint using training-free cross-family weight merging. | |
| **Headline results**: +3.3 pp GSM8K, +3.2 pp ARC-Challenge, +2.6 pp IFEval over the unmerged anchor -- no fine-tuning and no distillation. | |
| [Model card](https://huggingface.co/Optitransfer/Qwen2.5-7B-Instruct-borg-merge-v1) | [Paper (SSRN)](https://ssrn.com/abstract=6545518) | [crdt-merge](https://github.com/mgillr/crdt-merge) | [Deep-dive article](https://medium.com/@rgillespie83/we-merged-9-models-from-4-architecture-families-into-one-and-it-beats-the-anchor-on-real-e6537dfa9252) | |