This is a test checkpoint for an A/B comparison of the moe_disable_router feature, which passes all calibration data to every expert.
Calibration used 403 samples (116K tokens): some from c4, the rest covering reasoning and coding.
This is the A model: it was quantized with moe_disable_router=False (i.e. only routed experts receive calibration data).
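To make the difference between the two settings concrete, here is a minimal toy sketch in plain Python. It is not the quantization library's code: the function `tokens_per_expert`, the random router scores, and the sizes (8 experts, top-2 routing, 1000 tokens) are all assumptions chosen for illustration and do not reflect this model's actual configuration. It only counts how many calibration tokens each expert would see with and without routing.

```python
import random


def tokens_per_expert(router_scores, top_k, disable_router):
    """Count how many calibration tokens each expert would see.

    router_scores: one list of per-expert scores for each token (toy data).
    top_k: number of experts selected per token when routing is active.
    disable_router: if True, every token is sent to every expert.
    """
    num_experts = len(router_scores[0])
    counts = [0] * num_experts
    for scores in router_scores:
        if disable_router:
            # Hypothetical "B" behaviour: all experts get the token.
            chosen = range(num_experts)
        else:
            # Hypothetical "A" behaviour: only the top_k highest-scoring experts.
            ranked = sorted(range(num_experts), key=lambda e: scores[e], reverse=True)
            chosen = ranked[:top_k]
        for expert in chosen:
            counts[expert] += 1
    return counts


# Toy sizes, assumed for illustration only.
NUM_TOKENS, NUM_EXPERTS, TOP_K = 1000, 8, 2

random.seed(0)
scores = [[random.random() for _ in range(NUM_EXPERTS)] for _ in range(NUM_TOKENS)]

routed = tokens_per_expert(scores, TOP_K, disable_router=False)
all_experts = tokens_per_expert(scores, TOP_K, disable_router=True)

print("routed only:", routed)
print("router disabled:", all_experts)
```

With routing active, each expert sees only a fraction of the tokens, and rarely selected experts may be calibrated on very few samples. With the router disabled, every expert sees the full calibration set, including tokens it would never receive at inference time. That trade-off is what the A/B comparison is meant to probe.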
Here is a side-by-side comparison for the aquarium prompt: https://avtc.github.io/aquarium-side-by-side/ (click to open a webpage)
The prompt:
Make an html animation of fishes in an aquarium. The aquarium is pretty, the fishes vary in colors and sizes and swim realistically. You can left click to place a piece of fish food in aquarium. Each fish chases a food piece closest to it, trying to eat it. Once there are no more food pieces, fishes resume swimming as usual. Enclose in ```html and ``` for proper md rendering
Model: avtc/Qwen3-Coder-30B-A3B-Instruct-GPTQODEL-W4A16-TEST-A
Base model: Qwen/Qwen3-Coder-30B-A3B-Instruct