This is a test checkpoint for an A/B comparison of the moe_disable_router feature, which, when enabled, passes all calibration data to every expert. Calibration used 403 samples (116K tokens), some from c4 and the rest covering reasoning and coding. This is model A: it was quantized with moe_disable_router=False (i.e. only the routed experts receive calibration data).
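
To illustrate what the moe_disable_router setting described above changes, here is a minimal toy sketch in plain Python. It is not the quantizer's actual code. The function name, the expert count, the top-k value, and the random stand-in for the router are all assumptions made for illustration. It only counts how many calibration tokens each expert would see in the two modes.

```python
import random


def calibration_tokens_per_expert(num_tokens, num_experts, top_k,
                                  disable_router, seed=0):
    """Toy model: count the calibration tokens each expert receives.

    Hypothetical helper for illustration only, not part of any library.
    """
    rng = random.Random(seed)
    counts = [0] * num_experts
    for _ in range(num_tokens):
        if disable_router:
            # Router bypassed: every expert receives every token.
            chosen = range(num_experts)
        else:
            # Stand-in for a learned router (assumption): random scores,
            # keep the top_k highest-scoring experts for this token.
            scores = [rng.random() for _ in range(num_experts)]
            ranked = sorted(range(num_experts),
                            key=scores.__getitem__, reverse=True)
            chosen = ranked[:top_k]
        for expert in chosen:
            counts[expert] += 1
    return counts


# Toy sizes, not the real model's configuration.
routed = calibration_tokens_per_expert(1000, 8, 2, disable_router=False)
unrouted = calibration_tokens_per_expert(1000, 8, 2, disable_router=True)
print("router enabled :", routed)
print("router disabled:", unrouted)
```

With the router enabled, each expert sees only the tokens routed to it, so rarely selected experts are calibrated on fewer tokens. With the router disabled, every expert sees the full calibration set.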

Here is a side-by-side comparison for the aquarium prompt: https://avtc.github.io/aquarium-side-by-side/ (click to open a webpage)

The prompt:

Make an html animation of fishes in an aquarium. The aquarium is pretty, the fishes vary in colors and sizes and swim realistically. You can left click to place a piece of fish food in aquarium. Each fish chases a food piece closest to it, trying to eat it. Once there are no more food pieces, fishes resume swimming as usual. Enclose in ```html and ``` for proper md rendering

Safetensors

Model size: 31B params

Tensor type: BF16, I32

Model tree for avtc/Qwen3-Coder-30B-A3B-Instruct-GPTQODEL-W4A16-TEST-A

Base model: Qwen/Qwen3-Coder-30B-A3B-Instruct

Quantized (135): this model