Qwen3.5-122B-A10B
Part of the MINT collection: quantized versions of Qwen3.5-122B-A10B at multiple budget targets (MLX & GGUF).
Mixed-precision quantized version of Qwen/Qwen3.5-122B-A10B, optimised by baa.ai using a proprietary Black Sheep AI method.
Bit-widths are allocated per tensor via sensitivity analysis and budget-constrained optimisation; no calibration data is required.
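The actual allocation procedure is proprietary, but the general idea of budget-constrained per-tensor bit allocation can be illustrated with a minimal greedy sketch (all tensor names, sizes, and sensitivity scores below are made up for illustration): start every tensor at a low bit-width, then spend the remaining bit budget on the most sensitive tensors first.

```python
def allocate_bits(tensors, budget_bits, low=2, high=6):
    """Toy greedy allocator, NOT the proprietary method.

    tensors: list of (name, num_params, sensitivity) tuples.
    budget_bits: target average bits per weight across all tensors.
    Returns {name: assigned_bits}.
    """
    total_params = sum(n for _, n, _ in tensors)
    bits = {name: low for name, _, _ in tensors}   # everyone starts at the floor
    used = low * total_params                      # bits spent so far
    # Visit tensors from most to least sensitive, upgrading one bit at a time
    # while the average-bit budget still has room.
    for name, n, _ in sorted(tensors, key=lambda t: -t[2]):
        while bits[name] < high and used + n <= budget_bits * total_params:
            bits[name] += 1
            used += n
    return bits

# Hypothetical example: a sensitive embedding and two tolerant FFN tensors.
layers = [
    ("embed", 1_000, 0.9),   # high sensitivity -> gets the most bits
    ("ffn_1", 4_000, 0.2),
    ("ffn_2", 4_000, 0.1),
]
alloc = allocate_bits(layers, budget_bits=3.0)
avg = sum(alloc[n] * p for n, p, _ in layers) / sum(p for _, p, _ in layers)
print(alloc, round(avg, 3))  # the average stays at or under the 3.0 budget
```

The shape of the result matches the model card's claim: with a 3.0-bit average budget, sensitive tensors end up well above 3 bits and tolerant ones below it.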
This release (RAM 48GB target):

| Metric | Value |
|---|---|
| Size | 46 GB |
| Average bits per weight | 3.0 |
| WikiText-2 perplexity (median) | 5.8212 |
Comparison across budget targets and a uniform 4-bit baseline (WikiText-2 perplexity, median):

| Model | Size | PPL (median) |
|---|---|---|
| Uniform 4-bit | 65 GB | 5.605 |
| RAM 48GB (this model) | 46 GB | 5.821 |
| RAM 60GB | 57 GB | 5.500 |
| RAM 100GB | 93 GB | 5.477 |
| RAM 140GB | 126 GB | pending |
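As a rough sanity check on the size column, quantized file size follows from total parameter count times average bits per weight; all 122B weights are stored even though (per the A10B suffix) only about 10B are active per token. The estimate below ignores quantization metadata such as per-group scales and zero points, which is presumably why the real files run somewhat larger.

```python
def quantized_size_gb(num_params: float, avg_bits: float) -> float:
    """Approximate quantized weight size in decimal GB: params * bits / 8 bytes.

    Ignores per-group scales/zero-points and any tensors kept at higher
    precision, so actual files come out somewhat larger than this.
    """
    return num_params * avg_bits / 8 / 1e9

print(round(quantized_size_gb(122e9, 3.0), 1))  # ~45.8, close to the 46 GB row
print(round(quantized_size_gb(122e9, 4.0), 1))  # ~61.0; the 65 GB 4-bit file includes overhead
```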
Usage with mlx-lm:

```python
from mlx_lm import load, generate

# Downloads the quantized weights from the Hugging Face Hub on first use.
model, tokenizer = load("baa-ai/Qwen3.5-122B-A10B-RAM-48GB-MLX")

response = generate(model, tokenizer, prompt="Hello!", max_tokens=256)
print(response)
```
Quantized by baa.ai. Base model: Qwen/Qwen3.5-122B-A10B.