MLX-Qwopus3.5-9B-Coder-oQ4-fp16-mtp

oQ4 quantized MLX release of Qwopus3.5-9B-Coder optimized for Apple Silicon inference with Native MTP preserved.

Built with oMLX v0.3.9.dev2.

Quantization Details

Quantization method:

Non-quantized weight dtype:

float16

Enabled options:

Preserve MTP weights

This preserves:

mtp.* tensors
required config fields

allowing Native MTP to remain functional after quantization.

The resulting model includes the -mtp suffix accordingly.

Why float16?

float16 was selected instead of bfloat16 because Apple M1/M2 chips execute native fp16 especially efficiently during prefill workloads.

On Apple Silicon:

fp16 generally provides faster prompt ingestion
bf16 may offer slightly better numerical stability
M3/M4 systems may benefit more from bf16

For this release, the priority was maximum real-world inference responsiveness on M1/M2 hardware.

Tested Hardware

Device:

MacBook Pro M1
16GB unified memory

Runtime configuration:

Native MTP: enabled
Context window: 65536
Temperature: 1

Integrated into:

Hermes agent workflow

Observed performance:

Prompt processing (excluding cached): ~219.3 tok/s
Token generation: ~25.1 tok/s

Format

Format:

MLX safetensors

Designed specifically for:

Apple Silicon
MLX runtimes
Native MTP workflows

Compatibility

Tested with:

oMLX
LM Studio

Base Model

Base model by Jackrong:

Qwopus3.5-9B-Coder

All credit for the original architecture and training belongs to the upstream creators.

Notes

This release focuses on:

Apple Silicon efficiency
preserving Native MTP support
practical local coding-agent workflows
high context operation within 16GB unified memory constraints

Downloads last month: 970

Safetensors

Model size

2B params

Tensor type

F16

U32

MLX

Hardware compatibility

4-bit

Model tree for tongrow/MLX-Qwopus3.5-9B-Coder-oQ4-fp16-mtp

Base model

Qwen/Qwen3.5-9B-Base

Finetuned

Qwen/Qwen3.5-9B

Finetuned

unsloth/Qwen3.5-9B

Finetuned

Jackrong/Qwopus3.5-9B-v3.5

Adapter

Jackrong/Qwopus3.5-9B-Coder

Quantized

(2)

this model