# Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mlx-4bit

MLX-VLM conversion of huihui-ai/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated.

## Overview

- Format: MLX-VLM
- Precision: 4-bit
- Size: about 19 GB
- Quantization result: 4.649 bits/weight
- The source model is preserved as a vision-language model for mlx-vlm
- Local validation passed for text generation, image generation, and the abliterated behavior regression set
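The reported size and bits/weight figures can be cross-checked with quick arithmetic (assuming the nominal 35B parameter count; the numbers below are back-of-the-envelope, not measured):

```python
# Cross-check: does 4.649 bits/weight over ~35B parameters match "about 19 GB"?
params = 35e9              # nominal parameter count (assumption)
bits_per_weight = 4.649    # reported quantization result
size_gib = params * bits_per_weight / 8 / 2**30  # bits -> bytes -> GiB
print(f"{size_gib:.1f} GiB")  # → 18.9 GiB, consistent with "about 19 GB"
```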
## Conversion Notes
This conversion keeps the model in the mlx-vlm layout and includes the compatibility fixes required for reliable use with MLX-VLM and LM Studio:
- restored a Qwen VL-compatible `chat_template.jinja`
- aligned bos/eos/pad token ids in `config.json`
- preserved image and video prompt token handling
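As an illustration, the bos/eos/pad alignment amounts to a small patch of `config.json`. A minimal sketch follows; the function name and the specific token ids are illustrative placeholders, not necessarily the values used in this conversion:

```python
import json

def align_token_ids(config_path, bos_id, eos_id, pad_id):
    """Overwrite the special-token ids in a config.json so they
    match the tokenizer's values (hypothetical helper)."""
    with open(config_path) as f:
        cfg = json.load(f)
    cfg["bos_token_id"] = bos_id
    cfg["eos_token_id"] = eos_id
    cfg["pad_token_id"] = pad_id
    with open(config_path, "w") as f:
        json.dump(cfg, f, indent=2)

# usage on a throwaway config with placeholder ids
# align_token_ids("config.json", 151643, 151645, 151643)
```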
## Validation
Local checks performed on Apple Silicon:
- text generation smoke test: passed
- image generation smoke test: passed
- abliterated regression set: 6/6 non-refused, refusal_rate = 0.0
- eval run id: 20260317_195948
- median cleaned response length: 568 chars
- eval settings: `max_tokens=320`, `temperature=0.0`, `prefill_step_size=128`
This is a behavior regression check, not a mathematical proof of equivalence.
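The refusal_rate figure above is a simple fraction of responses that match refusal markers. A minimal sketch of such a metric, assuming a marker list that is illustrative and not the exact set used in the eval:

```python
# Hypothetical refusal-rate metric: fraction of responses containing a
# refusal marker. The marker list is an illustrative assumption.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "i'm sorry")

def refusal_rate(responses):
    refused = sum(
        any(m in r.lower() for m in REFUSAL_MARKERS) for r in responses
    )
    return refused / len(responses)

# 6/6 non-refused responses yield refusal_rate = 0.0
print(refusal_rate(["Sure, here is an outline.", "Of course."]))  # → 0.0
```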
## Files
Important files in this repo:
- `config.json`
- `chat_template.jinja`
- `processor_config.json`
- `tokenizer.json`
- `model-00001-of-00004.safetensors` ... `model-00004-of-00004.safetensors`
- `model.safetensors.index.json`
## Usage
### mlx-vlm text generation

```shell
python -m mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mlx-4bit \
  --prompt "你好" \
  --max-tokens 256 \
  --prefill-step-size 128
```
### mlx-vlm image prompt

```shell
python -m mlx_vlm.generate \
  --model /path/to/Huihui-Qwen3.5-35B-A3B-Claude-4.6-Opus-abliterated-mlx-4bit \
  --image /path/to/example.png \
  --prompt "请简短描述这张图片。" \
  --max-tokens 128 \
  --prefill-step-size 128
```
### LM Studio
This repo is intended to work as an MLX model in LM Studio after download or sync. The included chat template already contains the required Qwen vision tokens.