# Huihui-Qwen3.5-0.8B-abliterated-onnx
ONNX export of huihui-ai/Huihui-Qwen3.5-0.8B-abliterated for browser-side inference with WebGPU via transformers.js.
This is the base abliterated model without any LoRA or fine-tuning.
## Build Process
- Base model: huihui-ai/Huihui-Qwen3.5-0.8B-abliterated
- ONNX: weight transplant into the reference graph structure from onnx-community/Qwen3.5-0.8B-ONNX
- Quantization: q8 (MatMul-only for the decoder, full dynamic for embed/vision)
See ONNX_CONVERSION_GUIDE.md for detailed pipeline documentation.
## Demo
Try it: bobber/routangseng-chat
## Usage with transformers.js
```js
import { Qwen3_5ForConditionalGeneration, AutoProcessor } from '@huggingface/transformers';

const model = await Qwen3_5ForConditionalGeneration.from_pretrained(
  'bobber/Huihui-Qwen3.5-0.8B-abliterated-onnx',
  {
    dtype: {
      embed_tokens: 'q8',
      vision_encoder: 'q8',
      decoder_model_merged: 'q8',
    },
    device: 'webgpu',
  },
);
```
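Loading the model is only half the flow; a prompt still has to be templated, tokenized, and decoded. The sketch below shows one way to do that with the transformers.js chat-template API (`AutoTokenizer`, `apply_chat_template`, `generate`, `batch_decode`); the message content and `max_new_tokens` value are illustrative, and this assumes a WebGPU-capable browser.

```js
import {
  Qwen3_5ForConditionalGeneration,
  AutoTokenizer,
} from '@huggingface/transformers';

const modelId = 'bobber/Huihui-Qwen3.5-0.8B-abliterated-onnx';

// Load the tokenizer and the q8 ONNX weights onto WebGPU.
const tokenizer = await AutoTokenizer.from_pretrained(modelId);
const model = await Qwen3_5ForConditionalGeneration.from_pretrained(modelId, {
  dtype: { embed_tokens: 'q8', vision_encoder: 'q8', decoder_model_merged: 'q8' },
  device: 'webgpu',
});

// Build a chat prompt and generate a reply.
const messages = [{ role: 'user', content: 'Explain ONNX in one sentence.' }];
const inputs = tokenizer.apply_chat_template(messages, {
  add_generation_prompt: true,
  return_dict: true,
});
const output = await model.generate({ ...inputs, max_new_tokens: 128 });
console.log(tokenizer.batch_decode(output, { skip_special_tokens: true })[0]);
```

For image inputs, load `AutoProcessor` instead of `AutoTokenizer` so the vision encoder receives pixel values alongside the text tokens.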
## Files
| File | Size | Description |
|---|---|---|
| `onnx/decoder_model_merged_quantized.onnx` + `.onnx_data` | ~721 MB | Decoder (q8, MatMul-only) |
| `onnx/embed_tokens_quantized.onnx` + `.onnx_data` | ~243 MB | Embeddings (q8) |
| `onnx/vision_encoder_quantized.onnx` + `.onnx_data` | ~96 MB | Vision encoder (q8, from reference) |
## Related
- With LoRA: bobber/routangseng-qwen35-0.8b-abliterated-lora-onnx
- Conversion guide: ONNX_CONVERSION_GUIDE.md