Qwen3.5-0.8B Mermaid Diagram Generator - ONNX (FP16 Quantized)
ONNX version of fine-tuned Qwen3.5-0.8B Mermaid diagram generator with FP16 quantization for optimized browser inference.
Model Details
- Base Model: Qwen/Qwen3.5-0.8B (0.8B parameters)
- Format: ONNX with FP16 quantization
- Purpose: Browser deployment via Transformers.js and WebGPU
- Dataset: SpongeBOB9684/mermaid-text-to-diagram
Conversion
This ONNX model was exported from the fine-tuned PyTorch model:
- Framework: Optimum
- Quantization: FP16 (16-bit floating point)
- Compression: ~50% smaller than FP32
- Compatibility: Transformers.js with ONNX Runtime WebGPU
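The exact export command was not recorded here, but a typical Optimum CLI invocation for this kind of FP16 ONNX export looks like the sketch below. The source repo name, task, and output directory are illustrative assumptions; FP16 export requires a CUDA device.

```shell
# Hypothetical export command -- assumes optimum[exporters] is installed
# and a CUDA GPU is available (required for --dtype fp16).
# The --model id and output directory are illustrative, not the actual ones used.
optimum-cli export onnx \
  --model SpongeBOB9684/qwen3.5-0.8b-mermaid-generator \
  --task text-generation-with-past \
  --dtype fp16 \
  --device cuda \
  qwen3.5-0.8b-mermaid-generator-onnx/
```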
Usage (Browser)
```javascript
import { pipeline } from '@huggingface/transformers';

// Create pipeline
const generator = await pipeline('text-generation', 'SpongeBOB9684/qwen3.5-0.8b-mermaid-generator-onnx', {
  dtype: 'fp16', // matches this model's FP16 quantization
  device: 'webgpu',
});

// Generate
const prompt = 'Create a flowchart for a simple login process';
const messages = [
  { role: 'system', content: 'You are a Mermaid diagram code generator. Output ONLY valid Mermaid code.' },
  { role: 'user', content: prompt },
];
const output = await generator(messages);
console.log(output);
```

Note: `device: 'webgpu'` requires Transformers.js v3, published as `@huggingface/transformers`.
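Depending on the library version, `output[0].generated_text` is either a plain string or the full chat message list (in which case the reply is the last entry's `content`). Models also sometimes wrap the diagram in a ```` ```mermaid ```` fence. A small helper to pull out just the diagram source; `extractMermaid` is our own illustrative function, not part of Transformers.js:

```javascript
// Strip an optional ```mermaid ... ``` fence from the model's reply.
// `extractMermaid` is an illustrative helper, not a Transformers.js API.
function extractMermaid(reply) {
  const match = reply.match(/```(?:mermaid)?\s*([\s\S]*?)```/);
  return (match ? match[1] : reply).trim();
}

// With chat-style input on Transformers.js v3, the reply is typically:
// const reply = output[0].generated_text.at(-1).content;
console.log(extractMermaid('```mermaid\nflowchart TD\n  A --> B\n```'));
```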
Performance
- Model Size: ~1.3 GB (FP16 quantized)
- Load Time: < 5 seconds on typical browsers
- Inference Speed: ~15-30 tokens/second (depends on hardware)
- Memory: ~1.3 GB GPU memory (with quantization)
Advantages
FP16 Quantization:
- ~50% smaller model size
- Faster inference with minimal quality loss
- Lower memory usage for browser deployment
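The ~50% figure is simple arithmetic: each FP32 weight takes 4 bytes and each FP16 weight takes 2, so the halving is exact for the weights themselves (the ~1.3 GB on-disk size is smaller still, since it also reflects graph optimizations and metadata). A quick sanity check, assuming 0.8B parameters:

```javascript
// Back-of-envelope weight storage for a 0.8B-parameter model.
const params = 0.8e9;
const fp32GB = (params * 4) / 1e9; // 4 bytes per FP32 weight
const fp16GB = (params * 2) / 1e9; // 2 bytes per FP16 weight
console.log(fp32GB, fp16GB);       // 3.2 1.6
console.log(fp16GB / fp32GB);      // 0.5 -> the ~50% compression figure
```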
ONNX Format:
- Optimized for web inference
- Cross-platform compatibility
- Direct loading in Transformers.js
Limitations
- FP16 quantization may cause slight precision differences compared to FP32
- Best results with clear, specific prompts
- Limited to Mermaid syntax (not general diagram description)
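As an illustration of the expected output format (this exact diagram is our own sketch, not a captured model response), the login-flow prompt above should yield something like:

```mermaid
flowchart TD
    A[User enters credentials] --> B{Credentials valid?}
    B -- Yes --> C[Create session]
    B -- No --> D[Show error message]
    C --> E[Redirect to dashboard]
    D --> A
```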
License
Apache 2.0
Acknowledgments
- Base model: Qwen3.5-0.8B by Alibaba Cloud
- ONNX conversion: Optimum library
- Dataset: SpongeBOB9684/mermaid-text-to-diagram
- Training framework: Unsloth