# Qwen3.5-0.8B Mermaid Diagram Generator
A fine-tuned version of Qwen3.5-0.8B specialized for generating valid Mermaid 11.14 diagrams from natural language descriptions.
## Model Details
- Base Model: Qwen/Qwen3.5-0.8B (0.8B parameters)
- Training Method: LoRA fine-tuning with Unsloth
- Dataset: SpongeBOB9684/mermaid-text-to-diagram
- Context Length: 32,768 tokens
- Training Examples: 9,913 validated Mermaid 11.14 examples
## Usage

### Installation

```bash
pip install transformers torch
```

### Inference

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "SpongeBOB9684/qwen3.5-0.8b-mermaid-generator"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Create a flowchart for a simple login process"
messages = [
    {"role": "system", "content": "You are a Mermaid diagram code generator. Output ONLY valid Mermaid code."},
    {"role": "user", "content": prompt},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt")
# do_sample=True is required for temperature/top_p to take effect
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.8)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
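The system prompt asks for Mermaid code only, but chat models sometimes wrap replies in a markdown fence anyway. A small post-processing helper can normalize the response (`extract_mermaid` is illustrative, not part of this model card):

```python
import re

def extract_mermaid(response: str) -> str:
    """Strip an optional ```mermaid fence from a model reply.

    If no fence is found, the reply is returned as-is (trimmed).
    """
    match = re.search(r"```(?:mermaid)?\s*\n(.*?)```", response, re.DOTALL)
    return (match.group(1) if match else response).strip()
```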
## Supported Diagram Types

The model is trained to generate:

- Flowcharts (`flowchart`)
- Sequence diagrams (`sequenceDiagram`)
- Class diagrams (`classDiagram`)
- State diagrams (`stateDiagram-v2`)
- ER diagrams (`erDiagram`)
- Gantt charts (`gantt`)
- Mind maps (`mindmap`)
- Pie charts (`pie`)
- Git graphs (`gitGraph`)
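Each diagram type is identified by its opening keyword, so a quick sanity check on a generation is to inspect the first non-empty line. A minimal sketch (`detect_type` is illustrative, not shipped with the model; the keyword table mirrors the list above):

```python
from typing import Optional

DIAGRAM_KEYWORDS = (
    "flowchart", "sequenceDiagram", "classDiagram", "stateDiagram-v2",
    "erDiagram", "gantt", "mindmap", "pie", "gitGraph",
)

def detect_type(code: str) -> Optional[str]:
    """Return the diagram-type keyword opening the snippet, or None."""
    lines = [ln for ln in code.splitlines() if ln.strip()]
    if not lines:
        return None
    first = lines[0].strip()
    for keyword in DIAGRAM_KEYWORDS:
        if first.startswith(keyword):
            return keyword
    return None
```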
## Model Capabilities
- Syntax Validation: All training examples validated with mermaid-cli
- Mermaid 11.14: Full support for latest syntax features
- Complexity: Handles simple to very complex diagrams (>25 nodes)
- Features: Subgraphs, styling, multi-directional arrows, markdown in nodes
## Training Details

### Dataset

This model was trained on a dataset of 9,913 validated examples drawn from three sources:
**Source Distribution:**

1. **Celiadraw/text-to-mermaid** (8,912 examples, ~90%): existing data
   - https://huggingface.co/datasets/Celiadraw/text-to-mermaid
   - Non-LLM generated
2. **djds4rce/mermaid-synthetic** (926 examples, ~9%): existing data
   - https://huggingface.co/datasets/djds4rce/mermaid-synthetic
   - MIT License
   - Non-LLM generated
3. **Edge cases** (75 examples, ~1%): LLM-generated for Mermaid 11.14 coverage
   - Created to cover missing Mermaid 11.14 features
   - Strictly validated syntactically before inclusion
**Splits:**
- Train: 80% of examples
- Validation: 10% of examples
- Test: 10% of examples
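The card does not publish the split script; an 80/10/10 split can be reproduced with a seeded shuffle along these lines (function name and seed are assumptions):

```python
import random

def split_dataset(examples, seed=42):
    """Shuffle deterministically, then cut into 80/10/10 train/val/test."""
    order = list(range(len(examples)))
    random.Random(seed).shuffle(order)
    n_train = int(len(order) * 0.8)
    n_val = int(len(order) * 0.1)
    train = [examples[i] for i in order[:n_train]]
    val = [examples[i] for i in order[n_train:n_train + n_val]]
    test = [examples[i] for i in order[n_train + n_val:]]
    return train, val, test
```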
**Diagram Distribution:**

- 54.7% flowcharts, 15.1% sequence diagrams, 10.4% class diagrams, with the remaining ~20% spread across the other supported types
### Training Configuration
- Framework: Unsloth (2x faster, 70% less VRAM)
- Method: LoRA (0.1% trainable parameters)
- Precision: FP16
- Hardware: Trained on local GPU
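The "0.1% trainable parameters" figure can be sanity-checked with back-of-envelope arithmetic: a LoRA adapter adds `r × (d_in + d_out)` parameters per adapted weight matrix. The rank and target modules below are assumptions (the card does not state them), so this only illustrates the order of magnitude:

```python
def lora_param_fraction(total_params, rank, target_shapes, n_layers):
    """Rough fraction of parameters a LoRA adapter trains.

    target_shapes: (d_in, d_out) of each adapted matrix in one layer.
    """
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in target_shapes)
    return per_layer * n_layers / total_params

# Assumed setup: rank 8 on the q/v projections of a 2048-wide, 24-layer model.
fraction = lora_param_fraction(
    total_params=800_000_000,
    rank=8,
    target_shapes=[(2048, 2048), (2048, 2048)],
    n_layers=24,
)
print(f"{fraction:.2%}")  # prints "0.20%", the same order as the quoted 0.1%
```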
## Methodology

### Validation
All training examples were validated using mermaid-cli (@mermaid-js/mermaid-cli) to ensure:
- ✅ Correct Mermaid 11.14 syntax
- ✅ Successful rendering
- ✅ Conformity with official specifications
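mermaid-cli exposes the `mmdc` command, which exits non-zero when a diagram fails to render. A thin Python wrapper for per-example validation might look like this (a sketch; assumes `mmdc` is on PATH via `npm install -g @mermaid-js/mermaid-cli`):

```python
import os
import subprocess
import tempfile

def validate_mermaid(code: str, mmdc: str = "mmdc") -> bool:
    """Return True if mermaid-cli renders the snippet without error."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "diagram.mmd")
        out = os.path.join(tmp, "diagram.svg")
        with open(src, "w") as f:
            f.write(code)
        try:
            result = subprocess.run([mmdc, "-i", src, "-o", out],
                                    capture_output=True)
        except FileNotFoundError:  # mermaid-cli not installed
            return False
        return result.returncode == 0 and os.path.exists(out)
```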
### Data Sources

**Existing Data (~99% of dataset)**
- Sourced from Celiadraw and djds4rce
- Used as-is, no LLM modifications
- Upgraded to Mermaid 11.14 where needed
**LLM-Generated Edge Cases (~1% of dataset)**
- Generated specifically to cover missing Mermaid 11.14 features
- Validated with same rigor as existing data
- Covers edge cases and complex scenarios
## Model Architecture
- Layers: 24
- Hidden Size: 2048
- Attention Heads: 16
- Vocabulary: 151,936 tokens
## Documentation
This model is trained to generate code conforming to Mermaid 11.14.0:
- 📖 Documentation: https://mermaid.js.org/intro/
- 🔧 Syntax: https://mermaid.js.org/syntax/
- ✅ All training examples validated against official specs
## Limitations

- The model may occasionally generate diagrams that require minor syntax adjustments
- Best results come from clear, specific prompts
- Output is limited to Mermaid syntax (it is not a general diagram-description model)
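Because occasional invalid output is expected, a common mitigation is a generate-validate-retry loop. A minimal sketch (both callbacks are caller-supplied; `validate_fn` could, for example, shell out to mermaid-cli):

```python
def generate_valid(generate_fn, validate_fn, max_attempts=3):
    """Call generate_fn until validate_fn accepts the output.

    Returns the last attempt even if it never validated, so the
    caller can still surface it (with a warning) to the user.
    """
    diagram = ""
    for _ in range(max_attempts):
        diagram = generate_fn()
        if validate_fn(diagram):
            return diagram
    return diagram
```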
## Browser Deployment
This model is designed for in-browser inference via Transformers.js and WebGPU, enabling client-side Mermaid generation without server API calls.
## License
Apache 2.0
## Acknowledgments

- Base model: Qwen3.5-0.8B by Alibaba Cloud
- Training framework: Unsloth
- Dataset sources:
  - Celiadraw/text-to-mermaid (existing data)
  - djds4rce/mermaid-synthetic (existing data, MIT License)
  - Edge cases generated by the Mermaid Studio project