# mrgnw/gemma-4-e2b-svelte5
Gemma 4 E2B (5.1B total, 2.3B active MoE) fine-tuned for Svelte 5 component generation. Writes correct Svelte 5 syntax (`$state`, `$derived`, `$props`, `onclick`, `{#snippet}`) without falling back to Svelte 4 patterns.
125 tok/s on M4 Pro, 128K native context, 2.7 GB RAM (MLX 4-bit).
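For reference, the runes-style output the model targets looks like this (an illustrative hand-written component, not actual model output):

```svelte
<script>
  // Svelte 5 runes: $state for reactive values, $derived for computed ones
  let count = $state(0);
  let doubled = $derived(count * 2);
</script>

<!-- Svelte 5 event attribute (onclick), not the Svelte 4 on:click directive -->
<button onclick={() => count++}>
  {count} × 2 = {doubled}
</button>
```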
Proof of concept — created with the help of an LLM and trained against SvelteBench (9 tasks). Performance outside those benchmarks is not guaranteed until broader training data and evaluation are available.
## Formats
| File | Format | Size | Use case |
|---|---|---|---|
| `model.safetensors.*` | MLX 4-bit | 2.5 GB | Apple Silicon native (fastest) |
| `gemma-4-e2b-svelte5-Q4_K_M.gguf` | GGUF Q4_K_M | 3.2 GB | LM Studio, llama.cpp, Ollama |
| `gemma-4-e2b-svelte5-bf16.gguf` | GGUF bf16 | 8.7 GB | Full-precision GGUF |
## Use with MLX (Apple Silicon)
```bash
pip install mlx-lm
```
```python
from mlx_lm import load, generate

model, tokenizer = load("mrgnw/gemma-4-e2b-svelte5")

messages = [{"role": "user", "content": "Build a searchable data table"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False,
    enable_thinking=False,  # critical — prevents infinite thinking loops
)
response = generate(model, tokenizer, prompt=prompt, max_tokens=1024, verbose=True)
```
API server (OpenAI-compatible):
```bash
mlx_lm.server \
  --model mrgnw/gemma-4-e2b-svelte5 \
  --port 8199 \
  --chat-template-args '{"enable_thinking":false}'
```
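With the server running, requests go to the standard OpenAI chat-completions route. A sketch (the prompt is arbitrary; `mlx_lm.server` serves the model it was started with regardless of the `model` field):

```bash
curl http://localhost:8199/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mrgnw/gemma-4-e2b-svelte5",
    "messages": [{"role": "user", "content": "Build a toggle switch component"}],
    "max_tokens": 1024
  }'
```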
## Use with GGUF (LM Studio, llama.cpp, Ollama)
Download `gemma-4-e2b-svelte5-Q4_K_M.gguf` and load it in LM Studio or any llama.cpp-compatible tool.
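For Ollama specifically, one way to load a local GGUF is via a minimal Modelfile (a sketch; the local tag `gemma-svelte5` is an arbitrary name of your choosing):

```bash
# Modelfile pointing at the downloaded GGUF
printf 'FROM ./gemma-4-e2b-svelte5-Q4_K_M.gguf\n' > Modelfile

ollama create gemma-svelte5 -f Modelfile
ollama run gemma-svelte5 'Build a counter component using $state'
```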
## Limitations
- Trained on 9 SvelteBench tasks (2,880 cleaned samples) — component generation only
- 2.3B active parameters — writes components from prompts, doesn't architect apps
- Must use `enable_thinking=False` in the MLX chat template, or the model enters infinite reasoning loops
- Best paired with a larger model for architecture, with this model handling component generation
## Training
LoRA fine-tune of `mlx-community/gemma-4-e2b-it-4bit` using mlx-lm 0.31.2: rank 64, 32 layers, 3,000 iterations, ~65 min on an M4 Pro. Final validation loss: 0.027.
Data cleaning mattered more than LoRA rank: every `on:click` was converted to `onclick`, `<svelte:options runes={true} />` tags were stripped, and other Svelte 4 patterns were removed from the training data.
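The cleanup step above can be sketched as a couple of regex rewrites. This is an illustrative reconstruction, not the actual training pipeline; `clean_sample` and both patterns are hypothetical names:

```python
import re

# Svelte 4's `on:click={...}` directive becomes Svelte 5's `onclick={...}`
# (the same rename applies to any on:<event>= directive).
EVENT_DIRECTIVE = re.compile(r"\bon:(\w+)=")
# The runes opt-in tag is redundant in Svelte 5 and gets stripped entirely.
RUNES_OPTION = re.compile(r"<svelte:options\s+runes=\{true\}\s*/>\s*")

def clean_sample(source: str) -> str:
    """Rewrite Svelte 4 event directives and strip the runes option tag."""
    source = EVENT_DIRECTIVE.sub(r"on\1=", source)
    return RUNES_OPTION.sub("", source)

print(clean_sample("<button on:click={increment}>+</button>"))
# → <button onclick={increment}>+</button>
```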
## Model tree

Base model: `mlx-community/gemma-4-e2b-it-4bit`