mrgnw/gemma-4-e2b-svelte5

Gemma 4 E2B (5.1B total, 2.3B active MoE) fine-tuned for Svelte 5 component generation. Writes correct Svelte 5 syntax — $state, $derived, $props, onclick, {#snippet} — without falling back to Svelte 4 patterns.
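To make the target syntax concrete, this is the kind of runes-based component the model is tuned to produce (an illustrative hand-written snippet, not actual model output):

```svelte
<script>
  // Svelte 5 runes instead of `let` + `$:` reactivity
  let count = $state(0);
  let doubled = $derived(count * 2);
</script>

<!-- Svelte 5 event attribute (onclick), not Svelte 4's on:click -->
<button onclick={() => count++}>
  {count} doubled is {doubled}
</button>
```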

125 tok/s on M4 Pro, 128K native context, 2.7 GB RAM (MLX 4-bit).

Proof of concept — created with the help of an LLM and trained against SvelteBench (9 tasks). Performance outside those benchmarks is not guaranteed until broader training data and evaluation exist.

Formats

| File | Format | Size | Use case |
|------|--------|------|----------|
| model.safetensors.* | MLX 4-bit | 2.5 GB | Apple Silicon native (fastest) |
| gemma-4-e2b-svelte5-Q4_K_M.gguf | GGUF Q4_K_M | 3.2 GB | LM Studio, llama.cpp, Ollama |
| gemma-4-e2b-svelte5-bf16.gguf | GGUF bf16 | 8.7 GB | Full-precision GGUF |

Use with MLX (Apple Silicon)

pip install mlx-lm

from mlx_lm import load, generate

model, tokenizer = load("mrgnw/gemma-4-e2b-svelte5")

messages = [{"role": "user", "content": "Build a searchable data table"}]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False,
    enable_thinking=False,  # critical — prevents infinite thinking loops
)

response = generate(model, tokenizer, prompt=prompt, max_tokens=1024, verbose=True)

API server (OpenAI-compatible):

mlx_lm.server \
  --model mrgnw/gemma-4-e2b-svelte5 \
  --port 8199 \
  --chat-template-args '{"enable_thinking":false}'
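Once the server is running, any OpenAI-compatible client can talk to it. A minimal standard-library sketch (the port matches the command above; the endpoint path and request shape follow the OpenAI chat-completions convention that mlx_lm.server implements):

```python
import json
import urllib.request

# Request body for the OpenAI-compatible /v1/chat/completions endpoint.
payload = {
    "model": "mrgnw/gemma-4-e2b-svelte5",
    "messages": [{"role": "user", "content": "Build a searchable data table"}],
    "max_tokens": 1024,
}

req = urllib.request.Request(
    "http://localhost:8199/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)

# With the server running, send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```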

Use with GGUF (LM Studio, llama.cpp, Ollama)

Download gemma-4-e2b-svelte5-Q4_K_M.gguf and load it in LM Studio or any llama.cpp-compatible tool.
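For Ollama specifically, a local import goes through a Modelfile. A minimal config sketch (the model name `gemma-svelte5` is an arbitrary local alias):

```
# Modelfile — point Ollama at the downloaded GGUF
FROM ./gemma-4-e2b-svelte5-Q4_K_M.gguf
```

Then `ollama create gemma-svelte5 -f Modelfile` followed by `ollama run gemma-svelte5` starts an interactive session.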

Limitations

  • Trained on 9 SvelteBench tasks (2,880 cleaned samples) — component generation only
  • 2.3B active parameters — writes components from prompts, doesn't architect apps
  • Must use enable_thinking=False in MLX chat template or model enters infinite reasoning loops
  • Best paired with a larger model for architecture + this model for component generation

Training

LoRA on mlx-community/gemma-4-e2b-it-4bit using mlx-lm 0.31.2. Rank 64, 32 layers, 3000 iterations, ~65 min on M4 Pro. Val loss 0.027.

Data cleaning was more important than LoRA rank — converted all on:click handlers to onclick, stripped <svelte:options runes={true} />, and removed other Svelte 4 patterns from the training data.
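The cleaning pass described above can be sketched as a small normalizer (a hypothetical helper, not the actual training script; the two regexes cover the patterns named in the paragraph):

```python
import re

def normalize_svelte5(source: str) -> str:
    """Rewrite common Svelte 4 patterns into their Svelte 5 equivalents."""
    # on:click / on:input / ... -> onclick / oninput / ...
    source = re.sub(r"\bon:(\w+)", r"on\1", source)
    # Drop the now-redundant runes opt-in tag.
    source = re.sub(r"<svelte:options\s+runes=\{true\}\s*/>\s*", "", source)
    return source

print(normalize_svelte5("<button on:click={inc}>+</button>"))
# -> <button onclick={inc}>+</button>
```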
