# Hew LoRA for Qwen3.5-4B
A LoRA adapter that teaches Qwen3.5-4B to write valid Hew code: a compiled, actor-based programming language that no LLM has seen in pretraining.
## Results
80% compiler pass rate across 39 diverse eval prompts covering actors, supervisors, generators, state machines, wire types, pattern matching, algorithms, concurrency, extern FFI, and tests.
| Version | Samples | Config | Pass Rate |
|---|---|---|---|
| v7 | 1,827 | 1 epoch, r=16 | 53% |
| v8 | 1,866 | 1 epoch, r=16 | 69% |
| v9 | 1,898 | 1 epoch, r=16 | 68% |
| v10 | 1,926 | 1 epoch, r=16 | 67% |
| v11 | 1,926 | 2 epochs, r=16 | 80% |
| v12 | 1,926 | 1 epoch, r=32 | 71% |
This is v11, the best-performing adapter.
## Usage
This is a GGUF LoRA adapter for use with llama.cpp. You need the base model in GGUF format.
```bash
# Download the base model (if you don't have it)
huggingface-cli download Qwen/Qwen3.5-4B --local-dir qwen3.5-4b

# Convert to GGUF (requires llama.cpp)
python3 convert_hf_to_gguf.py qwen3.5-4b --outtype f16 --outfile qwen3.5-4b-f16.gguf

# Run with the LoRA adapter
llama-server -m qwen3.5-4b-f16.gguf --lora hew-lora-v11.gguf -ngl 99 -c 4096
```
Then query the OpenAI-compatible API:
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are an expert Hew programmer. Write complete, correct Hew source code. Output ONLY the code."},
      {"role": "user", "content": "Write a Hew actor that implements a counter with increment and get operations."}
    ],
    "temperature": 0.7
  }'
```
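The same request can be made from Python. A minimal sketch using only the standard library; the endpoint path and response shape follow llama-server's OpenAI-compatible API, and the prompts mirror the curl example above:

```python
import json
import urllib.request

SYSTEM_PROMPT = (
    "You are an expert Hew programmer. Write complete, correct Hew source code. "
    "Output ONLY the code."
)

def build_payload(user_prompt: str, temperature: float = 0.7) -> dict:
    """Assemble the OpenAI-style chat payload used in the curl example."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
    }

def generate(user_prompt: str,
             url: str = "http://localhost:8080/v1/chat/completions") -> str:
    """POST the payload to llama-server and return the generated Hew code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(user_prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```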
## Training Details
- Base model: Qwen/Qwen3.5-4B (4.2B parameters)
- Method: LoRA (rank 16, alpha 32, dropout 0.05)
- Targets: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Trainable parameters: ~21M / 4.2B (0.5%)
- Training data: 1,926 compiler-validated Hew code samples in ChatML format
- Epochs: 2
- Precision: bfloat16 (no quantization during training)
- Hardware: AMD Radeon 780M iGPU (16 GB shared memory)
- Training time: ~3.5 hours per run
- Final loss: 0.163
- Token accuracy: 98.1%
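The small trainable fraction follows directly from LoRA's low-rank factorization: each adapted weight matrix W (d_out × d_in) is frozen, and only two small factors A (r × d_in) and B (d_out × r) are trained, contributing r·(d_in + d_out) parameters per matrix. A sketch with illustrative dimensions (not Qwen's actual layer sizes):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    return r * (d_in + d_out)

# Illustrative only: a square 2048-wide projection at rank 16 adds
# 16 * (2048 + 2048) = 65,536 trainable parameters against
# 2048 * 2048 = 4,194,304 frozen ones (~1.6% of the matrix).
added = lora_params(2048, 2048, 16)
frozen = 2048 * 2048
print(added, added / frozen)
```

Summed over the seven target projections in every layer, this is how rank 16 lands at roughly 21M trainable parameters out of 4.2B.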
## What is Hew?
Hew is a compiled, actor-based programming language for building resilient services. It features:
- Actor isolation and message passing with move semantics
- Supervision trees for fault tolerance
- Wire types for serialization contracts
- Generators (`gen fn`) with `yield`
- State machines as first-class constructs
- Pattern matching
- Structured concurrency (select/join)
- Native compilation via LLVM
## Training Approach
Every code sample in the training data is validated by the Hew compiler (`hew check`) before inclusion. The training loop is iterative: train, eval against 39 prompts, categorize compiler errors, write targeted correction examples, repeat.
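The compiler-in-the-loop filter is simple to sketch: write each candidate sample to a temporary file, run the checker, and keep only samples that exit cleanly. The `hew check <file>` invocation is assumed from the description above, so the checker command is a parameter here and any validator can be swapped in:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def passes_check(source: str, checker: list[str], suffix: str = ".hew") -> bool:
    """Write a candidate sample to disk and return True if the checker exits 0."""
    with tempfile.NamedTemporaryFile("w", suffix=suffix, delete=False) as f:
        f.write(source)
        path = f.name
    try:
        result = subprocess.run(checker + [path], capture_output=True)
        return result.returncode == 0
    finally:
        Path(path).unlink(missing_ok=True)

def filter_samples(samples: list[str], checker: list[str]) -> list[str]:
    """Keep only compiler-validated samples, as done for the training set."""
    return [s for s in samples if passes_check(s, checker)]

# Real use would pass checker=["hew", "check"]; any command that exits 0
# on valid input behaves identically.
```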
Key findings:
- 2 epochs beat more data at this corpus size (~1,900 samples)
- Correction examples plateau quickly: the first round of targeted fixes gave +16 points; subsequent rounds gave ~0
- The system prompt is training data: encoding language rules explicitly in the ChatML system prompt helps the model learn them
- Pretraining priors are the main obstacle: the model writes Rust-like code that looks right but uses wrong APIs, types, and syntax
For the full story, see *Teaching an LLM a Language It Has Never Seen*.
## Limitations
- 20% of eval prompts still fail to compile (type errors, move-semantics violations, prose leaking into output)
- The model sometimes outputs explanatory text instead of raw code
- Complex features (supervisors, concurrency patterns) have lower pass rates than simple algorithms
- Only tested with Qwen3.5-4B as the base model