# Hew LoRA for Qwen3.5-4B
A LoRA adapter that teaches Qwen3.5-4B to write valid Hew code: a compiled, actor-based programming language that no LLM has seen in pretraining.
## Results
80% compiler pass rate across 39 diverse eval prompts covering actors, supervisors, generators, state machines, wire types, pattern matching, algorithms, concurrency, extern FFI, and tests.
| Version | Samples | Config | Pass Rate |
|---|---|---|---|
| v7 | 1,827 | 1 epoch, r=16 | 53% |
| v8 | 1,866 | 1 epoch, r=16 | 69% |
| v9 | 1,898 | 1 epoch, r=16 | 68% |
| v10 | 1,926 | 1 epoch, r=16 | 67% |
| v11 | 1,926 | 2 epochs, r=16 | 80% |
| v12 | 1,926 | 1 epoch, r=32 | 71% |
This is v11, the best-performing adapter.
## Usage
This is a GGUF LoRA adapter for use with llama.cpp. You need the base model in GGUF format.
```bash
# Download the base model (if you don't have it)
huggingface-cli download Qwen/Qwen3.5-4B --local-dir qwen3.5-4b

# Convert to GGUF (requires llama.cpp)
python3 convert_hf_to_gguf.py qwen3.5-4b --outtype f16 --outfile qwen3.5-4b-f16.gguf

# Run with the LoRA adapter
llama-server -m qwen3.5-4b-f16.gguf --lora hew-lora-v11.gguf -ngl 99 -c 4096
```
Then query the OpenAI-compatible API:
```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are an expert Hew programmer. Write complete, correct Hew source code. Output ONLY the code."},
      {"role": "user", "content": "Write a Hew actor that implements a counter with increment and get operations."}
    ],
    "temperature": 0.7
  }'
```
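The same request can be made from Python. A minimal sketch using only the standard library; the endpoint path and response shape follow llama-server's OpenAI-compatible API, and the prompts mirror the curl example above:

```python
import json
import urllib.request

SYSTEM_PROMPT = (
    "You are an expert Hew programmer. Write complete, correct Hew source code. "
    "Output ONLY the code."
)

def build_payload(user_prompt: str, temperature: float = 0.7) -> dict:
    """Assemble the OpenAI-style chat payload used in the curl example."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
    }

def generate(user_prompt: str,
             url: str = "http://localhost:8080/v1/chat/completions") -> str:
    """POST the payload to llama-server and return the generated Hew code."""
    req = urllib.request.Request(
        url,
        data=json.dumps(build_payload(user_prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```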
## Training Details
- Base model: Qwen/Qwen3.5-4B (4.2B parameters)
- Method: LoRA (rank 16, alpha 32, dropout 0.05)
- Targets: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Trainable parameters: ~21M / 4.2B (0.5%)
- Training data: 1,926 compiler-validated Hew code samples in ChatML format
- Epochs: 2
- Precision: bfloat16 (no quantization during training)
- Hardware: AMD Radeon 780M iGPU (16 GB shared memory)
- Training time: ~3.5 hours per run
- Final loss: 0.163
- Token accuracy: 98.1%
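The small trainable fraction follows directly from LoRA's low-rank factorization: each adapted weight matrix W (d_out × d_in) is frozen, and only two small factors A (r × d_in) and B (d_out × r) are trained, contributing r·(d_in + d_out) parameters per matrix. A sketch with illustrative dimensions (not Qwen's actual layer sizes):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    return r * (d_in + d_out)

# Illustrative only: a square 2048-wide projection at rank 16 adds
# 16 * (2048 + 2048) = 65,536 trainable parameters against
# 2048 * 2048 = 4,194,304 frozen ones (~1.6% of the matrix).
added = lora_params(2048, 2048, 16)
frozen = 2048 * 2048
print(added, added / frozen)
```

Summed over the seven target projections in every layer, this is how rank 16 lands at roughly 21M trainable parameters out of 4.2B.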
## What is Hew?
Hew is a compiled, actor-based programming language for building resilient services. It features:
- Actor isolation and message passing with move semantics
- Supervision trees for fault tolerance
- Wire types for serialization contracts
- Generators (`gen fn`) with `yield`
- State machines as first-class constructs
- Pattern matching
- Structured concurrency (select/join)
- Native compilation via LLVM
## Training Approach
Every code sample in the training data is validated by the Hew compiler (`hew check`) before inclusion. The training loop is iterative: train, eval against 39 prompts, categorize compiler errors, write targeted correction examples, repeat.
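The compiler-in-the-loop filter is simple to sketch: write each candidate sample to a temporary file, run the checker, and keep only samples that exit cleanly. The `hew check <file>` invocation is assumed from the description above, so the checker command is a parameter here and any validator can be swapped in:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

def passes_check(source: str, checker: list[str], suffix: str = ".hew") -> bool:
    """Write a candidate sample to disk and return True if the checker exits 0."""
    with tempfile.NamedTemporaryFile("w", suffix=suffix, delete=False) as f:
        f.write(source)
        path = f.name
    try:
        result = subprocess.run(checker + [path], capture_output=True)
        return result.returncode == 0
    finally:
        Path(path).unlink(missing_ok=True)

def filter_samples(samples: list[str], checker: list[str]) -> list[str]:
    """Keep only compiler-validated samples, as done for the training set."""
    return [s for s in samples if passes_check(s, checker)]

# Real use would pass checker=["hew", "check"]; any command that exits 0
# on valid input behaves identically.
```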
Key findings:
- 2 epochs beat more data at this corpus size (~1,900 samples)
- Correction examples plateau quickly: the first round of targeted fixes gave +16 points; subsequent rounds gave ~0
- The system prompt is training data: encoding language rules explicitly in the ChatML system prompt helps the model learn them
- Pretraining priors are the main obstacle: the model writes Rust-like code that looks right but uses wrong APIs, types, and syntax
For the full story, see *Teaching an LLM a Language It Has Never Seen*.
## Limitations
- 20% of eval prompts still fail to compile (type errors, move-semantics violations, prose leaking into output)
- The model sometimes outputs explanatory text instead of raw code
- Complex features (supervisors, concurrency patterns) have lower pass rates than simple algorithms
- Only tested with Qwen3.5-4B as the base model