Hew LoRA for Qwen3.5-4B

A LoRA adapter that teaches Qwen3.5-4B to write valid code in Hew, a compiled, actor-based programming language that no LLM has seen in pretraining.

Results

80% compiler pass rate across 39 diverse eval prompts covering actors, supervisors, generators, state machines, wire types, pattern matching, algorithms, concurrency, extern FFI, and tests.

| Version | Samples | Config         | Pass Rate |
|---------|---------|----------------|-----------|
| v7      | 1,827   | 1 epoch, r=16  | 53%       |
| v8      | 1,866   | 1 epoch, r=16  | 69%       |
| v9      | 1,898   | 1 epoch, r=16  | 68%       |
| v10     | 1,926   | 1 epoch, r=16  | 67%       |
| v11     | 1,926   | 2 epochs, r=16 | 80%       |
| v12     | 1,926   | 1 epoch, r=32  | 71%       |

This is v11, the best-performing adapter.

Usage

This is a GGUF LoRA adapter for use with llama.cpp. You need the base model in GGUF format.

```bash
# Download the base model (if you don't have it)
huggingface-cli download Qwen/Qwen3.5-4B --local-dir qwen3.5-4b

# Convert to GGUF (requires llama.cpp)
python3 convert_hf_to_gguf.py qwen3.5-4b --outtype f16 --outfile qwen3.5-4b-f16.gguf

# Run with LoRA adapter
llama-server -m qwen3.5-4b-f16.gguf --lora hew-lora-v11.gguf -ngl 99 -c 4096
```

Then query the OpenAI-compatible API:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "system", "content": "You are an expert Hew programmer. Write complete, correct Hew source code. Output ONLY the code."},
      {"role": "user", "content": "Write a Hew actor that implements a counter with increment and get operations."}
    ],
    "temperature": 0.7
  }'
```
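The same request can be made from Python with only the standard library. This is a sketch mirroring the curl call above; the helper names (`build_payload`, `query`) and the default port are assumptions, not part of this adapter's API.

```python
import json
import urllib.request

# llama-server exposes an OpenAI-compatible /v1/chat/completions
# endpoint; port 8080 is its default.
API_URL = "http://localhost:8080/v1/chat/completions"

SYSTEM_PROMPT = (
    "You are an expert Hew programmer. Write complete, correct Hew "
    "source code. Output ONLY the code."
)

def build_payload(user_prompt: str, temperature: float = 0.7) -> dict:
    """Build the chat-completion request body shown in the curl example."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": temperature,
    }

def query(user_prompt: str) -> str:
    """POST the payload and return the first choice's text (requires a running server)."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(user_prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```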

Training Details

  • Base model: Qwen/Qwen3.5-4B (4.2B parameters)
  • Method: LoRA (rank 16, alpha 32, dropout 0.05)
  • Targets: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Trainable parameters: ~21M / 4.2B (0.5%)
  • Training data: 1,926 compiler-validated Hew code samples in ChatML format
  • Epochs: 2
  • Precision: bfloat16 (no quantization during training)
  • Hardware: AMD Radeon 780M iGPU (16 GB shared memory)
  • Training time: ~3.5 hours per run
  • Final loss: 0.163
  • Token accuracy: 98.1%
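The hyperparameters above can be collected into a PEFT-style configuration. This is an illustrative sketch as a plain dict; the field names follow Hugging Face PEFT's `LoraConfig`, which is an assumption about the actual (unpublished) training script.

```python
# Illustrative LoRA configuration matching the card's hyperparameters.
lora_config = {
    "r": 16,                  # LoRA rank (v11; v12 used r=32)
    "lora_alpha": 32,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
    "task_type": "CAUSAL_LM",
}

# ~21M trainable parameters out of 4.2B total, i.e. about 0.5% of weights.
trainable_fraction = 21e6 / 4.2e9
```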

What is Hew?

Hew is a compiled, actor-based programming language for building resilient services. It features:

  • Actor isolation and message passing with move semantics
  • Supervision trees for fault tolerance
  • Wire types for serialization contracts
  • Generators (gen fn) with yield
  • State machines as first-class constructs
  • Pattern matching
  • Structured concurrency (select/join)
  • Native compilation via LLVM

Training Approach

Every code sample in the training data is validated by the Hew compiler (hew check) before inclusion. The training loop is iterative: train, eval against 39 prompts, categorize compiler errors, write targeted correction examples, repeat.
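The compiler-in-the-loop filter can be sketched as follows. `hew check` is the command named above; the helper functions and the sample dict shape are assumptions, and the checker is injectable so the filter works without the Hew toolchain installed.

```python
import subprocess
import tempfile
from pathlib import Path

def compiles(source: str) -> bool:
    """Return True if `hew check` accepts the source (requires the Hew toolchain)."""
    with tempfile.NamedTemporaryFile("w", suffix=".hew", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        result = subprocess.run(["hew", "check", path], capture_output=True)
        return result.returncode == 0
    finally:
        Path(path).unlink()

def filter_samples(samples, checker=compiles):
    """Keep only samples whose completion passes the compiler check."""
    return [s for s in samples if checker(s["completion"])]
```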

Key findings:

  • 2 epochs beat more data at this corpus size (~1,900 samples)
  • Correction examples plateau quickly: the first round of targeted fixes gave +16 points, subsequent rounds gave ~0
  • The system prompt is training data: encoding language rules explicitly in the ChatML system prompt helps the model learn them
  • Pretraining priors are the main obstacle: the model writes Rust-like code that looks right but uses wrong APIs, types, and syntax
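Since the system prompt itself carries language rules, each training sample pairs it with a compiler-validated completion in ChatML. A minimal formatter as a sketch (the `<|im_start|>`/`<|im_end|>` tags are the standard ChatML markers used by Qwen models; everything beyond what the card states is an assumption):

```python
# The same system prompt used at inference time doubles as training signal.
SYSTEM_PROMPT = (
    "You are an expert Hew programmer. Write complete, correct Hew "
    "source code. Output ONLY the code."
)

def to_chatml(user_prompt: str, completion: str) -> str:
    """Render one training sample in ChatML format."""
    return (
        f"<|im_start|>system\n{SYSTEM_PROMPT}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n{completion}<|im_end|>\n"
    )
```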

For the full story, see Teaching an LLM a Language It Has Never Seen.

Limitations

  • 20% of eval prompts still fail to compile (type errors, move semantics, prose leaking)
  • The model sometimes outputs explanatory text instead of raw code
  • Complex features (supervisors, concurrency patterns) have lower pass rates than simple algorithms
  • Only tested with Qwen3.5-4B as the base model