continuum-ai/qwen3.5-4b-code-forged

EnricoFermi committed on Mar 30

Commit f5f576d (verified)

1 Parent(s): 44c62d8

Add GGUF Q4_K_M benchmark results (HumanEval 53.0%, HumanEval+ 47.0%)

Files changed (1)

README.md CHANGED

@@ -62,6 +62,7 @@ The architecture co-evolves with training: heads that contribute to the domain s
 | Qwen2.5-Coder-3B | 3B | ~31% | — |
 | Phi-2 | 2.7B | 47.6% | — |
 | **qwen3.5-4b-code-forged** | **3.4B** | **57.3%** | **49.4%** |
+| **qwen3.5-4b-code-forged Q4_K_M** | **2.6GB** | **53.0%** | **47.0%** |
 **+20% above Phi-2, +82% above StarCoder2-3B** in the sub-5B class.
@@ -69,6 +70,8 @@ The architecture co-evolves with training: heads that contribute to the domain s
 - **HumanEval+**: 49.4% pass@1 (81/164 base + extra tests)
 - **Method**: Greedy decoding (temperature 0), single sample, EvalPlus framework
 - **Hardware**: Evaluated as fp16 HuggingFace transformers on RTX 5090
+- **GGUF Q4_K_M**: 53.0% / 47.0% — only -4.3 points (7.5% relative drop from fp16)
+- **GGUF evaluated via**: llama-cpp-python on RTX 5090
 ## Runs On
@@ -76,6 +79,8 @@ The architecture co-evolves with training: heads that contribute to the domain s
 |--------|--------|----------|
 | MacBook Pro 16GB | fp16 | Yes |
 | MacBook Pro 32GB | fp16 | Yes |
+| RTX 5090 | GGUF Q4_K_M | Yes (HumanEval 53.0%) |
+| MacBook Pro M1 | GGUF Q4_K_M | Yes (llama.cpp Metal) |
 These models are designed for **consumer hardware**. No A100s required. Your MacBook, your gaming PC, your home server.
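
Reviewer note: the added GGUF Q4_K_M line quotes a single drop ("-4.3 points, 7.5% relative") for a pair of scores, so it may help to check the arithmetic per benchmark. The sketch below is not part of the commit; the function names `point_drop`, `relative_drop`, and `relative_gain` are hypothetical helpers, and the only inputs are the percentages that appear in the diff.

```python
# Scores copied from the README diff (pass@1, in percent).
FP16 = {"HumanEval": 57.3, "HumanEval+": 49.4}
Q4_K_M = {"HumanEval": 53.0, "HumanEval+": 47.0}
PHI_2_HUMANEVAL = 47.6


def point_drop(base: float, quant: float) -> float:
    """Absolute drop in percentage points, rounded to one decimal."""
    return round(base - quant, 1)


def relative_drop(base: float, quant: float) -> float:
    """Drop as a percentage of the base score, rounded to one decimal."""
    return round(100 * (base - quant) / base, 1)


def relative_gain(reference: float, score: float) -> float:
    """Gain over a reference score, as a percentage of the reference."""
    return round(100 * (score - reference) / reference, 1)


if __name__ == "__main__":
    for name in FP16:
        print(
            f"{name}: -{point_drop(FP16[name], Q4_K_M[name])} points, "
            f"{relative_drop(FP16[name], Q4_K_M[name])}% relative drop"
        )
    gain = relative_gain(PHI_2_HUMANEVAL, FP16["HumanEval"])
    print(f"fp16 vs Phi-2 on HumanEval: +{gain}%")
```

With these inputs, the quoted 4.3 points and 7.5% match the HumanEval column only; the HumanEval+ column works out to a 2.4-point, 4.9% relative drop. The "+20% above Phi-2" claim is consistent with the fp16 HumanEval score (about +20.4%). The StarCoder2-3B comparison cannot be checked from this diff because its row is outside the shown hunks.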