EnricoFermi committed on
Commit f5f576d · verified · 1 Parent(s): 44c62d8

Add GGUF Q4_K_M benchmark results (HumanEval 53.0%, HumanEval+ 47.0%)

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -62,6 +62,7 @@ The architecture co-evolves with training: heads that contribute to the domain s
62
  | Qwen2.5-Coder-3B | 3B | ~31% | — |
63
  | Phi-2 | 2.7B | 47.6% | — |
64
  | **qwen3.5-4b-code-forged** | **3.4B** | **57.3%** | **49.4%** |
 
65
 
66
  **+20% above Phi-2, +82% above StarCoder2-3B** in the sub-5B class.
67
 
@@ -69,6 +70,8 @@ The architecture co-evolves with training: heads that contribute to the domain s
69
  - **HumanEval+**: 49.4% pass@1 (81/164 base + extra tests)
70
  - **Method**: Greedy decoding (temperature 0), single sample, EvalPlus framework
71
  - **Hardware**: Evaluated as fp16 HuggingFace transformers on RTX 5090
 
 
72
 
73
  ## Runs On
74
 
@@ -76,6 +79,8 @@ The architecture co-evolves with training: heads that contribute to the domain s
76
  |--------|--------|----------|
77
  | MacBook Pro 16GB | fp16 | Yes |
78
  | MacBook Pro 32GB | fp16 | Yes |
 
 
79
 
80
  These models are designed for **consumer hardware**. No A100s required. Your MacBook, your gaming PC, your home server.
81
 
 
62
  | Qwen2.5-Coder-3B | 3B | ~31% | — |
63
  | Phi-2 | 2.7B | 47.6% | — |
64
  | **qwen3.5-4b-code-forged** | **3.4B** | **57.3%** | **49.4%** |
65
+ | **qwen3.5-4b-code-forged Q4_K_M** | **3.4B (2.6GB GGUF)** | **53.0%** | **47.0%** |
66
 
67
  **+20% above Phi-2, +82% above StarCoder2-3B** in the sub-5B class.
68
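
The "+20% above Phi-2" figure is a relative gain, not a difference in percentage points. A minimal sketch of that arithmetic, using only the two HumanEval scores visible in the table rows above (the StarCoder2-3B baseline is not shown in this hunk, so it is left out; `relative_gain` is an illustrative helper name, not something from the repository):

```python
def relative_gain(score: float, baseline: float) -> float:
    """Relative improvement of `score` over `baseline`, in percent."""
    return (score - baseline) / baseline * 100.0


# HumanEval pass@1 values copied from the table rows above.
forged_fp16 = 57.3
phi2 = 47.6

gain_over_phi2 = relative_gain(forged_fp16, phi2)
points_over_phi2 = forged_fp16 - phi2

# Prints: vs Phi-2: +9.7 points, +20.4% relative
print(f"vs Phi-2: +{points_over_phi2:.1f} points, +{gain_over_phi2:.1f}% relative")
```

In absolute terms the gap is 9.7 points; the README's "+20%" is that gap divided by the Phi-2 score.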
 
 
70
  - **HumanEval+**: 49.4% pass@1 (81/164 base + extra tests)
71
  - **Method**: Greedy decoding (temperature 0), single sample, EvalPlus framework
72
  - **Hardware**: Evaluated as fp16 HuggingFace transformers on RTX 5090
73
+ - **GGUF Q4_K_M**: 53.0% HumanEval / 47.0% HumanEval+, a drop of 4.3 and 2.4 points from fp16 (7.5% and 4.9% relative)
74
+ - **GGUF evaluated via**: llama-cpp-python on RTX 5090
75
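
The quantization cost quoted in the bullets above can be checked directly from the fp16 and Q4_K_M scores in the table. The sketch below is only that arithmetic; the helper name `drop` is illustrative and not part of the repository.

```python
from typing import Tuple


def drop(fp16: float, quantized: float) -> Tuple[float, float]:
    """Return (absolute drop in points, relative drop in percent)."""
    absolute = fp16 - quantized
    return absolute, absolute / fp16 * 100.0


# (fp16 score, Q4_K_M score) pairs taken from the benchmark table.
scores = {"HumanEval": (57.3, 53.0), "HumanEval+": (49.4, 47.0)}

for name, (fp16, q4) in scores.items():
    pts, rel = drop(fp16, q4)
    print(f"{name}: -{pts:.1f} points, {rel:.1f}% relative")
# Prints:
# HumanEval: -4.3 points, 7.5% relative
# HumanEval+: -2.4 points, 4.9% relative
```

The 4.3-point and 7.5% figures apply to HumanEval; the HumanEval+ drop is smaller on both measures.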
 
76
  ## Runs On
77
 
 
79
  |--------|--------|----------|
80
  | MacBook Pro 16GB | fp16 | Yes |
81
  | MacBook Pro 32GB | fp16 | Yes |
82
+ | RTX 5090 | GGUF Q4_K_M | Yes (HumanEval 53.0%) |
83
+ | MacBook Pro M1 | GGUF Q4_K_M | Yes (llama.cpp Metal) |
84
 
85
  These models are designed for **consumer hardware**. No A100s required. Your MacBook, your gaming PC, your home server.
86
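
For readers who want to try the Q4_K_M file themselves, here is a minimal sketch of greedy, single-sample generation with llama-cpp-python, matching the evaluation settings the README describes. It is not the author's evaluation harness. The file name, context size, and token budget are assumptions, and `load_model` and `complete` are hypothetical helper names.

```python
from typing import Any, Dict

# Assumed settings mirroring the README's "greedy decoding, single sample".
GREEDY_KWARGS: Dict[str, Any] = {
    "temperature": 0.0,  # greedy decoding
    "max_tokens": 512,   # hypothetical budget, not stated in the README
}


def load_model(gguf_path: str):
    """Load a GGUF file with llama-cpp-python, offloading all layers to the GPU."""
    from llama_cpp import Llama  # imported lazily so the helpers below work without it

    # n_ctx=4096 is an assumed context size; n_gpu_layers=-1 offloads every layer.
    return Llama(model_path=gguf_path, n_gpu_layers=-1, n_ctx=4096, verbose=False)


def complete(llm: Any, prompt: str) -> str:
    """Return one greedy completion for a HumanEval-style prompt."""
    out = llm.create_completion(prompt, **GREEDY_KWARGS)
    return out["choices"][0]["text"]


if __name__ == "__main__":
    # Hypothetical local file name; substitute the path of the downloaded GGUF.
    llm = load_model("qwen3.5-4b-code-forged-Q4_K_M.gguf")
    print(complete(llm, "def add(a: int, b: int) -> int:\n"))
```

The same code should run on the RTX 5090 (CUDA build) or on Apple Silicon (Metal build), since the backend is chosen when llama-cpp-python is installed rather than in the script.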