Harley-ml/Tenete-8M

Harley-ml committed 21 days ago

Commit dac471b (verified) · 1 parent: f5246b9

Update README.md


README.md +1 -0


@@ -571,6 +571,7 @@ We decided to evaluate the model on each source it trained on to see the differe
 The model achieves random or near-random on most tasks, which is expected. An 8M parameter model cannot store world-level knowledge or thoroughly reason.
+Note: The full breakdown (LM Harness Output) is right [here](https://huggingface.co/Harley-ml/Tenete-8M/blob/main/raw_lmharness_eval_output.txt)
 ### Coherency Benchmark
 To evaluate the **coherency, factuality, and fluency** of our (and other) models, we use **Qwen3-32B** to grade **300 different generations** generated from an **unconditional prompt**.