Update README.md
@@ -24,6 +24,14 @@ Our models are trained on our German *Dense-Core* subset of FineWeb-2, utilizing
| [**Boldt-1B**](https://huggingface.co/Boldt/Boldt-1B) | 1B | 4096 | Extended context and vocabulary, augmented with 6B tokens of high-quality German news data. |
| [**Boldt-1B-IT-Preview**](https://huggingface.co/Boldt/Boldt-1B-IT-Preview) | 1B | 4096 | Instruction-tuned preview model for chat and zero-shot tasks. |
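The table above lists checkpoints hosted on the Hugging Face Hub. As a hypothetical quickstart (the model ID comes from the table; the assumption that the checkpoint loads as a standard `transformers` causal LM with a chat template is not stated in this README), the instruction-tuned preview could be used like this:

```python
# Hedged sketch: assumes Boldt-1B-IT-Preview is a standard
# transformers-compatible causal LM whose tokenizer ships a chat template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Boldt/Boldt-1B-IT-Preview"  # instruction-tuned preview from the table above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Chat-style prompt; the exact template is defined by the tokenizer on the Hub.
messages = [{"role": "user", "content": "Was ist die Hauptstadt von Deutschland?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

# Generate a short completion and strip the prompt tokens from the output.
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

For the base model, `AutoModelForCausalLM.from_pretrained("Boldt/Boldt-1B")` with a plain text prompt would be the analogous path.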
## 📊 Comparison
Boldt-1B compares favorably on German LLM benchmarks against other similarly-sized models:

It is even competitive with many larger (2B parameter) models. See our paper for the full evaluation.
## 📚 Research & Artifacts
* **Paper:** [Repetition over Diversity: High-Signal Data Filtering for Sample-Efficient German Language Modeling (arXiv 2026)](https://arxiv.org/abs/2604.28075)
* **Evaluation Suite:** [Modernized German Benchmarks](https://huggingface.co/collections/Boldt/german-llm-benchmarks)