A collection of dynawords targeting various languages.
[Unreleased] Datasets related to AI-Arenaen
- danish-foundation-models/ai-arenaen-conversations
  Viewer • Updated • 130 • 2
- danish-foundation-models/ai-arenaen-reactions
  Viewer • Updated • 32 • 2
- danish-foundation-models/ai-arenaen-conversations-raw
  Updated • 2
- danish-foundation-models/ai-arenaen-votes
  Viewer • Updated • 78 • 2
Papers related to Danish Foundation Models
Benchmarks for evaluating Danish Models.
- EuroEval Leaderboard
  📊 7 • The robust European language model benchmark.
- ScandEval: A Benchmark for Scandinavian Natural Language Processing
  Paper • 2304.00906 • Published • 4
- MTEB Leaderboard
  🥇 7.25k • Embedding Leaderboard
- The Scandinavian Embedding Benchmarks: Comprehensive Assessment of Multilingual and Monolingual Text Embedding
  Paper • 2406.02396 • Published
A collection of EuroEval-compatible datasets, which can be run using: `euroeval --dataset {dataset name} --model {model name}`
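The command template above can be filled in programmatically. The following is a minimal sketch, assuming only the two flags shown (`--dataset` and `--model`); the helper name `build_euroeval_command` and the dataset name `my-danish-dataset` are hypothetical placeholders, not names from the collection. It only builds and prints the command, so it does not require `euroeval` to be installed.

```python
import shlex


def build_euroeval_command(dataset: str, model: str) -> list[str]:
    """Build the argv list for `euroeval --dataset {dataset} --model {model}`.

    Hypothetical helper for illustration; it only fills in the template
    quoted above and does not call EuroEval itself.
    """
    if not dataset or not model:
        raise ValueError("dataset and model must both be non-empty")
    return ["euroeval", "--dataset", dataset, "--model", model]


if __name__ == "__main__":
    # "my-danish-dataset" is a placeholder; substitute a dataset name
    # from the collection. The model ID is one listed on this page.
    cmd = build_euroeval_command("my-danish-dataset", "google/gemma-2-9b-it")
    # Print a shell-safe version of the command for copy-pasting.
    print(shlex.join(cmd))
```

To actually run the evaluation, the list could be passed to `subprocess.run(cmd, check=True)`, assuming the `euroeval` CLI is installed and on the `PATH`.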
This is a collection of artifacts released as part of the paper "Dynaword: From One-shot to Continuously Developed Datasets".
- Dynaword: From One-shot to Continuously Developed Datasets
  Paper • 2508.02271 • Published • 15
- danish-foundation-models/danish-dynaword
  Viewer • Updated • 11.3M • 5.57k • 18
- danish-foundation-models/gemma-3-1b-cpt-dynaword-matched-v1
  Text Generation • 1.0B • Updated • 2
- danish-foundation-models/gemma-3-1b-scratch-dynaword-full-v1
  Text Generation • 1.0B • Updated • 7
These include high-quality Danish text datasets for pre-training, fine-tuning, etc.
These are state-of-the-art models for Danish within their respective domains (highlighted below each model).
- mistralai/Mistral-Small-3.1-24B-Instruct-2503
  Updated • 533k • 1.36k
- google/gemma-3-27b-it
  Image-Text-to-Text • 27B • Updated • 728k • 1.95k
- google/gemma-3n-E4B-it
  Image-Text-to-Text • Updated • 39.9k • 900
- google/gemma-2-9b-it
  Text Generation • 9B • Updated • 328k • 789