ivnle/llamatales-gre-70b
From the paper 'Readability ≠ Learnability: Rethinking the Role of Simplicity in Training Small Language Models' (COLM 2025)
Note Short stories generated by `nvidia/Llama-3.1-Nemotron-70B-Instruct`.
Note Children's stories generated by `nvidia/Llama-3.1-Nemotron-70B-Instruct`.
Note Short stories generated by `meta-llama/Llama-3.1-8B-Instruct`.
Note Children's stories generated by `meta-llama/Llama-3.1-8B-Instruct`.
Note Source: https://huggingface.co/datasets/roneneldan/TinyStories/blob/main/TinyStories_all_data.tar.gz
Note 1B-token sample of FineWeb-Edu: https://huggingface.co/datasets/HuggingFaceFW/fineweb-edu
Model checkpoints below. Naming format is [training data]-[layers]-[hidden size]-[heads]-[non-embedding parameter count].
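The naming format above can be split back into its five fields. The following is a minimal sketch, not code from the paper: the helper `parse_checkpoint_name`, the `CheckpointName` type, and the example name `some-data-2-128-4-1M` are all hypothetical. It assumes the last four fields contain no hyphens, so it splits from the right and lets the training-data name keep any hyphens of its own. Fields are left as strings because the exact formatting of the counts is not stated here.

```python
from typing import NamedTuple


class CheckpointName(NamedTuple):
    """Fields of a checkpoint name, in the order given by the naming format."""
    training_data: str
    layers: str
    hidden_size: str
    heads: str
    non_embedding_params: str


def parse_checkpoint_name(name: str) -> CheckpointName:
    """Split '[training data]-[layers]-[hidden size]-[heads]-[params]'.

    Assumption: only the training-data field may contain hyphens, so the
    name is split from the right into at most five parts.
    """
    parts = name.rsplit("-", 4)
    if len(parts) != 5:
        raise ValueError(f"expected 5 hyphen-separated fields, got: {name!r}")
    return CheckpointName(*parts)


# Hypothetical checkpoint name, used only to illustrate the split.
example = parse_checkpoint_name("some-data-2-128-4-1M")
print(example.training_data, example.layers, example.non_embedding_params)
```

Splitting from the right is a design choice for dataset names such as `llamatales-gre-70b`, which contain hyphens themselves; if a real checkpoint name uses hyphens inside the later fields, this sketch would need adjusting.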