Main revision

by aldakata - opened Oct 7, 2025

Oct 7, 2025

Hi!
I was experimenting with the revisions, and I get different results from the main revision and the stage2-ingredient3-step23852-tokens51B revision.
According to https://github.com/allenai/OLMo, shouldn't these be exactly the same model?

From the README: "For the 1B model, we have trained three times with different data order on 50B high quality tokens, used last checkpoint of seed 42 as final checkpoint."
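One way to check whether two revisions really hold the same weights is to fingerprint every tensor and compare the digests. The following is only a minimal sketch: the helper names `fingerprint_revision` and `differing_tensors` are hypothetical, the repo id is left as a parameter because it is not stated in this thread, and it assumes `torch` and `transformers` are installed and that the checkpoint loads through `AutoModelForCausalLM`.

```python
import hashlib
from typing import Dict, List


def fingerprint_revision(repo_id: str, revision: str) -> Dict[str, str]:
    """Load one revision and return a SHA-256 digest per parameter tensor.

    Hypothetical helper: `repo_id` must be supplied by the caller.
    transformers is imported lazily so the comparison helper below
    can be used on its own.
    """
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(repo_id, revision=revision)
    digests: Dict[str, str] = {}
    for name, tensor in model.state_dict().items():
        # Cast to float32 so the bytes are comparable regardless of load dtype.
        raw = tensor.detach().cpu().float().contiguous().numpy().tobytes()
        digests[name] = hashlib.sha256(raw).hexdigest()
    return digests


def differing_tensors(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Return sorted tensor names whose digests differ or that exist on one side only."""
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

Calling `differing_tensors(fingerprint_revision(repo_id, "main"), fingerprint_revision(repo_id, "stage2-ingredient3-step23852-tokens51B"))` returns an empty list only if every tensor matches exactly. If the list is empty but the outputs still differ, the difference would have to come from somewhere other than the weights, for example the config, the tokenizer, or sampling settings.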

yukiwuki

29 days ago


Hey @aldakata, I am facing the same issue now. Did you manage to figure out what's wrong?

aldakata

29 days ago

I couldn't figure it out, sorry.
