Main revision

#5
by aldakata - opened

Hi!
I was playing around with the revisions, and I get different results from the main revision and the stage2-ingredient3-step23852-tokens51B revision.
Shouldn't these be the exact same model according to https://github.com/allenai/OLMo?

For the 1B model, we have trained three times with different data order on 50B high quality tokens, used last checkpoint of seed 42 as final checkpoint.
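
One way to narrow this down is to check whether the two revisions actually ship different weights, or whether the difference comes from somewhere else (sampling settings, dtype, tokenizer files). Below is a minimal sketch, assuming the checkpoints load through `transformers` with `AutoModelForCausalLM`. The helper names `fingerprint_state_dict`, `diff_fingerprints`, and `compare_revisions` are made up for illustration, and the repository id is left as a parameter because the thread does not state it.

```python
import hashlib


def fingerprint_state_dict(state_dict):
    """Map each parameter name to a SHA-256 digest of its raw values.

    Assumes the values are torch tensors. They are upcast to float32
    first (lossless for bf16/fp16) so they can be converted to NumPy.
    """
    digests = {}
    for name, tensor in state_dict.items():
        raw = tensor.detach().float().cpu().contiguous().numpy().tobytes()
        digests[name] = hashlib.sha256(raw).hexdigest()
    return digests


def diff_fingerprints(a, b):
    """Return the sorted parameter names that are missing or differ."""
    return sorted(
        name for name in set(a) | set(b) if a.get(name) != b.get(name)
    )


def compare_revisions(repo_id, rev_a, rev_b):
    """Load two revisions of one repo and list the parameters that differ."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM

    prints = []
    for rev in (rev_a, rev_b):
        model = AutoModelForCausalLM.from_pretrained(repo_id, revision=rev)
        prints.append(fingerprint_state_dict(model.state_dict()))
        del model  # free memory before loading the next revision
    return diff_fingerprints(*prints)
```

Calling `compare_revisions(repo_id, "main", "stage2-ingredient3-step23852-tokens51B")` with the id of this repository returns an empty list if every parameter is bit-identical. A non-empty list would suggest that `main` does not point at that checkpoint, which would be worth reporting to the maintainers.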

Hey @aldakata, I am facing the same issue now. Did you manage to figure out what's wrong?

I couldn't figure it out, sorry
