Surprisingly not overcooked

#1
by toyelit736 - opened

Whenever a model has been extensively post-trained to increase intelligence, a concern of mine is whether all that training will induce catastrophic forgetting. I tested this model against the original Llama 3.1 405B Instruct with a quiz of obscure trivia ("tail knowledge") that I put together myself. For the most part, the two models got the same questions right and the same questions wrong, but there were a few questions where this model produced the right answer and the original Instruct model did not.
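For anyone who wants to run a similar side-by-side check, a per-question comparison like the one described above can be tallied with a short script. This is only a minimal sketch: the function name `compare_quiz_results`, the question IDs, and the grades are made-up placeholders for illustration, not the actual quiz or its results, and it assumes each answer has already been graded as right or wrong by hand.

```python
def compare_quiz_results(base, tuned):
    """Tally per-question agreement between two models' graded answers.

    `base` and `tuned` map a question ID to True (right) or False (wrong).
    Only questions graded for both models are compared.
    """
    tally = {"both_right": [], "both_wrong": [], "only_base": [], "only_tuned": []}
    for question in sorted(base.keys() & tuned.keys()):
        if base[question] and tuned[question]:
            tally["both_right"].append(question)
        elif not base[question] and not tuned[question]:
            tally["both_wrong"].append(question)
        elif base[question]:
            tally["only_base"].append(question)
        else:
            tally["only_tuned"].append(question)
    return tally


# Placeholder grades (hypothetical, NOT real results from either model).
base_grades = {"q1": True, "q2": False, "q3": False, "q4": True}
tuned_grades = {"q1": True, "q2": False, "q3": True, "q4": True}

result = compare_quiz_results(base_grades, tuned_grades)
for bucket, questions in result.items():
    print(f"{bucket}: {len(questions)} {questions}")
```

A non-empty `only_base` bucket would be the sign of forgetting to look for; questions in `only_tuned` are ones the post-trained model picked up that the original missed.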
