No thinking in training dataset?

#6
by jtvino - opened

As with https://huggingface.co/RedHatAI/gemma-4-31B-it-speculator.eagle3, I noticed that the training dataset includes no thinking examples.
Is there a reason behind this choice? Have you checked whether the acceptance rate for thinking tokens matches the rate for non-thinking tokens?
