No thinking in training dataset?
#6
by jtvino - opened
Similar to https://huggingface.co/RedHatAI/gemma-4-31B-it-speculator.eagle3, I noticed that the training dataset includes no thinking examples.
Is there a reason behind this choice? Have you checked whether the acceptance rate for thinking tokens is equivalent to that for non-thinking tokens?
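
To make the question concrete, here is a rough sketch of the comparison I have in mind. It assumes you can log, for each speculative decoding step, how many draft tokens were proposed and accepted, and whether the step fell inside a thinking span. The `DraftStep` record and its `in_thinking` flag are hypothetical names for illustration, not part of any library's API.

```python
from dataclasses import dataclass


@dataclass
class DraftStep:
    """One speculative decoding step (hypothetical log record)."""
    proposed: int      # number of draft tokens proposed in this step
    accepted: int      # number of those tokens accepted by the verifier
    in_thinking: bool  # assumed flag: step occurred inside a thinking span


def acceptance_rates(steps):
    """Return accepted/proposed ratios for thinking and non-thinking steps."""
    # totals[flag] = [accepted_sum, proposed_sum]
    totals = {True: [0, 0], False: [0, 0]}
    for step in steps:
        totals[step.in_thinking][0] += step.accepted
        totals[step.in_thinking][1] += step.proposed
    return {
        ("thinking" if flag else "non_thinking"): (acc / prop if prop else None)
        for flag, (acc, prop) in totals.items()
    }


# Made-up numbers, only to show the shape of the comparison.
example_steps = [
    DraftStep(proposed=4, accepted=2, in_thinking=True),
    DraftStep(proposed=4, accepted=1, in_thinking=True),
    DraftStep(proposed=4, accepted=4, in_thinking=False),
]
rates = acceptance_rates(example_steps)
```

A noticeably lower rate in the thinking bucket would suggest that the draft model is hurt by not having seen thinking examples during training.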