No thinking in training dataset?

by jtvino - opened 9 days ago

Similar to https://huggingface.co/RedHatAI/gemma-4-31B-it-speculator.eagle3, I noticed that the training dataset includes no thinking examples.
Is there a reason behind this choice? Have you noticed whether the acceptance rate for thinking tokens is equivalent to that for non-thinking tokens?
