Question concerning your use of Grok 3 in the name

by Drafvan - opened Jul 9, 2025

Jul 9, 2025

I don't see xAI listings for Grok 3 basically anywhere on this site. Grok 1 yes, but that's it. So why add that name to your model?

reedmayhew

Owner Jul 11, 2025

•

edited Jan 19

This is a distillation/fine-tuning of the Grok 3 model. That’s why it says gemma3-12B-distilled at the end. It means Grok 3 outputs were used as the dataset to fine-tune and distill into the gemma3-12B model.

So the Grok 3 model wasn’t just referenced, it was directly used as the source for the fine-tuning data. The result is a distilled version of Grok 3 behavior running on gemma3-12B.

This model-naming practice is standard, similar to distilled versions of DeepSeek R1.
For example: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B. The model is not actually R1, it is Qwen-1.5B fine-tuned off of R1. Same practice here.

reedmayhew changed discussion status to closed Jul 11, 2025

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment