crazy success
how did you all gather 1.3k downloads already? what was the record for downloads on a model?
Yes, so far this model has been a great success! I believe the record for most downloads within 24 hours was Qwen3-30B-A3B-Thinking-Gemini-2.5-Flash-Distill, which got 11k overnight but not many likes. That one didn't turn out as well as I'd hoped, but this model was great: it learned the output style and behavior of Opus very quickly, so I didn't really have to destroy the model with SFT. If I remember correctly, the model only needed about 400 to 600 steps to converge.
I'm writing "Hello" and it gives me a lecture about lilacs and why I should switch to an American Express card. Any idea why?
Probably a mix of non-ideal settings (make sure it uses the recommended top_p and temperature), the high-reasoning distill (meaning it is prone to complex tasks), and the base model being built for agentic tasks.
personally i believe it is overfitting on the high reasoning which makes it expect only complex tasks and complex answers
Yeah, this is the most likely cause. Even though it was done with a low batch size and only ~3 epochs, the overfitting is apparent when you send it simpler requests. Most of our models react like this to simple requests like "hi". I can counter this by including more simple Q&A pairs with low reasoning effort in the dataset. In the future I will diversify the datasets more.
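The mitigation described above can be sketched roughly like this, assuming a simple list-of-dicts dataset format (all field names and the `simple_ratio` value are hypothetical, not the actual TeichAI pipeline):

```python
import random

# Hypothetical sketch: dilute a high-reasoning distill set with simple,
# low-reasoning-effort Q&A pairs so the model stops expecting only complex tasks.
complex_pairs = [
    {"prompt": "Prove that sqrt(2) is irrational.", "reasoning_effort": "high"},
]
simple_pairs = [
    {"prompt": "Hello!", "response": "Hi! How can I help?", "reasoning_effort": "low"},
    {"prompt": "What's the capital of France?", "response": "Paris.", "reasoning_effort": "low"},
]

def mix_datasets(complex_pairs, simple_pairs, simple_ratio=0.2, seed=42):
    """Return a shuffled mix where roughly `simple_ratio` of examples are simple Q&A."""
    rng = random.Random(seed)
    n_simple = max(1, int(len(complex_pairs) * simple_ratio / (1 - simple_ratio)))
    sampled = [rng.choice(simple_pairs) for _ in range(n_simple)]
    mixed = complex_pairs + sampled
    rng.shuffle(mixed)
    return mixed
```

The exact ratio to use is a judgment call; the idea is just that the low-effort examples teach the model that short questions deserve short answers.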
This model keeps rambling about absolute crap, like verbal diarrhea, just random stuff. What are the best settings to use to get this working properly?
https://unsloth.ai/docs/models/glm-4.7-flash#usage-guide
It's not uncommon to get a bad generation here or there, but these sampling params have generally led me to great responses
I personally set top_k to 50, but haven't played around much to see what exactly works best
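To make the settings concrete, here is a minimal sketch of passing those sampling params to an OpenAI-compatible local server (llama.cpp / vLLM style). The temperature and top_p values below are placeholders, not the official recommendations; take the real ones from the Unsloth usage guide linked above. The model id is also hypothetical.

```python
# Hypothetical sketch: sampling settings for a chat-completions request.
# temperature / top_p are PLACEHOLDERS -- use the values from the Unsloth guide.
sampling_params = {
    "temperature": 0.7,  # placeholder; check the model card
    "top_p": 0.95,       # placeholder; check the model card
    "top_k": 50,         # the value mentioned above
}

def build_request(prompt, params=sampling_params):
    """Assemble a chat-completions payload carrying the sampling params."""
    return {
        "model": "GLM-4.7-Flash",  # hypothetical model id on your server
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }
```

Whether `top_k` is accepted at the top level or needs an `extra_body` passthrough depends on the server you run; check its docs.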
Success is likely a result of the bold name, which suggests it is some kind of distillation of Claude, which it certainly is not.
The truth is that in order to do proper distillation you need to have the logits and do seriously long training, which is an incredibly expensive endeavor.
What this model is, is just a very slight finetune on raw text extracted from Claude.
You're right that it will never have the raw capacity of Claude; it was just fine-tuned to decompose and solve problems like Claude, which gets you a huge boost in performance. This is why Chain of Thought was so transformative for distillation. The point is not to make an open-source Claude; the point is to make something that is already good behave more like Claude, and through Chain of Thought, behaving like Claude makes it perform more like Claude.
I bet its something about the Unsloth notebook, which one did you use?
If it is overfitting, just mix in your convo dataset. That's what I use.
Exactly, it's a reasoning distillation where we force the model into Claude-like reasoning, not a knowledge distillation or a full-scale distillation by any means.
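For readers unfamiliar with the distinction: in reasoning distillation the student is trained on the teacher's text (chain of thought plus final answer), not on its logits. A single training example might look roughly like this; the chat schema and `<think>` tags are assumptions for illustration, not the exact TeichAI dataset format:

```python
import json

# Hypothetical sketch of one SFT example for reasoning distillation:
# the assistant turn contains a Claude-style reasoning trace followed by
# the final answer, serialized as one JSONL line.
example = {
    "messages": [
        {"role": "user", "content": "Is 91 prime?"},
        {
            "role": "assistant",
            "content": (
                "<think>Check divisibility by small primes: 91 / 7 = 13, "
                "so 91 = 7 * 13.</think>"
                "No, 91 = 7 x 13, so it is not prime."
            ),
        },
    ]
}
line = json.dumps(example)  # one JSONL line per training example
```

Knowledge distillation, by contrast, would match the student's full output distribution against the teacher's logits at every token, which is what makes it so much more expensive.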
Exactly, we are not making the next Phi.
Btw, check out my DataGen companion prompt generator, PromptGen, purpose-built for TeichAI: https://github.com/bobthe144th/PromptGen. It takes the topic domains listed on the Requests page and asks you to select the percentages.
I created this on Saturday, so it is still actively maintained. It is meant to run in the Unsloth Docker container, and I will try to fix any bugs.
Will give it a look, thanks!
Pretty sure they already have something similar: https://github.com/TeichAI/datagen
His tool seems to be a helper for generating the prompts.txt file
Exactly. It asks how many prompts you want to generate, asks for the percentage that each prompt topic from your TeichAI/Requests will get, generates the prompts with Qwen 3 8B (you might want to swap in a bigger model for more advanced prompts), and formats them into a .md file with one prompt per line, as required by your DataGen tool. On my 5070 Ti it took about 8 minutes to generate 500 prompts, and I am actively working on optimizations (I just implemented advanced prompt caching).
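The percentage-to-count allocation described above can be sketched like this (the topic names and split are hypothetical, not from the actual Requests page):

```python
def allocate_prompts(total, percentages):
    """Split `total` prompts across topics by percentage, handing any
    integer-rounding remainder to the largest topic so counts sum to `total`."""
    counts = {t: int(total * p / 100) for t, p in percentages.items()}
    remainder = total - sum(counts.values())
    largest = max(percentages, key=percentages.get)
    counts[largest] += remainder
    return counts

# Hypothetical topic split mirroring a Requests-page selection
topics = {"coding": 50, "math": 30, "creative writing": 20}
```

Handing the remainder to one topic is just the simplest tie-break; a real tool might distribute it round-robin instead.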
So we got a new unsloth/GLM-4.7-Flash-GGUF model uploaded 5 days ago. Are we gonna get updates to this model too?
Yep, v2 coming soon :)
Edit: I don't see any major updates; perhaps it's just the GGUF patches he made, not a model tune.
Yeah, I bet it was a llama.cpp bug affecting top_k or top_p.