GLM 4.7
#1
by jpsequeira - opened
Hey.
First thanks for your hard work.
Does this model play well with 4.7?
And have you considered a specific draft for 4.7?
> Hey.
> First thanks for your hard work.
> Does this model play well with 4.7?

I think it should work for 4.7, as the tokeniser seems to be the same.

> And have you considered a specific draft for 4.7?

These drafts aren't really trained on distilled data; they just use a generic data mix (1/3 web-scraped data, 1/3 instruction-response data, 1/3 code data) to try to "heal" the tokeniser transplantation. So there's no real benefit to training a new draft unless the tokeniser changes.
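
If you want to check the "same tokeniser" assumption yourself, a rough sketch along these lines might help. It compares two token-to-ID mappings and reports anything that differs. The function name `vocab_mismatches` and the toy vocabularies are made up for illustration; with real models the two dictionaries could come from something like `AutoTokenizer.from_pretrained(...).get_vocab()` in `transformers`, which is not run here.

```python
def vocab_mismatches(target_vocab, draft_vocab):
    """Return tokens that are missing from one vocabulary or mapped to different IDs.

    Both arguments are plain dicts of token string -> token ID.
    The result maps each differing token to (target_id, draft_id),
    with None where the token is absent.
    """
    mismatches = {}
    for token in target_vocab.keys() | draft_vocab.keys():
        target_id = target_vocab.get(token)
        draft_id = draft_vocab.get(token)
        if target_id != draft_id:
            mismatches[token] = (target_id, draft_id)
    return mismatches


# Toy vocabularies standing in for real ones (hypothetical data).
old_vocab = {"<s>": 0, "hello": 1, "world": 2}
same_vocab = {"<s>": 0, "hello": 1, "world": 2}
changed_vocab = {"<s>": 0, "hello": 1, "world": 3, "new": 2}

# An empty result suggests the draft can be reused as-is.
print(vocab_mismatches(old_vocab, same_vocab))
# A non-empty result lists the tokens that would need attention.
print(vocab_mismatches(old_vocab, changed_vocab))
```

An empty result only shows that the vocabularies line up; it doesn't say anything about how good the draft's acceptance rate will be on the newer model.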