GLM 4.7

by jpsequeira - opened Jan 25

Discussion

jpsequeira

Jan 25

Hey.

First, thanks for your hard work.
Does this model play well with 4.7?
And have you considered a specific draft for 4.7?

jukofyork

Owner Jan 30 • edited Jan 30

> Hey.
>
> First, thanks for your hard work.
> Does this model play well with 4.7?

I think it should work for 4.7, as the tokeniser seems to be the same.
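If you want to check the "same tokeniser" assumption yourself, a minimal sketch is to compare the two vocabularies as token-to-id mappings. The toy vocabularies and the helper name `vocab_mismatches` below are hypothetical; with Hugging Face `transformers` you could probably get the real mappings from each tokeniser's `get_vocab()`, but that part is not shown here.

```python
def vocab_mismatches(draft_vocab, target_vocab):
    """Return tokens that map to different ids or exist in only one vocab."""
    mismatches = {}
    for token in draft_vocab.keys() | target_vocab.keys():
        draft_id = draft_vocab.get(token)    # None if the token is missing
        target_id = target_vocab.get(token)  # None if the token is missing
        if draft_id != target_id:
            mismatches[token] = (draft_id, target_id)
    return mismatches


# Toy vocabularies standing in for the real ones (hypothetical data).
vocab_a = {"<s>": 0, "hello": 1, "world": 2}
vocab_b = {"<s>": 0, "hello": 1, "world": 2}
vocab_c = {"<s>": 0, "hello": 1, "world": 3, "new": 2}

print(vocab_mismatches(vocab_a, vocab_b))  # empty dict: vocabularies agree
print(vocab_mismatches(vocab_a, vocab_c))  # lists the tokens that disagree
```

An empty result only says the token-to-id mappings agree; it does not check things like chat templates or special-token handling.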

> And have you considered a specific draft for 4.7?

These drafts aren't really trained on distilled data; they just use a generic data mix (1/3 web-scraped data, 1/3 instruction-response data, 1/3 code data) to try to "heal" the tokeniser transplantation. So there's no real benefit to training a new draft unless the tokeniser changes.
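To illustrate what an equal three-way mix means in practice, here is a rough sketch that interleaves samples round-robin from three sources so each contributes one third. This is not the actual training pipeline; the source names, the toy documents, and `interleave_equal` are all made up for the example.

```python
from itertools import cycle


def interleave_equal(sources, n_samples):
    """Take samples round-robin so every source contributes an equal share."""
    names = list(sources)
    # cycle() repeats a source's documents if it runs out before the others.
    iterators = {name: cycle(docs) for name, docs in sources.items()}
    mixed = []
    for i in range(n_samples):
        name = names[i % len(names)]
        mixed.append((name, next(iterators[name])))
    return mixed


# Hypothetical toy data standing in for the three kinds of training data.
SOURCES = {
    "web": ["web doc 1", "web doc 2", "web doc 3"],
    "instruct": ["prompt/response 1", "prompt/response 2"],
    "code": ["def f(): pass", "int main() { return 0; }"],
}

mixed = interleave_equal(SOURCES, 9)
for name, text in mixed:
    print(f"{name}: {text}")
```

Nothing in the mix depends on the target model's outputs, which is why the same draft should carry over as long as the tokeniser is unchanged.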
