Small error in the model overview section of the README

#13
by Luca3700 - opened

Hi,
I would like to report a small error in the section mentioned above.

It is reported "Hidden Layout: 16 × (3 × (Gated DeltaNet → MoE) → 1 × (Gated Attention → MoE))",
but I think it should be "12x" instead of "16x", since the total number of layers is 48 (as also stated in the config.json file).
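
The arithmetic behind this can be checked with a quick sketch. It assumes that each "(... → MoE)" pair in the layout counts as one layer, and it takes the total of 48 from the figure quoted in this post. The names `repeat_count`, `DELTANET_PER_BLOCK`, `ATTENTION_PER_BLOCK` and `TOTAL_LAYERS` are made up for illustration and do not come from the model's code or config.

```python
# Layers in one repeating block of the README's hidden layout:
# 3 x (Gated DeltaNet -> MoE) followed by 1 x (Gated Attention -> MoE).
# Assumption: each "(... -> MoE)" pair counts as a single layer.
DELTANET_PER_BLOCK = 3
ATTENTION_PER_BLOCK = 1
TOTAL_LAYERS = 48  # total layer count quoted in this post (from config.json)


def repeat_count(total_layers: int, layers_per_block: int) -> int:
    """Return how many times the block repeats; fail if it does not divide evenly."""
    if total_layers % layers_per_block != 0:
        raise ValueError("total layer count is not a multiple of the block size")
    return total_layers // layers_per_block


layers_per_block = DELTANET_PER_BLOCK + ATTENTION_PER_BLOCK
print(repeat_count(TOTAL_LAYERS, layers_per_block))  # 12
print(16 * layers_per_block)  # 64, the layer count a 16x repeat would imply
```

Under that assumption, a 16x repeat would give 64 layers, while 12x matches the stated 48.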

Thank you for your awesome work

thank you too

jklj077 changed discussion status to closed
