Will there be 3-4B, 7-9B, or even 13-20B variants? Maybe GDN/Mamba2/SWA/Lightning or MLA/MLRA or DeepSeek-like Sparse?

#4
by TomLucidor - opened

If HRM is so magical, I kinda want a demo site that I can try in the browser (like the BitNet debut). Also, a man can dream about the agentic powers of smaller models getting crazier.
