Replacing REAP with REAM

#1
by TomLucidor - opened

Nice to see a small model like Nemotron-3-Nano made even smaller and faster, but I wonder if there are ways to preserve as much quality as possible when pruning: https://bknyaz.github.io/blog/2026/moe/
Bonus ask: Kimi-Linear, Nemotron-H series, GPT-OSS series, GLM-4.5-Air, Ring-V2/Ming-V2/Ling-V2 series, Seed-OSS, StepFun, and probably more deserve to be REAM-ed

I was actually just reading about REAM! It does seem very intriguing, and I might try it with Nemotron. However, keep in mind I definitely won't be able to do all of the models you list; I was relying on Modal's free credits to do just this one REAP. If you could narrow it down, I could definitely try!

Priority: Kimi-Linear (48B AND Linear), GPT-OSS-20B (good reasoner), GLM-4.7-Flash (30B), LongCat-Flash-Lite (see if embedding would cause issues), Nemotron-3-Nano (30B AND Linear), Ring-Mini-Linear-2.0 (16B AND Linear)
Optional: GPT-OSS-120B (OpenAI gift), GLM-4.5-Air (106B and near-SOTA), DeepSeek-V2-Lite (16B and legacy), Ring-Flash-Linear-2.0 (104B AND Linear)
Excessive: GLM-4.X/5, Qwen3-Coder, DeepSeek, Seed, Qwen3-235B, StepFun models, Ring-1T, Kimi-K2, MiniMax-M1, MiniMax-M2
Probably need to also note down Gemma, Granite-4.0-H, and whatever Mistral has on the list as well

Some of those models are pretty big, but I'll see what I can do. No guarantees that I'll be able to do all of the priorities!

Also, remember it can't do dense models, so Llama 3.x and basic Mistral models won't work!
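Since the MoE-only constraint keeps coming up, here's a rough way to check whether a checkpoint is even a candidate before queuing it: look for expert-count fields in its config. This is just a sketch; the key names below (`num_local_experts`, `num_experts`, `n_routed_experts`) are assumptions based on common Hugging Face config conventions (Mixtral-, Qwen-MoE-, and DeepSeek-style respectively), and other architectures may use different names.

```python
# Sketch: decide whether a model config describes an MoE model
# (and is therefore a candidate for expert pruning tools like REAP/REAM).
# The key names are assumptions from common HF config files:
#   "num_local_experts"  - Mixtral-style
#   "num_experts"        - Qwen-MoE-style
#   "n_routed_experts"   - DeepSeek-style
MOE_KEYS = ("num_local_experts", "num_experts", "n_routed_experts")

def is_moe(config: dict) -> bool:
    """Return True if the config lists more than one routed expert."""
    return any(
        isinstance(config.get(key), int) and config[key] > 1
        for key in MOE_KEYS
    )

# A Mixtral-style config qualifies; a plain dense config does not.
print(is_moe({"num_local_experts": 8}))   # MoE -> True
print(is_moe({"hidden_size": 4096}))      # dense (e.g. Llama 3.x) -> False
```

In practice you'd load the real `config.json` from the repo instead of a hand-written dict, but the same key check applies.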

Oh wait Llama 4 (the bad ones) were the ones with MoE, I forgot! Damn that was a long time ago. Mistral definitely has a lot of MoE models tho!
Also Kimi made Moonlight (16B) and I am surprised they made something on the consumer range!

Yes, but the only Mistral MoE models are Mixtral (super old) and Mistral Large 3 (too large). Moonlight might work, but it's also quite old. Also, I just now realized you were the one who wrote the Reddit post that taught me about REAM, what a small community! Anyway, I'm busy and won't be able to work on it anytime soon, so patience is appreciated.
