FOEM Quantization
Collection • 20 items
FOEM is a quantization method that improves on GPTQ. The resulting model preserves the same inference structure as GPTQ, so it remains compatible with existing deployment pipelines while achieving better accuracy.
For calibration, we randomly sampled 512 examples from nohurry/Opus-4.6-Reasoning-3000x-filtered.
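The sampling step can be sketched as follows. This is a minimal illustration using a synthetic stand-in corpus and the standard library; the actual dataset would be loaded from nohurry/Opus-4.6-Reasoning-3000x-filtered, and the seed is an assumption for reproducibility.

```python
import random

# Hypothetical stand-in for the calibration corpus; in practice these
# would be text examples loaded from the dataset above.
corpus = [f"example text {i}" for i in range(3000)]

random.seed(0)  # assumed fixed seed for reproducibility

# Draw 512 distinct examples uniformly at random (without replacement).
calibration_set = random.sample(corpus, 512)

print(len(calibration_set))  # 512
```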
This model can be deployed with standard serving frameworks such as vLLM and SGLang, just like other GPTQModel-quantized models.
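As a deployment sketch, a GPTQ-format model can be served with vLLM's OpenAI-compatible server. The repository id below is a placeholder; substitute the actual quantized model id from this collection.

```shell
# Serve the quantized model with vLLM's OpenAI-compatible server
# (placeholder repo id -- replace with the actual model id).
vllm serve <org>/<foem-quantized-model> --port 8000

# Once the server is up, query it via the completions endpoint:
curl http://localhost:8000/v1/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "<org>/<foem-quantized-model>", "prompt": "Hello", "max_tokens": 16}'
```

Because the quantized checkpoint keeps the GPTQ inference structure, no FOEM-specific loader is needed; the framework treats it like any other GPTQ model.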
Base model
google/gemma-4-E4B-it