This is an unofficial quantized version (Xingyu-Zheng/gemma-4-E2B-it-int4-foem) of google/gemma-4-E2B-it.

🧠 Quantization Framework

GPTQModel

🗺️ Quantization Method

FOEM (AAAI 2026)

FOEM is a quantization method that improves on GPTQ. The resulting model preserves the same inference structure as GPTQ-quantized models, so it remains compatible with existing deployment pipelines while achieving better accuracy.

📚 Calibration Dataset

We randomly sampled 512 examples from nohurry/Opus-4.6-Reasoning-3000x-filtered.
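As a rough sketch of the calibration sampling step (the actual pipeline is not shown in this card; the in-memory `corpus` list and the seed below are stand-ins, and loading the real dataset, e.g. via the `datasets` library, is omitted):

```python
import random

def sample_calibration(examples, n=512, seed=0):
    """Draw n random calibration examples from a list of texts.

    `examples` stands in for the rows of
    nohurry/Opus-4.6-Reasoning-3000x-filtered; the seed makes the
    draw reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(examples, n)

# Toy stand-in corpus; the real dataset is loaded from the Hub.
corpus = [f"example text {i}" for i in range(3000)]
calib = sample_calibration(corpus)
print(len(calib))  # 512
```

`random.Random.sample` draws without replacement, so the 512 calibration examples are distinct.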

📋 Usage Example

This model can be deployed using standard frameworks such as vLLM and SGLang, just like other GPTQModel-quantized models.
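For example, once the model is served behind vLLM's OpenAI-compatible API (e.g. `vllm serve Xingyu-Zheng/gemma-4-E2B-it-int4-foem`; exact flags depend on your vLLM version), a client sends standard chat-completion requests. The sketch below only builds the request payload; the URL and sampling parameters are illustrative:

```python
import json

# Hypothetical local endpoint for a vLLM OpenAI-compatible server.
VLLM_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt, max_tokens=256, temperature=0.7):
    """Construct a /v1/chat/completions payload for the quantized model."""
    return {
        "model": "Xingyu-Zheng/gemma-4-E2B-it-int4-foem",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

payload = build_chat_request("Explain GPTQ quantization in one sentence.")
print(json.dumps(payload, indent=2))
```

Sending the payload (with `urllib`, `requests`, or the `openai` client) requires a running server and is omitted here.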

Format: Safetensors
Model size: 5B params
Tensor types: BF16 · I32
