Zhongzhu/OSCAR-RotationZoo

@@ -13,8 +13,9 @@ tags:
 Precomputed K/V rotation matrices for **OSCAR INT2 KV-cache quantization**.
-📄 Paper: *OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization*
-💻 Code: https://github.com/FutureMLS-Lab/OSCAR
 OSCAR captures Q/K/V activations on a small calibration set, estimates
 attention-aware K/V covariance offline, and derives per-layer orthogonal
@@ -30,6 +31,7 @@ re-run the Q/K/V dump and eigendecomposition yourself.
 | Model | Calibration | GPQA (BF16) | GPQA (OSCAR INT2) |
 |---|---|---:|---:|
 | `Qwen/Qwen3-4B-Thinking-2507` | `seq20000_prompt83_group128` | 67.27 | 67.17 |
 | `Qwen/Qwen3-8B`               | `seq20000_prompt83_group128` | 56.67 | 55.56 |
 | `Qwen/Qwen3-32B`              | `seq16000_prompt69_group128` | 58.49 | 60.40 |
 | `zai-org/GLM-4.7-FP8`         | `seq10000_prompt43_group128` | 73.23 | 73.57 |

 Precomputed K/V rotation matrices for **OSCAR INT2 KV-cache quantization**.
+- 📄 **Paper** — *OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization*
+- 🌐 **Website** — https://oscar-quantize.github.io/
+- 💻 **Code** — https://github.com/FutureMLS-Lab/OSCAR
 OSCAR captures Q/K/V activations on a small calibration set, estimates
 attention-aware K/V covariance offline, and derives per-layer orthogonal
 | Model | Calibration | GPQA (BF16) | GPQA (OSCAR INT2) |
 |---|---|---:|---:|
 | `Qwen/Qwen3-4B-Thinking-2507` | `seq20000_prompt83_group128` | 67.27 | 67.17 |
+| `Qwen/Qwen3-4B-Thinking-2507` | `seq20000_prompt85_group128` (fresh re-dump) | 67.27 | — |
 | `Qwen/Qwen3-8B`               | `seq20000_prompt83_group128` | 56.67 | 55.56 |
 | `Qwen/Qwen3-32B`              | `seq16000_prompt69_group128` | 58.49 | 60.40 |
 | `zai-org/GLM-4.7-FP8`         | `seq10000_prompt43_group128` | 73.23 | 73.57 |
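
The description above mentions estimating K/V covariance offline and deriving per-layer orthogonal rotations by eigendecomposition. The following is a minimal NumPy sketch of that general idea, not the OSCAR implementation. It uses a plain second-moment matrix of random stand-in activations instead of the attention-aware covariance, and the function name `rotation_from_covariance`, the shapes, and the data are all hypothetical.

```python
import numpy as np


def rotation_from_covariance(cov):
    """Return (rotation, spectrum) for a symmetric covariance matrix.

    The rotation's columns are the eigenvectors of `cov`, sorted by
    descending eigenvalue, so the matrix is orthogonal.
    """
    # eigh is the symmetric/Hermitian solver; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return eigvecs[:, order], eigvals[order]


rng = np.random.default_rng(0)
head_dim = 8      # hypothetical head dimension
num_tokens = 256  # hypothetical calibration length

# Stand-ins for captured activations, shape (tokens, head_dim).
mixing = rng.standard_normal((head_dim, head_dim))
k_acts = rng.standard_normal((num_tokens, head_dim)) @ mixing
q_acts = rng.standard_normal((num_tokens, head_dim))

# Plain second-moment estimate (assumption: no attention-aware weighting).
cov = k_acts.T @ k_acts / num_tokens

rot, spectrum = rotation_from_covariance(cov)

# Rotating Q and K by the same orthogonal matrix leaves Q K^T unchanged.
scores_before = q_acts @ k_acts.T
scores_after = (q_acts @ rot) @ (k_acts @ rot).T

print("orthogonal:", np.allclose(rot.T @ rot, np.eye(head_dim)))
print("scores preserved:", np.allclose(scores_before, scores_after))
```

Because the rotation is orthogonal, it can be applied before quantization without changing the attention scores in exact arithmetic. How a quantizer then benefits from the rotated basis is outside the scope of this sketch.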