BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
Paper: [BitDistiller (arXiv:2402.10631)](https://arxiv.org/abs/2402.10631)
| PPL (↓) | arc_easy | arc_challenge | piqa | winogrande | hellaswag | mmlu | QA Avg |
|---|---|---|---|---|---|---|---|
| 7.87 | 67.09 ± 0.96 | 33.02 ± 1.37 | 74.05 ± 1.02 | 61.64 ± 1.37 | 48.79 ± 0.50 | - | 56.92 |

QA scores are accuracies in % (± standard error). QA Avg is the unweighted mean of the five reported QA tasks: (67.09 + 33.02 + 74.05 + 61.64 + 48.79) / 5 ≈ 56.92. MMLU was not reported.
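These QA scores can plausibly be reproduced with EleutherAI's lm-evaluation-harness. The snippet below is a minimal sketch assuming the harness's `simple_evaluate` API (lm-eval v0.4+) and uses the base model id as a placeholder for the quantized checkpoint path.

```python
# Minimal sketch: scoring the five QA tasks with lm-evaluation-harness.
# Assumes lm-eval v0.4+ (`pip install lm-eval`); the pretrained= value
# is a placeholder -- point it at the quantized checkpoint instead.
import lm_eval

tasks = ["arc_easy", "arc_challenge", "piqa", "winogrande", "hellaswag"]
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=meta-llama/Llama-2-7b-hf",
    tasks=tasks,
)

# Unweighted mean accuracy over the five tasks, matching "QA Avg" above.
accs = [results["results"][t]["acc,none"] for t in tasks]
print(f"QA Avg: {100 * sum(accs) / len(accs):.2f}")
```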
Training method: quantization-aware training (QAT) with self-distillation, following the BitDistiller paper (see the sketch below).
Base model: meta-llama/Llama-2-7b-hf
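For reference, here is a minimal sketch of the two ingredients BitDistiller combines: asymmetric sub-4-bit fake quantization of the weights, and a self-distillation loss in which the full-precision model serves as teacher for its own quantized copy. This is a generic illustration under stated assumptions, not the paper's exact implementation (BitDistiller additionally uses learned weight clipping and a confidence-aware KL objective, CAKLD).

```python
# Generic sketch of BitDistiller-style QAT with self-distillation.
# Not the paper's exact recipe: learned clipping and CAKLD are omitted.
import torch
import torch.nn.functional as F

def fake_quantize(w: torch.Tensor, bits: int = 3) -> torch.Tensor:
    """Asymmetric min-max quantize-dequantize, per output row (sub-4-bit)."""
    qmax = 2 ** bits - 1
    w_min = w.amin(dim=-1, keepdim=True)
    w_max = w.amax(dim=-1, keepdim=True)
    scale = (w_max - w_min).clamp(min=1e-8) / qmax
    zero = torch.round(-w_min / scale)
    q = torch.clamp(torch.round(w / scale) + zero, 0, qmax)
    return (q - zero) * scale

def ste_quantize(w: torch.Tensor, bits: int = 3) -> torch.Tensor:
    """Straight-through estimator: forward pass uses quantized weights,
    backward pass routes gradients to the full-precision weights."""
    return w + (fake_quantize(w, bits) - w).detach()

def self_distillation_loss(student_logits, teacher_logits, T: float = 1.0):
    """KL(teacher || student) over token distributions; the frozen
    full-precision model acts as teacher for the quantized student."""
    s = F.log_softmax(student_logits / T, dim=-1)
    t = F.log_softmax(teacher_logits / T, dim=-1)
    return F.kl_div(s, t, log_target=True, reduction="batchmean")
```

In a training loop, the teacher logits come from the frozen full-precision model on the same batch, and `ste_quantize` is applied to each linear layer's weight in the student's forward pass so the quantization error is optimized end to end.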