szymonrucinski committed · Commit dd50e51 · verified · 1 Parent(s): df30149

Add badges (arXiv, GitHub, HuggingFace, License, NeurIPS) and section icons

Files changed (1): README.md (+13 -8)
@@ -14,16 +14,21 @@ datasets:

  # FTerViT: Fully Ternary Vision Transformer

+ [![arXiv](https://img.shields.io/badge/arXiv-XXXX.XXXXX-B31B1B?style=for-the-badge&logo=arxiv&logoColor=white)](https://arxiv.org/abs/XXXX.XXXXX)
+ [![GitHub](https://img.shields.io/badge/GitHub-FTerViT-181717?style=for-the-badge&logo=github&logoColor=white)](https://github.com/szymonrucinski/FTerViT)
+ [![HuggingFace](https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-FTerViT-FFD21E?style=for-the-badge)](https://huggingface.co/szymonrucinski/FTerViT)
+ [![License](https://img.shields.io/badge/License-Apache%202.0-blue?style=for-the-badge)](https://opensource.org/licenses/Apache-2.0)
+ [![NeurIPS](https://img.shields.io/badge/NeurIPS-2026-purple?style=for-the-badge)](https://neurips.cc/)
+
  Pretrained checkpoints for **FTerViT** — the first fully ternary Vision Transformer where *all* weight matrices and normalization parameters are constrained to {-1, 0, +1}.

- **Paper:** [FTerViT: Fully Ternary Vision Transformer](https://arxiv.org/abs/XXXX.XXXXX) (NeurIPS 2026 submission)
- **Code:** [github.com/szymonrucinski/FTerViT](https://github.com/szymonrucinski/FTerViT)
+ > **W2A8** · 2-bit weights · 8-bit activations · **100% ternary** · 15x compression · sub-6 MB models

- ## Key Results
+ ## 🏆 Key Results

  All models use **W2A8** (2-bit weights, 8-bit activations) with 100% ternary coverage — including patch embedding, LayerNorm, and classifier head.

- ### ImageNet-1K
+ ### 📊 ImageNet-1K

  | Model | Phase | Epochs | Top-1 (%) | Binary (MB) | Compression | Checkpoint |
  |-------|-------|--------|-----------|-------------|-------------|------------|
@@ -32,14 +37,14 @@ All models use **W2A8** (2-bit weights, 8-bit activations) with 100% ternary cov
  | DeiT-Small | Phase 2 | +10 | **77.47** | 5.81 | 15.2x | [download](https://huggingface.co/szymonrucinski/FTerViT/resolve/main/imagenet1k/phase2_ep010_acc77.47_deit_small_224.pth) |
  | DeiT-III-Small | Phase 2 | +10 | **79.64** | 5.81 | 15.2x | [download](https://huggingface.co/szymonrucinski/FTerViT/resolve/main/imagenet1k/phase2_ep010_acc79.64_deit3_small_224.pth) |

- ### CIFAR-10 / CIFAR-100
+ ### 📊 CIFAR-10 / CIFAR-100

  | Model | Dataset | Top-1 (%) | FP32 Baseline | Binary (MB) | Checkpoint |
  |-------|---------|-----------|---------------|-------------|------------|
  | DeiT-Tiny | CIFAR-10 | **97.43** | 97.52 | 1.53 | [download](https://huggingface.co/szymonrucinski/FTerViT/resolve/main/cifar10/phase2_ep010_acc97.43_deit_tiny_224.pth) |
  | DeiT-Tiny | CIFAR-100 | **86.01** | 86.54 | 1.53 | [download](https://huggingface.co/szymonrucinski/FTerViT/resolve/main/cifar100/phase2_ep010_acc86.01_deit_tiny_224.pth) |

- ## Training Protocol
+ ## 🔧 Training Protocol

  Training uses a two-phase knowledge distillation approach:

@@ -48,7 +53,7 @@ Training uses a two-phase knowledge distillation approach:

  See the paper for full details.

- ## Self-Contained Inference Example
+ ## 🚀 Self-Contained Inference Example

  The code below loads and evaluates a FTerViT checkpoint **without any external dependencies beyond `torch`, `timm`, and `huggingface_hub`**. All ternary layer definitions are included inline.

@@ -243,7 +248,7 @@ print(f"Top-1 accuracy: {correct / total:.4f} ({correct / total * 100:.2f}%)")
  print(f"Evaluated {total} samples")
  ```

- ## Citation
+ ## 📝 Citation

  ```bibtex
  @inproceedings{rucinski2026ftervit,
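
For readers checking the README's numbers, the W2A8 claim and the ImageNet-1K size figures can be related with a little arithmetic. The sketch below is not from the FTerViT repository: the 2-bit encoding (code = value + 1) and the helper names `pack_ternary` and `unpack_ternary` are hypothetical, and it assumes the Compression column is measured against an FP32 model of the same architecture.

```python
# Sketch: 2-bit storage of ternary weights and the sizes it implies.
# The encoding (code = value + 1) and both helper names are illustrative
# assumptions, not taken from the FTerViT code base.


def pack_ternary(values):
    """Pack ternary values {-1, 0, +1} into bytes, four 2-bit codes per byte."""
    packed = bytearray((len(values) + 3) // 4)
    for i, v in enumerate(values):
        if v not in (-1, 0, 1):
            raise ValueError(f"not a ternary value: {v!r}")
        # Map -1/0/+1 to codes 0/1/2 and place each code in its 2-bit slot.
        packed[i // 4] |= (v + 1) << (2 * (i % 4))
    return bytes(packed)


def unpack_ternary(packed, count):
    """Inverse of pack_ternary: recover `count` ternary values."""
    return [((packed[i // 4] >> (2 * (i % 4))) & 0b11) - 1 for i in range(count)]


weights = [-1, 0, 1, 1, 0, -1, 0]
packed = pack_ternary(weights)
restored = unpack_ternary(packed, len(weights))

# Figures quoted in the ImageNet-1K table for DeiT-Small.
binary_mb = 5.81
compression = 15.2

# Assumption: compression is relative to FP32 (4 bytes per weight).
implied_fp32_mb = binary_mb * compression
implied_params_m = implied_fp32_mb / 4        # millions of parameters
ideal_2bit_mb = implied_params_m * 2 / 8      # 2 bits per weight, no overhead

print(f"{len(weights)} weights -> {len(packed)} bytes, round trip ok: {restored == weights}")
print(f"implied FP32 size: {implied_fp32_mb:.1f} MB (~{implied_params_m:.1f}M parameters)")
print(f"ideal 2-bit payload: {ideal_2bit_mb:.2f} MB vs reported {binary_mb} MB")
```

Under those assumptions, the DeiT-Small row implies an FP32 size of about 88 MB, or roughly 22 million parameters, and an overhead-free 2-bit payload slightly below the reported 5.81 MB. The README does not say what accounts for the difference.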