Add pipeline tag and improve model card documentation (#1)
- Add pipeline tag and improve model card documentation (c699f6735ebbf5b16778bbecea7ba99d7124f2aa)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md
CHANGED
---
license: mit
pipeline_tag: image-to-3d
---

# Quantized Visual Geometry Grounded Transformer

[Paper](https://arxiv.org/abs/2509.21302)
[Code](https://github.com/wlfeng0509/QuantVGGT)

This repository contains the weights and calibration data for **QuantVGGT**, presented in the paper [Quantized Visual Geometry Grounded Transformer](https://arxiv.org/abs/2509.21302).

QuantVGGT is the first quantization framework designed specifically for Visual Geometry Grounded Transformers (VGGTs). It addresses the unique challenges of compressing billion-scale 3D reconstruction models, such as heavy-tailed activation distributions and instability in multi-view calibration.

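The heavy-tailed-activation problem can be seen in a minimal, self-contained sketch. This illustrates rotation-based smoothing in the spirit of the `quarot_w4a4` dtype used below; it is not the QuantVGGT implementation, and the `hadamard` and `fake_quant` helpers are invented for the demo:

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Orthonormal Hadamard matrix via Sylvester's construction (n = power of 2)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(H.shape[0])

def fake_quant(x: np.ndarray, bits: int = 4) -> np.ndarray:
    """Symmetric per-tensor uniform quantization (quantize, then dequantize)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(x).max() / qmax
    return np.clip(np.round(x / scale), -qmax - 1, qmax) * scale

rng = np.random.default_rng(0)
x = rng.normal(size=64)
x[0] = 50.0  # one heavy-tailed outlier inflates the quantization scale

H = hadamard(64)
err_plain = np.abs(fake_quant(x) - x).mean()
# Rotate, quantize, rotate back: the outlier's energy is spread over all channels
err_rot = np.abs(H.T @ fake_quant(H @ x) - x).mean()
```

Because the orthogonal rotation spreads the outlier's energy across all channels, the 4-bit step size shrinks and the round-trip error `err_rot` comes out well below `err_plain`.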
## Installation

To get started, clone the official repository and install the dependencies:

```bash
git clone https://github.com/wlfeng0509/QuantVGGT.git
cd QuantVGGT
pip install -r requirements.txt
pip install -r requirements_demo.txt
```

## Quick Start

You can use the provided scripts for inference and calibration. For example, to generate filtered Co3D calibration data:

```bash
python Quant_VGGT/vggt/evaluation/make_calibation.py \
    --model_path VGGT-1B/model_tracker_fixed_e20.pt \
    --co3d_dir co3d_datasets/ \
    --co3d_anno_dir co3d_v2_annotations/ \
    --seed 0 \
    --cache_path all_calib_data.pt \
    --save_path calib_data.pt \
    --class_mode all \
    --kmeans_n 6 \
    --kmeans_m 7
```

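The `--kmeans_n`/`--kmeans_m` flags suggest that calibration frames are chosen by clustering frame features and keeping a few representatives per cluster. A hypothetical sketch of such a selection step (the function name, feature inputs, and clustering details are illustrative assumptions, not the repository's API):

```python
import numpy as np

def select_calibration_frames(feats, n_clusters=6, per_cluster=7, iters=20, seed=0):
    """Cluster frame features with a tiny k-means, then keep the frames
    closest to each centroid, giving a small but diverse calibration set."""
    rng = np.random.default_rng(seed)
    centroids = feats[rng.choice(len(feats), n_clusters, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(feats[:, None] - centroids[None], axis=-1)
        assign = d.argmin(axis=1)
        for k in range(n_clusters):
            members = feats[assign == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    # Within each cluster, keep the per_cluster frames nearest the centroid
    d = np.linalg.norm(feats[:, None] - centroids[None], axis=-1)
    picked = []
    for k in range(n_clusters):
        idx = np.where(d.argmin(axis=1) == k)[0]
        picked.extend(idx[np.argsort(d[idx, k])][:per_cluster].tolist())
    return sorted(picked)

# Toy example: 200 frames described by 8-dim feature vectors
rng = np.random.default_rng(1)
feats = rng.normal(size=(200, 8))
frames = select_calibration_frames(feats, n_clusters=6, per_cluster=7)
```

With `n_clusters=6` and `per_cluster=7` this yields at most 42 frame indices, mirroring the `--kmeans_n 6 --kmeans_m 7` setting above.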
To quantize, calibrate, and evaluate on Co3D:

```bash
python Quant_VGGT/vggt/evaluation/run_co3d.py \
    --model_path Quant_VGGT/VGGT-1B/model_tracker_fixed_e20.pt \
    --co3d_dir co3d_datasets/ \
    --co3d_anno_dir co3d_v2_annotations/ \
    --dtype quarot_w4a4 \
    --seed 0 \
    --lac \
    --lwc \
    --cache_path calib_data.pt \
    --class_mode all \
    --exp_name a44_uqant \
    --resume_qs
```

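The `--lwc` and `--lac` flags presumably enable learned weight and activation clipping. As a rough illustration of why a tuned clipping threshold helps at 4 bits, here is a toy MSE-based grid search over clip values; this is a stand-in for learned clipping, not the optimization used in the paper:

```python
import numpy as np

def fake_quant_clipped(x, clip, bits=4):
    """Uniform symmetric fake-quantization with a clipping threshold."""
    qmax = 2 ** (bits - 1) - 1
    scale = clip / qmax
    return np.clip(np.round(np.clip(x, -clip, clip) / scale), -qmax - 1, qmax) * scale

def best_clip(x, bits=4, ratios=np.linspace(0.1, 1.0, 19)):
    """Pick the clip value minimizing reconstruction MSE
    (a crude stand-in for a learned clipping parameter)."""
    cands = ratios * np.abs(x).max()
    errs = [((fake_quant_clipped(x, c, bits) - x) ** 2).mean() for c in cands]
    return cands[int(np.argmin(errs))]

rng = np.random.default_rng(0)
w = rng.standard_t(df=3, size=4096)  # heavy-tailed weights
c = best_clip(w)
mse_clipped = ((fake_quant_clipped(w, c) - w) ** 2).mean()
mse_full = ((fake_quant_clipped(w, np.abs(w).max()) - w) ** 2).mean()
```

Clipping the rare extreme values sacrifices a little range but shrinks the step size for the bulk of the distribution, so `mse_clipped` is no worse than the full-range `mse_full`.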
## Citation

If you find QuantVGGT useful for your work, please cite the following paper:

```bibtex
@article{feng2025quantized,
  title={Quantized Visual Geometry Grounded Transformer},
  author={Feng, Weilun and Qin, Haotong and Wu, Mingqiang and Yang, Chuanguang and Li, Yuqi and Li, Xiangqi and An, Zhulin and Huang, Libo and Zhang, Yulun and Magno, Michele and others},
  journal={arXiv preprint arXiv:2509.21302},
  year={2025}
}
```