Video-Text-to-Text
Transformers
Safetensors
English
qwen3_vl
image-text-to-text
video
long-video
reasoning
tool-calling
agentic-rl
grpo
multimodal
Instructions to use ParaVT/ParaVT-8B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use ParaVT/ParaVT-8B with Transformers:
# Load model directly from transformers import AutoProcessor, AutoModelForImageTextToText processor = AutoProcessor.from_pretrained("ParaVT/ParaVT-8B") model = AutoModelForImageTextToText.from_pretrained("ParaVT/ParaVT-8B") - Notebooks
- Google Colab
- Kaggle
Upload README.md with huggingface_hub
Browse files
README.md
CHANGED
|
@@ -71,8 +71,8 @@ If you find ParaVT useful for your research and applications, please cite:
|
|
| 71 |
|
| 72 |
```bibtex
|
| 73 |
@misc{yang2026paravt,
|
| 74 |
-
title={{ParaVT}:
|
| 75 |
-
author={Zuhao Yang and
|
| 76 |
year={2026},
|
| 77 |
archivePrefix={arXiv},
|
| 78 |
primaryClass={cs.CV}
|
|
|
|
| 71 |
|
| 72 |
```bibtex
|
| 73 |
@misc{yang2026paravt,
|
| 74 |
+
title={{ParaVT}: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning},
|
| 75 |
+
author={Zuhao Yang and Sudong Wang and Kaichen Zhang and Keming Wu and Sicong Leng and Yifan Zhang and Bo Li and Chengwei Qin and Shijian Lu and Xingxuan Li and Lidong Bing},
|
| 76 |
year={2026},
|
| 77 |
archivePrefix={arXiv},
|
| 78 |
primaryClass={cs.CV}
|