mwxely committed · Commit 68e7dc7 · 1 Parent(s): 58a9fe1

card: align H1 with the paper title; drop internal 'Plan B' jargon

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -19,7 +19,7 @@ tags:
  - multimodal
  ---
 
- # ParaVT: From Format Fragility to Parallel Tool Mastery in Agentic Video RL
+ # ParaVT: Taming the Tool Prior Paradox for Parallel Tool Use in Agentic Video Reinforcement Learning
 
  <div align="center">
 
@@ -45,7 +45,7 @@ This repository hosts the final post-RL checkpoint (`ParaVT-8B`), obtained by ru
  | Architecture | `Qwen3VLForConditionalGeneration` |
  | Parameters | 8 B |
  | Base model | `Qwen/Qwen3-VL-8B-Instruct` |
- | Training stages | SFT (Plan B, 500 steps) → PARA-GRPO (54 steps) |
+ | Training stages | Cold-start SFT (500 steps) → PARA-GRPO RL (54 steps) |
  | Training data | [`ParaVT/ParaVT-Parquet`](https://huggingface.co/datasets/ParaVT/ParaVT-Parquet) (`sft` + `rl` configs) |
  | Source videos | [`ParaVT/ParaVT-Source`](https://huggingface.co/datasets/ParaVT/ParaVT-Source) |
  | Native tool | Temporal cropping (start time, end time, optional sub-frame count) |