🎬 VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority


🤗 HuggingFace model: CewEhao/VideoSEAL_8B  ·  💻 Code: Echochef/VideoSEAL

👉 Introduction

This is the official model card for VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority (ICML 2026).

VideoSEAL provides offline build utilities for long video indexing:

  • OCR subtitles (SRT) → OCR captions + (optional) embeddings
  • Clip captions (VLM) → clip captions + (optional) embeddings
  • Merge into a unified semantic index under indexes/semantic/<video_id>/
  • (Optional) generate a global full_story.txt summary
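As a rough illustration of the first step in the list above, the sketch below parses SRT text into caption records with start and end times in seconds. It is not VideoSEAL's actual implementation: the `parse_srt` function and the `start`/`end`/`text` field names are hypothetical, and only the standard SRT block layout (index line, `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, then text lines) is assumed.

```python
import re

# Standard SRT timing line: "HH:MM:SS,mmm --> HH:MM:SS,mmm"
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_seconds(h, m, s, ms):
    """Convert the four captured timing fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000.0


def parse_srt(text):
    """Parse SRT text into {"start", "end", "text"} records (hypothetical schema)."""
    records = []
    # Subtitle blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        for i, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                g = match.groups()
                # Everything after the timing line is the caption text.
                caption = " ".join(lines[i + 1:])
                if caption:
                    records.append({
                        "start": _to_seconds(*g[:4]),
                        "end": _to_seconds(*g[4:]),
                        "text": caption,
                    })
                break
    return records


sample = """1
00:00:01,000 --> 00:00:03,500
Hello world

2
00:01:02,250 --> 00:01:04,000
Second line
continues here
"""

for record in parse_srt(sample):
    print(record)
```

Records of this shape could then be embedded or merged with clip captions; how the real index stores them under indexes/semantic/<video_id>/ is not specified here.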

📦 Layout

  • 🧰 Shell entrypoints: scripts/
  • 🐍 Python package: videoseal/
  • ✅ Tests: test/
  • 🧩 OCR toolchain (vendored): third_party/video-subtitle-extractor/

⚙️ Configuration

  • Defaults live in the scripts under scripts/.
  • Set real API keys and endpoints in your shell environment or job launcher.

🏗️ Run offline build

cd /path/to/VideoSEAL

export MLLM_API_KEY="sk_your_api_key"
export EMBEDDING_API_KEY="sk_your_api_key"
export AGENT_LLM_API_KEY="sk_your_api_key"
export VISUAL_INSPECT_API_KEY="sk_your_api_key"
VIDEO=/path/to/video.mp4 BENCHMARK=LVBench ./scripts/run_offline_build.sh
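A small preflight check can catch keys that are unset or still hold the `sk_your_api_key` placeholder before a long build starts. The sketch below is an illustration only: the variable names come from the export lines above, but `missing_keys` is a hypothetical helper, and whether the build script actually requires all four keys is an assumption.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_KEYS = (
    "MLLM_API_KEY",
    "EMBEDDING_API_KEY",
    "AGENT_LLM_API_KEY",
    "VISUAL_INSPECT_API_KEY",
)
PLACEHOLDER = "sk_your_api_key"


def missing_keys(env, required=REQUIRED_KEYS, placeholder=PLACEHOLDER):
    """Return the required names that are unset, empty, or still the placeholder."""
    return [
        name for name in required
        if not env.get(name, "").strip() or env[name].strip() == placeholder
    ]


if __name__ == "__main__":
    missing = missing_keys(os.environ)
    if missing:
        print("Set real values for: " + ", ".join(missing))
    else:
        print("All API keys are set.")
```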

✅ Run tests

python -m unittest discover -s test -v

🏋️ GRPO training (video tool workflow)

This repo vendors minimal copies of the rllm/ and verl/ Python packages at the repo root, so the video tool-agent GRPO workflow runs without an extra repo checkout.

🧪 Training environment (conda)

conda create -n videoseal python=3.12 -y
conda activate videoseal

pip install vllm==0.11.0

cd rllm
pip install -e .

cd ../verl
pip install -e .
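After installing, it can help to confirm that the pinned vllm version is the one actually present in the active environment. The sketch below uses the standard library's `importlib.metadata`; the helper names `installed_version` and `pin_matches` are hypothetical, and the exact-match comparison is an assumption (a build with a local version suffix would not match).

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pin_matches(package, expected, lookup=installed_version):
    """True if `package` is installed at exactly the `expected` version."""
    return lookup(package) == expected


if __name__ == "__main__":
    # "0.11.0" is the version pinned in the install commands above.
    print("vllm:", installed_version("vllm"), "| pin ok:", pin_matches("vllm", "0.11.0"))
```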

🚀 Launcher

  • scripts/train/run_video_workflow_grpo.sh

🧩 Example

cd /path/to/VideoSEAL

# Export real API keys/endpoints in your environment before launching.

TRAIN_PARQUET='["/path/to/train.parquet"]' \
VAL_PARQUET='/path/to/val.parquet' \
MODEL_PATH='Qwen/Qwen3-8B' \
./scripts/train/run_video_workflow_grpo.sh train
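When launching from a Python job script rather than a shell, the same variables can be assembled programmatically. The sketch below mirrors the example above: `TRAIN_PARQUET` is serialized as a JSON-style list and `VAL_PARQUET` is passed as a plain path, exactly as shown. `launcher_env` is a hypothetical helper, and whether the launcher accepts more than one training file is an assumption.

```python
import json
import os


def launcher_env(train_files, val_file, model_path, base_env=None):
    """Build the launcher environment, mirroring the shell example above."""
    env = dict(os.environ if base_env is None else base_env)
    # TRAIN_PARQUET is a JSON-style list in the example; VAL_PARQUET is a plain path.
    env["TRAIN_PARQUET"] = json.dumps(list(train_files))
    env["VAL_PARQUET"] = val_file
    env["MODEL_PATH"] = model_path
    return env


# Inherits the current environment, so exported API keys are passed through.
env = launcher_env(["/path/to/train.parquet"], "/path/to/val.parquet", "Qwen/Qwen3-8B")
command = ["./scripts/train/run_video_workflow_grpo.sh", "train"]
print(env["TRAIN_PARQUET"])
# To launch from the repo root: subprocess.run(command, env=env, check=True)
```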

🔎 Quick checks

./scripts/train/run_video_workflow_grpo.sh test-reward
pytest -q tests/rewards/test_video_reward_tool_env_integration.py