Add paper link and citation

This PR improves the model card by:
- Linking the model to its official research paper: [VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority](https://huggingface.co/papers/2605.12571).
- Adding a paper badge to the header for better visibility.
- Adding a BibTeX citation section to facilitate proper attribution for researchers.

Files changed (1)

README.md +21 -3

@@ -1,10 +1,10 @@
 ---
-license: apache-2.0
-library_name: transformers
-pipeline_tag: video-text-to-text
 base_model: Qwen/Qwen3-8B
 language:
 - en
 tags:
 - video-understanding
 - long-video-understanding
@@ -19,6 +19,7 @@ tags:
 <h2 align="center">🎬 VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority</h2>
 <p align="center">
   <a href="https://github.com/Echochef/VideoSEAL"><img alt="Code" src="https://img.shields.io/badge/Code-GitHub-black?logo=github"></a>
   <a href="https://huggingface.co/CewEhao/VideoSEAL_8B"><img alt="HF Model" src="https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-VideoSEAL__8B-yellow"></a>
   <img alt="ICML 2026" src="https://img.shields.io/badge/ICML-2026-blue">
@@ -30,12 +31,17 @@ tags:
   &nbsp;·&nbsp;
   💻 Code:
   <a href="https://github.com/Echochef/VideoSEAL">Echochef/VideoSEAL</a>
 </p>
 ## 👉 Introduction
 This is the official model card for **VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority** (ICML 2026).
 VideoSEAL provides offline build utilities for long video indexing:
 - OCR subtitles (SRT) → OCR captions + (optional) embeddings
@@ -116,3 +122,15 @@ MODEL_PATH='Qwen/Qwen3-8B' \
 ./scripts/train/run_video_workflow_grpo.sh test-reward
 pytest -q tests/rewards/test_video_reward_tool_env_integration.py
 ```

 ---
 base_model: Qwen/Qwen3-8B
 language:
 - en
+library_name: transformers
+license: apache-2.0
+pipeline_tag: video-text-to-text
 tags:
 - video-understanding
 - long-video-understanding
 <h2 align="center">🎬 VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority</h2>
 <p align="center">
+  <a href="https://huggingface.co/papers/2605.12571"><img alt="Paper" src="https://img.shields.io/badge/Paper-HF--Paper-red"></a>
   <a href="https://github.com/Echochef/VideoSEAL"><img alt="Code" src="https://img.shields.io/badge/Code-GitHub-black?logo=github"></a>
   <a href="https://huggingface.co/CewEhao/VideoSEAL_8B"><img alt="HF Model" src="https://img.shields.io/badge/%F0%9F%A4%97%20HuggingFace-VideoSEAL__8B-yellow"></a>
   <img alt="ICML 2026" src="https://img.shields.io/badge/ICML-2026-blue">
   &nbsp;·&nbsp;
   💻 Code:
   <a href="https://github.com/Echochef/VideoSEAL">Echochef/VideoSEAL</a>
+  &nbsp;·&nbsp;
+  📄 Paper:
+  <a href="https://huggingface.co/papers/2605.12571">2605.12571</a>
 </p>
 ## 👉 Introduction
 This is the official model card for **VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority** (ICML 2026).
+VideoSEAL is an agentic framework for long-video question answering. It separates the *planner* role (deciding which evidence to gather) from the *answerer* role (judging the evidence), mitigating "evidence misalignment," in which models produce correct answers that the retrieved evidence does not support.
 VideoSEAL provides offline build utilities for long video indexing:
 - OCR subtitles (SRT) → OCR captions + (optional) embeddings
 ./scripts/train/run_video_workflow_grpo.sh test-reward
 pytest -q tests/rewards/test_video_reward_tool_env_integration.py
 ```
+## 📜 Citation
+```bibtex
+@inproceedings{videoseal2026,
+  title={VideoSEAL: Mitigating Evidence Misalignment in Agentic Long Video Understanding by Decoupling Answer Authority},
+  author={Dongyang Liu and others},
+  booktitle={International Conference on Machine Learning (ICML)},
+  year={2026},
+  url={https://huggingface.co/papers/2605.12571}
+}
+```