Add pipeline tag, links and usage instructions
This PR improves the model card for OmniShotCut by:
- Adding the `video-classification` pipeline tag to the metadata.
- Including links to the official paper on ArXiv, the project website, and the GitHub repository.
- Providing installation and sample inference code snippets derived from the repository's documentation.
- Adding a BibTeX citation section.
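
The inference snippet this PR adds to the README processes one video per invocation. For anyone who wants to try it on a folder of clips, the following is a minimal sketch of a batch wrapper. It uses only the three flags that appear in the added README text (`--checkpoint_path`, `--input_video_path`, `--mode clean_shot`); the function name `run_omnishotcut_batch`, the `DRY_RUN` variable, and the assumption that it is run from the repository root are illustrative and not part of the repository.

```shell
#!/usr/bin/env bash
# Sketch only: print (or run) the README's inference command for every .mp4
# file in a folder. The function name, DRY_RUN, and the folder layout are
# hypothetical; the script path and flags are the ones shown in the README.

run_omnishotcut_batch() {
    local video_dir="$1"
    local ckpt="${2:-checkpoints/OmniShotCut_ckpt.pth}"
    local video

    for video in "$video_dir"/*.mp4; do
        # Skip the unexpanded glob pattern when the folder has no .mp4 files.
        [ -e "$video" ] || continue

        if [ "${DRY_RUN:-1}" = "1" ]; then
            # Dry run (the default): show the command without executing it.
            echo python test_code/inference.py \
                --checkpoint_path "$ckpt" \
                --input_video_path "$video" \
                --mode clean_shot
        else
            # Real run: assumes the current directory is the repository root.
            python test_code/inference.py \
                --checkpoint_path "$ckpt" \
                --input_video_path "$video" \
                --mode clean_shot
        fi
    done
}
```

Calling `run_omnishotcut_batch path/to/videos` prints one command per video so the list can be checked first; setting `DRY_RUN=0` executes them instead. The dry-run default is a deliberate choice so that nothing runs until the checkpoint path and video list look right.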
README.md CHANGED
@@ -1,15 +1,48 @@
 ---
 license: apache-2.0
+pipeline_tag: video-classification
 ---
 
+# OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer
 
-
-
-OmniShotCut is a sensitive and more informative SoTA on the Shot Boundary Detection. \
-OmniShotCut can detect shot changes of the video in diverse sources (anime, vlog, game, shorts, sports, screen recording, etc.), and recognize Sudden Jump and Transitions (dissolve, fade, wipe, etc.) by proposing a Shot-Query-based Video Transformer.
-
+OmniShotCut is a sensitive and informative state-of-the-art (SoTA) model for Shot Boundary Detection (SBD). It can detect shot changes in videos from diverse sources (anime, vlog, game, shorts, sports, screen recording, etc.) and recognize Sudden Jumps and Transitions (dissolve, fade, wipe, etc.) by proposing a Shot-Query-based Video Transformer.
 
 [](https://arxiv.org/abs/2604.24762)
 [](https://uva-computer-vision-lab.github.io/OmniShotCut_website/)
+[](https://github.com/UVA-Computer-Vision-Lab/OmniShotCut)
 <a href="https://huggingface.co/spaces/uva-cv-lab/OmniShotCut"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20HF%20Space&message=Online+Demo&color=orange"></a>
 
+## Installation 🔧
+```shell
+conda create -n OmniShotCut python=3.10
+conda activate OmniShotCut
+pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu124
+pip install -r requirements.txt
+conda install ffmpeg
+```
+
+First, download the checkpoint:
+```shell
+mkdir checkpoints
+cd checkpoints
+wget https://huggingface.co/uva-cv-lab/OmniShotCut/resolve/main/OmniShotCut_ckpt.pth
+```
+
+## Inference ⚡
+We provide several modes for inference. The `clean_shot` mode is recommended for users who want the most direct results (valid shots without any transitions and sudden jumps).
+
+Execute the inference by running:
+```shell
+python test_code/inference.py --checkpoint_path checkpoints/OmniShotCut_ckpt.pth --input_video_path path/to/your/video.mp4 --mode clean_shot
+```
+The results will be visualized in a folder named `demo_video_results`, where vertical bars with the same color refer to the same shot.
+
+## Citation
+```bibtex
+@article{wang2026omnishotcut,
+  title={OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer},
+  author={Wang, Boyang and Xu, Guangyi and Tang, Zhipeng and Zhang, Jiahui and Cheng, Zezhou},
+  journal={arXiv preprint arXiv:2604.24762},
+  year={2026}
+}
+```