stdKonjac committed · Commit a4015bd (verified) · 1 Parent(s): f4afc88

update arXiv link

Files changed (1)
  1. README.md +23 -7
README.md CHANGED
@@ -6,7 +6,7 @@ license: cc-by-4.0
6
 
7
  [Ziyun Zeng](https://stdkonjac.icu/), Yiqi Lin, [Guoqiang Liang](https://ethanliang99.github.io/), and [Mike Zheng Shou](https://cde.nus.edu.sg/ece/staff/shou-zheng-mike/)
8
 
9
- [![arXiv](https://img.shields.io/badge/arXiv-XXX-b31b1b?logo=arxiv&logoColor=white)](https://arxiv.org/abs/XXX)
10
  [![Project Page](https://img.shields.io/badge/Website-Project%20Page-green?logo=googlechrome&logoColor=white)](https://showlab.github.io/Sparkle/)
11
  [![Code](https://img.shields.io/badge/Code-GitHub%20Repo-blue?logo=github)](https://github.com/showlab/Sparkle)
12
  [![Dataset](https://img.shields.io/badge/🤗%20Dataset-Sparkle-orange.svg)](https://huggingface.co/datasets/stdKonjac/Sparkle)
@@ -16,7 +16,7 @@ license: cc-by-4.0
16
 
17
  ## 📦 Dataset
18
 
19
- **Sparkle** is a large-scale video background replacement dataset comprising ~140K high-quality source–edited video pairs. It is fully open-sourced at [🤗stdKonjac/Sparkle](https://huggingface.co/datasets/stdKonjac/Sparkle). For full methodology and dataset details, please refer to [our paper](https://arxiv.org/abs/XXX).
20
 
21
  The dataset is organized into **five themes** along different background-change axes:
22
 
@@ -153,7 +153,7 @@ After extraction, the directory layout matches the online preview structure, and
153
  <details>
154
  <summary><h3 style="display: inline">🧪 Pipeline Intermediates</h3></summary>
155
 
156
- To support **full reproducibility, transparency, and downstream research**, we additionally release every intermediate artifact produced by the 5-stage Sparkle data pipeline (see *Figure 2: Data Pipeline* in [our paper](https://arxiv.org/abs/XXX)) under `intermediate_data/`. **The first 100 samples of every theme are uncompressed and previewable directly in the browser**, mirroring the layout of the `{edit_type}/` preview folders described above.
157
 
158
  Taking `Sparkle_location_000000` as a running example, the artifact layout looks like:
159
 
@@ -406,7 +406,7 @@ Source videos in the `openve3m` theme are derived from [OpenVE-3M](https://arxiv
406
 
407
  ## 🎯 Benchmark
408
 
409
- **Sparkle-Bench** is the largest evaluation benchmark tailored for instruction-guided video background replacement, comprising **458 carefully curated videos across 4 themes, 21 subthemes, and 97 distinct scenes**. It is fully open-sourced at [🤗stdKonjac/Sparkle-Bench](https://huggingface.co/datasets/stdKonjac/Sparkle-Bench). For evaluation methodology and our six-dimensional scoring protocol, please refer to [our paper](https://arxiv.org/abs/XXX).
410
 
411
  **All source videos in the benchmark are uncompressed and previewable directly in the browser**, so users can inspect any sample without downloading anything.
412
 
@@ -509,7 +509,7 @@ After downloading, the relative paths in `{edit_type}_bench.csv` (e.g. `source_v
509
 
510
  ### 📊 Evaluation
511
 
512
- We provide an end-to-end evaluation script, [`eval_sparkle_bench_gemini.py`](https://github.com/showlab/Sparkle/blob/main/eval_sparkle_bench_gemini.py), that scores edited videos using Gemini-2.5-Pro under our six-dimensional rubric (see *Section 3.7* in [our paper](https://arxiv.org/abs/XXX)). The six dimensions are: **Instruction Compliance**, **Overall Visual Quality**, **Foreground Integrity**, **Foreground Motion Consistency**, **Background Dynamics**, and **Background Visual Quality**, each scored on a 1–5 scale.
513
 
514
  #### 1. Prepare your inference outputs
515
 
@@ -603,7 +603,7 @@ After scoring, the script aggregates per-theme and macro averages and prints a s
603
 
604
  ### 🖼️ Reference Images (Optional, Use with Caution)
605
 
606
- By construction, every Sparkle-Bench sample is a video that **passed the first four stages of our pipeline but failed the final synthesis quality check in Stage 5** (see Section 3.7 of [our paper](https://arxiv.org/abs/XXX)). As a free byproduct, this means each sample comes with a **pure background image** generated by Stage 3 (Individual Background Generation), where the foreground has been removed from the preliminarily edited first frame.
607
 
608
  We release these images under `ref_images/{edit_type}/{id}.png`, alongside the CSV/JSONL annotations. These images may be useful for **reference-based** background-replacement experiments (e.g., feeding the clean background as an extra visual condition to the editing model).
609
 
@@ -695,4 +695,20 @@ Kiwi-Sparkle is released under the [Creative Commons Attribution 4.0 Internation
695
 
696
  ## 🙏 Acknowledgements
697
 
698
- This project is built on top of a number of excellent open-source projects. We thank the authors of [Kiwi-Edit](https://github.com/showlab/Kiwi-Edit), [FLUX.2-klein-9B](https://huggingface.co/black-forest-labs/FLUX.2-klein-9B), [Qwen3-VL-32B](https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct), [Wan2.2-I2V-A14B](https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B), [LightX2V](https://github.com/ModelTC/lightx2v), and [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun) for releasing the infrastructure that made this work possible.
 
6
 
7
  [Ziyun Zeng](https://stdkonjac.icu/), Yiqi Lin, [Guoqiang Liang](https://ethanliang99.github.io/), and [Mike Zheng Shou](https://cde.nus.edu.sg/ece/staff/shou-zheng-mike/)
8
 
9
+ [![arXiv](https://img.shields.io/badge/arXiv-Paper-b31b1b?logo=arxiv&logoColor=white)](https://arxiv.org/abs/2605.06535)
10
  [![Project Page](https://img.shields.io/badge/Website-Project%20Page-green?logo=googlechrome&logoColor=white)](https://showlab.github.io/Sparkle/)
11
  [![Code](https://img.shields.io/badge/Code-GitHub%20Repo-blue?logo=github)](https://github.com/showlab/Sparkle)
12
  [![Dataset](https://img.shields.io/badge/🤗%20Dataset-Sparkle-orange.svg)](https://huggingface.co/datasets/stdKonjac/Sparkle)
 
16
 
17
  ## 📦 Dataset
18
 
19
+ **Sparkle** is a large-scale video background replacement dataset comprising ~140K high-quality source–edited video pairs. It is fully open-sourced at [🤗stdKonjac/Sparkle](https://huggingface.co/datasets/stdKonjac/Sparkle). For full methodology and dataset details, please refer to [our paper](https://arxiv.org/abs/2605.06535).
20
 
21
  The dataset is organized into **five themes** along different background-change axes:
22
 
 
153
  <details>
154
  <summary><h3 style="display: inline">🧪 Pipeline Intermediates</h3></summary>
155
 
156
+ To support **full reproducibility, transparency, and downstream research**, we additionally release every intermediate artifact produced by the 5-stage Sparkle data pipeline (see *Figure 2: Data Pipeline* in [our paper](https://arxiv.org/abs/2605.06535)) under `intermediate_data/`. **The first 100 samples of every theme are uncompressed and previewable directly in the browser**, mirroring the layout of the `{edit_type}/` preview folders described above.
157
 
158
  Taking `Sparkle_location_000000` as a running example, the artifact layout looks like:
159
 
 
406
 
407
  ## 🎯 Benchmark
408
 
409
+ **Sparkle-Bench** is the largest evaluation benchmark tailored for instruction-guided video background replacement, comprising **458 carefully curated videos across 4 themes, 21 subthemes, and 97 distinct scenes**. It is fully open-sourced at [🤗stdKonjac/Sparkle-Bench](https://huggingface.co/datasets/stdKonjac/Sparkle-Bench). For evaluation methodology and our six-dimensional scoring protocol, please refer to [our paper](https://arxiv.org/abs/2605.06535).
410
 
411
  **All source videos in the benchmark are uncompressed and previewable directly in the browser**, so users can inspect any sample without downloading anything.
412
 
 
509
 
510
  ### 📊 Evaluation
511
 
512
+ We provide an end-to-end evaluation script, [`eval_sparkle_bench_gemini.py`](https://github.com/showlab/Sparkle/blob/main/eval_sparkle_bench_gemini.py), that scores edited videos using Gemini-2.5-Pro under our six-dimensional rubric (see *Section 3.7* in [our paper](https://arxiv.org/abs/2605.06535)). The six dimensions are: **Instruction Compliance**, **Overall Visual Quality**, **Foreground Integrity**, **Foreground Motion Consistency**, **Background Dynamics**, and **Background Visual Quality**, each scored on a 1–5 scale.
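
To make the rubric concrete, here is a minimal sketch of how six 1–5 scores per video could be rolled up into per-theme and macro averages. It is not the logic of `eval_sparkle_bench_gemini.py`: the record layout (`theme`, `scores`), the function names, and the definition of the macro average as an unweighted mean over themes are all assumptions made for illustration. Only the six dimension names and the 1–5 scale come from the README.

```python
from collections import defaultdict
from statistics import mean

# Dimension names as listed in the README; everything else here is hypothetical.
DIMENSIONS = [
    "Instruction Compliance",
    "Overall Visual Quality",
    "Foreground Integrity",
    "Foreground Motion Consistency",
    "Background Dynamics",
    "Background Visual Quality",
]


def validate(scores):
    """Raise ValueError unless every dimension has a score within 1..5."""
    for dim in DIMENSIONS:
        value = scores.get(dim)
        if value is None or not 1 <= value <= 5:
            raise ValueError(f"missing or out-of-range score for {dim!r}: {value!r}")


def aggregate(records):
    """Return (per_theme, macro) averages for each dimension.

    records: iterable of {"theme": str, "scores": {dimension: number}}
    (an assumed layout, not the script's real output format).
    macro is the unweighted mean of the per-theme means (an assumption).
    """
    by_theme = defaultdict(list)
    for rec in records:
        validate(rec["scores"])
        by_theme[rec["theme"]].append(rec["scores"])

    per_theme = {
        theme: {dim: mean(row[dim] for row in rows) for dim in DIMENSIONS}
        for theme, rows in by_theme.items()
    }
    macro = {dim: mean(t[dim] for t in per_theme.values()) for dim in DIMENSIONS}
    return per_theme, macro
```

Averaging per theme first keeps a theme with many videos from dominating the macro score; if the released script weights by sample count instead, the two numbers will differ.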
513
 
514
  #### 1. Prepare your inference outputs
515
 
 
603
 
604
  ### 🖼️ Reference Images (Optional, Use with Caution)
605
 
606
+ By construction, every Sparkle-Bench sample is a video that **passed the first four stages of our pipeline but failed the final synthesis quality check in Stage 5** (see Section 3.7 of [our paper](https://arxiv.org/abs/2605.06535)). As a free byproduct, this means each sample comes with a **pure background image** generated by Stage 3 (Individual Background Generation), where the foreground has been removed from the preliminarily edited first frame.
607
 
608
  We release these images under `ref_images/{edit_type}/{id}.png`, alongside the CSV/JSONL annotations. These images may be useful for **reference-based** background-replacement experiments (e.g., feeding the clean background as an extra visual condition to the editing model).
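
As an illustration of the `ref_images/{edit_type}/{id}.png` layout, the following sketch resolves a reference image path under a local copy of the benchmark and reports which samples have no image on disk. The helper names, the local root directory, and the example `edit_type` and `id` values are hypothetical; only the path pattern is taken from the README.

```python
from pathlib import Path


def ref_image_path(root, edit_type, sample_id):
    """Build <root>/ref_images/{edit_type}/{id}.png for one benchmark sample."""
    return Path(root) / "ref_images" / edit_type / f"{sample_id}.png"


def missing_ref_images(root, edit_type, sample_ids):
    """Return the ids whose reference image is not present under root."""
    return [
        sample_id
        for sample_id in sample_ids
        if not ref_image_path(root, edit_type, sample_id).is_file()
    ]
```

Checking for missing files before a run makes it easier to fall back to a reference-free setting for those samples rather than failing partway through.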
609
 
 
695
 
696
  ## 🙏 Acknowledgements
697
 
698
+ This project is built on top of a number of excellent open-source projects. We thank the authors of [Kiwi-Edit](https://github.com/showlab/Kiwi-Edit), [FLUX.2-klein-9B](https://huggingface.co/black-forest-labs/FLUX.2-klein-9B), [Qwen3-VL-32B](https://huggingface.co/Qwen/Qwen3-VL-32B-Instruct), [Wan2.2-I2V-A14B](https://huggingface.co/Wan-AI/Wan2.2-I2V-A14B), [LightX2V](https://github.com/ModelTC/lightx2v), and [VideoX-Fun](https://github.com/aigc-apps/VideoX-Fun) for releasing the infrastructure that made this work possible.
699
+
700
+ ## 📝 Citation
701
+
702
+ If you find Sparkle useful for your research, please consider citing our paper:
703
+
704
+ ```bibtex
705
+ @misc{zeng2026sparkle,
706
+ title = {Sparkle: Realizing Lively Instruction-Guided Video Background Replacement via Decoupled Guidance},
707
+ author = {Zeng, Ziyun and Lin, Yiqi and Liang, Guoqiang and Shou, Mike Zheng},
708
+ year = {2026},
709
+ eprint = {2605.06535},
710
+ archivePrefix = {arXiv},
711
+ primaryClass = {cs.CV},
712
+ url = {https://arxiv.org/abs/2605.06535}
713
+ }
714
+ ```