| scheme: subq_hint |
| description: |- |
| JSON-only per-task prompts that include observable sub-questions (checklists). In |
| the subq+human setting, the sub-question prompts are the model input and human |
| scores are the training targets. |
| sub_questions: |
| source: static |
| answer_format: hint |
| system_prompt: You are a strict video evaluation model. |
| general_keys: |
| - SA |
| - PTV |
| - persistence |
| eval_prompts: |
| SA: |- |
| Evaluate Prompt Alignment (SA). |
| |
| Caption: |
| "{prompt}" |
| |
| The video was generated using a text+image-to-video (ti2v) model, conditioned on the first frame and the text prompt above. |
| |
| Sub-questions to consider in your mind before scoring: |
| {questions_block} |
| |
| Score 1-5: |
| 5=fully aligned |
| 4=mostly aligned with minor deviations |
| 3=partially aligned with notable gaps |
| 2=mostly misaligned |
| 1=not aligned |
| |
| Then output ONLY a JSON object with exactly one key: SA. |
| |
| Example: |
| {{"SA": 3}} |
| PTV: |- |
| Evaluate Temporal Coherence (PTV). |
| |
| Caption: |
| "{prompt}" |
| |
| The video was generated using a text+image-to-video (ti2v) model, conditioned on the first frame and the text prompt above. |
| |
| Sub-questions to consider in your mind before scoring: |
| {questions_block} |
| |
| Score 1-5: |
| 5=fully plausible event order |
| 4=mostly plausible with minor timing issues |
| 3=partially plausible |
| 2=mostly implausible |
| 1=completely implausible order |
| |
| Then output ONLY a JSON object with exactly one key: PTV. |
| |
| Example: |
| {{"PTV": 4}} |
| persistence: |- |
| Evaluate Object Persistence. |
| |
| Caption, for context only: |
| "{prompt}" |
| |
| The video was generated using a text+image-to-video (ti2v) model, conditioned on the first frame and the text prompt above. |
| |
| Sub-questions to consider in your mind before scoring: |
| {questions_block} |
| |
| Score 1-5: |
| 5=fully consistent |
| 4=mostly consistent with minor flicker |
| 3=noticeable issues |
| 2=major inconsistencies |
| 1=severe disappearance or identity changes |
| |
| Then output ONLY a JSON object with exactly one key: persistence. |
| |
| Example: |
| {{"persistence": 4}} |
| physical_sub_questions: true |
| physical_template: |- |
| Evaluate physical realism for one physical law: {law}. |
| |
| Criterion: |
| {criteria} |
| |
| Caption, for context only: |
| "{prompt}" |
| |
| Sub-questions to consider in your mind before scoring: |
| {questions_block} |
| |
| Judge the video itself. Do not penalize prompt mismatch unless it affects whether this physical law can be evaluated. |
| |
| Score 1-5: |
| 5=clearly correct |
| 4=mostly correct with minor issues |
| 3=partially correct or ambiguous |
| 2=mostly incorrect |
| 1=severely incorrect |
| |
| Then output ONLY a JSON object with exactly one key: {law}. |
| |
| Example: |
| {{"{law}": 3}} |
| |
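The doubled braces in the `Example:` lines suggest that these templates are filled with Python's `str.format`, although the config itself does not say so. Under that assumption, the sketch below shows how a template might be rendered and how a reply could be checked against the "exactly one key" contract. The helper names (`build_questions_block`, `render`, `parse_score`), the dash-bulleted layout of `{questions_block}`, and the sample caption are all hypothetical and not taken from the config.

```python
import json

# Abbreviated copy of the SA template; the full text lives under eval_prompts.SA.
SA_TEMPLATE = (
    "Evaluate Prompt Alignment (SA).\n\n"
    'Caption:\n"{prompt}"\n\n'
    "Sub-questions to consider in your mind before scoring:\n"
    "{questions_block}\n\n"
    "Then output ONLY a JSON object with exactly one key: SA.\n\n"
    'Example:\n{{"SA": 3}}'
)

# Tail of physical_template, where the JSON key is itself a placeholder.
PHYSICAL_EXAMPLE = 'Example:\n{{"{law}": 3}}'


def build_questions_block(questions):
    """Join sub-questions into a dash-bulleted block (layout is an assumption)."""
    return "\n".join(f"- {q}" for q in questions)


def render(template, **fields):
    """Fill placeholders; '{{' and '}}' collapse to literal braces."""
    return template.format(**fields)


def parse_score(raw, key):
    """Return the integer score, or raise ValueError if the reply breaks the contract.

    json.JSONDecodeError is a subclass of ValueError, so malformed JSON
    surfaces as the same exception type.
    """
    obj = json.loads(raw)
    if not isinstance(obj, dict) or set(obj) != {key}:
        raise ValueError(f"expected a JSON object with exactly one key: {key}")
    score = obj[key]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
        raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
    return score


prompt_text = render(
    SA_TEMPLATE,
    prompt="A red ball rolls off a table.",
    questions_block=build_questions_block(
        ["Is a red ball visible?", "Does the ball roll off the table?"]
    ),
)
print(prompt_text.splitlines()[-1])                              # {"SA": 3}
print(render(PHYSICAL_EXAMPLE, law="gravity").splitlines()[-1])  # {"gravity": 3}
print(parse_score('{"SA": 4}', "SA"))                            # 4
```

Rejecting replies with extra keys or out-of-range values, rather than repairing them, keeps the check aligned with the "output ONLY a JSON object" instruction in each prompt; a more lenient parser would be an equally reasonable choice depending on how the scores are consumed.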