wrathofgod/movie-scene-module0-v3

Module 0 Scene Encoder (v3 — Supervised Contrastive)

Joint pretraining for movie-scene understanding, combining:

supervised contrastive learning, with positives defined by shared label keys (no heuristic noise) and negatives drawn by label from the full dataset (no 200-sample window)
full-batch SupCon, so that every in-batch scene with the same key contributes as a positive
masked language modelling on the annotation scene_text
auxiliary scene-label prediction (SmoothL1 loss for regression targets, label smoothing for classification targets)
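
The full-batch SupCon objective listed above can be sketched roughly as follows. This is an illustrative, pure-Python version and not the training code behind this card: the function name `supcon_loss`, the default temperature of 0.1, and the choice to skip anchors that have no same-key scene in the batch are all assumptions.

```python
import math


def supcon_loss(embeddings, label_keys, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch).

    embeddings: list of equal-length float vectors, one per scene.
    label_keys: list of hashable keys; scenes with equal keys are positives.
    """
    # L2-normalise so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    n = len(normed)
    per_anchor = []
    for i in range(n):
        # Temperature-scaled similarity from anchor i to every other scene.
        logits = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in logits if label_keys[j] == label_keys[i]]
        if not positives:
            continue  # assumption: anchors without an in-batch positive are skipped

        # Log of the softmax denominator, computed in a numerically stable way.
        top = max(logits.values())
        log_denom = top + math.log(sum(math.exp(v - top) for v in logits.values()))

        # Every same-key scene contributes: average log-probability over positives.
        mean_log_prob = sum(logits[j] - log_denom for j in positives) / len(positives)
        per_anchor.append(-mean_log_prob)

    return sum(per_anchor) / len(per_anchor) if per_anchor else 0.0
```

The loss is low when scenes sharing a key sit close together in embedding space and high when they are scattered among scenes with other keys.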

Backbone

base_model: microsoft/deberta-v3-base
embedding_dim: 256

Dataset Summary

all_scenes: 11118
train_scenes: 9593
val_scenes: 1525
all_films: 80
train_positive_avg_candidates: 6.0552
val_positive_avg_candidates: 3.2643
mlm_chunks: 15118
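
The card does not define `train_positive_avg_candidates` or `val_positive_avg_candidates`. One plausible reading, assumed here, is the mean number of other scenes in the same split that share a scene's label key. The sketch below computes that quantity; the helper name `avg_positive_candidates` is hypothetical.

```python
from collections import Counter


def avg_positive_candidates(label_keys):
    """Mean number of other scenes sharing each scene's label key.

    Assumed interpretation of the *_positive_avg_candidates fields;
    the model card itself does not spell out the definition.
    """
    if not label_keys:
        return 0.0
    counts = Counter(label_keys)
    # A scene with key k has counts[k] - 1 candidate positives (itself excluded).
    return sum(counts[k] - 1 for k in label_keys) / len(label_keys)


# The split sizes reported above are consistent with each other.
train_scenes, val_scenes, all_scenes = 9593, 1525, 11118
split_is_consistent = (train_scenes + val_scenes == all_scenes)
```

Under this reading, a smaller validation split naturally yields fewer candidates per scene, which would match the lower validation figure.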

Best Validation Metrics

val_loss: 4.9684
val_contrastive_loss: 4.4305
val_aux_loss: 0.8275
val_r_at_1: 0.0571
val_temperature: 0.1671
avg_cls_acc: 0.4942
avg_reg_mae: 0.9264
avg_bin_acc: 0.7480
val_task/emotional_valence: 1.1910
val_task/scene_interaction_tone: 1.3215
val_task/conflict_nature: 1.3526
val_task/acoustic_space: 1.5992
val_task/reality_layer: 0.3977
val_task/score_dynamic_shape: 1.2102
val_task/narrative_arc_position: 1.1923
val_task/foreshadowing_type: 1.2925
val_task/transition_type: 1.3077
val_task/scene_tension_raw: 0.5136
val_task/scene_arousal: 0.4495
val_task/scene_valence_continuous: 0.4300
val_task/pacing_intensity: 0.4838
val_task/action_intensity: 0.6442
val_task/emotional_shift_trigger: 0.5622
val_task/emotion_tags: 0.4309
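
For orientation, `val_r_at_1` is presumably a retrieval recall at rank 1: the fraction of query scenes whose nearest neighbour shares their label key. The card does not give the exact protocol, so the sketch below is an assumption throughout, including the use of cosine similarity, the exclusion of the query itself, and the skipping of queries that have no possible positive.

```python
import math


def recall_at_1(embeddings, label_keys):
    """Fraction of queries whose nearest other scene has the same label key."""

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    n = len(embeddings)
    hits = 0
    queries = 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Assumption: a query only counts if at least one positive exists.
        if not any(label_keys[j] == label_keys[i] for j in others):
            continue
        queries += 1
        # Nearest neighbour by cosine similarity, query excluded.
        best = max(others, key=lambda j: cosine(embeddings[i], embeddings[j]))
        if label_keys[best] == label_keys[i]:
            hits += 1
    return hits / queries if queries else 0.0
```

A score near 1.0 means same-key scenes are almost always each other's nearest neighbours; a score near 0.0 means they rarely are.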

Intended Use

Use this encoder as the backbone for Module 1 by setting:

CFG["backbone"] = "/kaggle/working/module0_backbone"
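
If the exported directory may be missing (for example in a fresh session), a guarded assignment is one option. This is only a sketch: the helper `set_backbone` is hypothetical, and falling back to the base checkpoint named in the Backbone section is an assumption, not something this card prescribes. Only the `"backbone"` key of `CFG` is taken from the card.

```python
from pathlib import Path

# Hypothetical starting config; only the "backbone" key appears in this card.
CFG = {"backbone": "microsoft/deberta-v3-base"}


def set_backbone(cfg, local_dir):
    """Point cfg["backbone"] at local_dir if it exists, else leave it unchanged."""
    if Path(local_dir).is_dir():
        cfg["backbone"] = str(local_dir)
    return cfg["backbone"]


# Uses the pretrained Module 0 encoder when the directory is present.
set_backbone(CFG, "/kaggle/working/module0_backbone")
```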

Model size: 0.2B params (Safetensors)
Tensor type: F32