Module 0 Scene Encoder (v3: Supervised Contrastive)

Joint pretraining for movie-scene understanding with:

  • supervised contrastive learning with label-key positives (no heuristic noise) and full-dataset label-based negatives (no 200-sample window)
  • full-batch SupCon so every in-batch same-key scene contributes as a positive
  • masked language modelling on annotation scene_text
  • auxiliary scene-label prediction (SmoothL1 regression, label-smoothed classification)
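
The full-batch SupCon objective listed above can be illustrated with a minimal, dependency-free sketch. It is not the training code for this model: the function name `supcon_loss`, the default `temperature=0.1`, and the choice to skip anchors that have no in-batch positive are all assumptions made for illustration. Each anchor treats every other in-batch scene with the same label key as a positive and all remaining scenes as negatives.

```python
import math


def supcon_loss(embeddings, label_keys, temperature=0.1):
    """Supervised contrastive loss over one batch (illustrative sketch).

    embeddings: list of equal-length float vectors, one per scene.
    label_keys: list of hashable keys; equal keys mark positives.
    temperature: assumed default, not taken from the model card.
    """
    # L2-normalise so that dot products are cosine similarities.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    n = len(normed)
    per_anchor = []
    for i in range(n):
        # Temperature-scaled similarity to every other scene in the batch.
        logits = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }
        positives = [j for j in logits if label_keys[j] == label_keys[i]]
        if not positives:
            continue  # assumption: anchors without a positive are skipped

        # Numerically stable log of the softmax denominator.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        # Average log-probability over all positives of this anchor.
        log_prob_sum = sum(logits[j] - log_denom for j in positives)
        per_anchor.append(-log_prob_sum / len(positives))

    return sum(per_anchor) / len(per_anchor) if per_anchor else 0.0


# Toy batch: two tight clusters in 2-D.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(supcon_loss(toy_embeddings, ["a", "a", "b", "b"]))  # keys match clusters
print(supcon_loss(toy_embeddings, ["a", "b", "a", "b"]))  # keys cut across clusters
```

On the toy batch, the loss is much lower when the label keys agree with the geometric clusters than when they cut across them, which is the behaviour the objective is meant to reward.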

Backbone

  • base_model: microsoft/deberta-v3-base
  • embedding_dim: 256

Dataset Summary

  • all_scenes: 11118
  • train_scenes: 9593
  • val_scenes: 1525
  • all_films: 80
  • train_positive_avg_candidates: 6.0552
  • val_positive_avg_candidates: 3.2643
  • mlm_chunks: 15118

Best Validation Metrics

  • val_loss: 4.9684
  • val_contrastive_loss: 4.4305
  • val_aux_loss: 0.8275
  • val_r_at_1: 0.0571
  • val_temperature: 0.1671
  • avg_cls_acc: 0.4942
  • avg_reg_mae: 0.9264
  • avg_bin_acc: 0.7480
  • val_task/emotional_valence: 1.1910
  • val_task/scene_interaction_tone: 1.3215
  • val_task/conflict_nature: 1.3526
  • val_task/acoustic_space: 1.5992
  • val_task/reality_layer: 0.3977
  • val_task/score_dynamic_shape: 1.2102
  • val_task/narrative_arc_position: 1.1923
  • val_task/foreshadowing_type: 1.2925
  • val_task/transition_type: 1.3077
  • val_task/scene_tension_raw: 0.5136
  • val_task/scene_arousal: 0.4495
  • val_task/scene_valence_continuous: 0.4300
  • val_task/pacing_intensity: 0.4838
  • val_task/action_intensity: 0.6442
  • val_task/emotional_shift_trigger: 0.5622
  • val_task/emotion_tags: 0.4309
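
For readers unfamiliar with a retrieval metric such as val_r_at_1, the sketch below shows one common way to compute recall@1 from embeddings. The exact definition used for this model is not stated in the card, so this is an assumption: a scene counts as a hit when its nearest other scene by cosine similarity shares its label key, and every scene is counted as a query. The name `recall_at_1` is hypothetical.

```python
import math


def recall_at_1(embeddings, label_keys):
    """Fraction of scenes whose nearest other scene shares their label key."""

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm_u = math.sqrt(sum(a * a for a in u))
        norm_v = math.sqrt(sum(b * b for b in v))
        return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

    n = len(embeddings)
    if n < 2:
        return 0.0  # no neighbour to retrieve

    hits = 0
    for i in range(n):
        # Nearest neighbour of scene i, excluding the scene itself.
        nearest = max(
            (j for j in range(n) if j != i),
            key=lambda j: cosine(embeddings[i], embeddings[j]),
        )
        if label_keys[nearest] == label_keys[i]:
            hits += 1
    return hits / n


toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
print(recall_at_1(toy_embeddings, ["a", "a", "b", "b"]))
```

Under this definition, a value of 0.0571 would mean that roughly 6 in 100 validation scenes retrieve a same-key scene as their top match.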

Intended Use

Use this encoder as the backbone for Module 1 by setting:

CFG["backbone"] = "/kaggle/working/module0_backbone"
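
A minimal sketch of how Module 1 might consume this setting is shown below. It assumes that the directory was written in the standard Hugging Face format (for example with `save_pretrained`, including tokenizer files) and that `CFG` is a plain dictionary; the helper name `load_backbone` is hypothetical and not part of Module 1.

```python
CFG = {}
CFG["backbone"] = "/kaggle/working/module0_backbone"


def load_backbone(cfg):
    """Load tokenizer and encoder from the path stored in cfg["backbone"].

    Assumes the directory is in Hugging Face format. The import is kept
    inside the function so that defining it does not require transformers.
    """
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(cfg["backbone"])
    model = AutoModel.from_pretrained(cfg["backbone"])
    return tokenizer, model
```

Calling `load_backbone(CFG)` returns the tokenizer and the encoder. Any projection to the 256-dimensional embedding space would need to be restored separately if it is not stored with the encoder weights.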