Scene Perception Module 1 (M0-embedding backbone)

Multi-task head network for single-scene film analysis. Takes frozen 768-d DeBERTa-v2 embeddings from wrathofgod/movie-scene-module0 as input and produces 256-d scene embeddings for Module 2.

Architecture change vs previous version

Aspect	Old M1	New M1
Backbone	distilroberta-base (trained)	DeBERTa-v2 from M0 (frozen)
Input	Raw scene text	Pre-extracted 768-d embeddings
Head depth	Single Linear layer	2-layer trunk + per-head MLP
Training speed	Slow (full backbone)	Fast (heads only)
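
A minimal PyTorch sketch of the "2-layer trunk + per-head MLP" layout, for orientation only. The class name SceneHeadsSketch, the hidden widths (512 and 128), the GELU activations, and the choice to treat the trunk output as the 256-d scene embedding are all assumptions, not details confirmed by this card. Only the 768-d input, the 256-d scene embedding, and the per-head output sizes come from the card.

```python
import torch
from torch import nn

# Output sizes taken from the head table; everything else below is assumed.
CLASS_HEADS = {
    "emotional_valence": 4,
    "conflict_nature": 6,
    "acoustic_space": 6,
    "reality_layer": 5,
    "score_dynamic_shape": 4,
    "scene_interaction_tone": 5,
}
REGRESSION_HEADS = ["pacing_intensity", "action_intensity",
                    "scene_tension_raw", "scene_arousal"]
MULTILABEL_HEADS = {"emotion_tags": 7}


def mlp(in_dim, hidden_dim, out_dim):
    """Small per-head MLP (hidden width is an assumption)."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim), nn.GELU(), nn.Linear(hidden_dim, out_dim)
    )


class SceneHeadsSketch(nn.Module):
    """Hypothetical stand-in for the M1 head network."""

    def __init__(self, in_dim=768, trunk_dim=512, scene_dim=256, head_hidden=128):
        super().__init__()
        # 2-layer trunk: 768-d frozen M0 embedding -> 256-d scene embedding.
        self.trunk = nn.Sequential(
            nn.Linear(in_dim, trunk_dim), nn.GELU(),
            nn.Linear(trunk_dim, scene_dim), nn.GELU(),
        )
        out_dims = {**CLASS_HEADS, **MULTILABEL_HEADS,
                    **{name: 1 for name in REGRESSION_HEADS}}
        # One MLP per head, all reading the shared scene embedding.
        self.heads = nn.ModuleDict(
            {name: mlp(scene_dim, head_hidden, dim) for name, dim in out_dims.items()}
        )

    def forward(self, embeddings):
        scene = self.trunk(embeddings)
        outputs = {name: head(scene) for name, head in self.heads.items()}
        return scene, outputs


model = SceneHeadsSketch()
batch = torch.randn(2, 768)  # random stand-in for pre-extracted M0 embeddings
scene, outputs = model(batch)
print(scene.shape)           # torch.Size([2, 256])
```

Because the backbone is frozen and its embeddings are pre-extracted, only the trunk and head parameters in a layout like this would receive gradients, which is consistent with the "heads only" training speed noted in the table.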

11 Scene-Level Heads

#	Head	Type	Output
1	emotional_valence	4-class	Positive_Uplifting, Neutral_Complex, Tension_Action, Negative_Distressing
2	conflict_nature	6-class	Physical_Danger, Psychological_Tension, Interpersonal_Conflict, Moral_Dilemma, Environmental_Threat, Unknown_Threat
3	acoustic_space	6-class	Interior_Small, Interior_Large, Outdoor_Natural, Outdoor_Urban, Vehicle, Abstract
4	reality_layer	5-class	Present, Memory, Dream, Internal, Distorted
5	score_dynamic_shape	4-class	Build_Release, Sustained, Sudden_Drop, Flat
6	scene_interaction_tone	5-class	Conflict, Bonding, Expository, Negotiation, Reflective
7	pacing_intensity	regression	1–10
8	action_intensity	regression	0–10
9	scene_tension_raw	regression	1–10
10	scene_arousal	regression	0–1
11	emotion_tags	multi-label (7 tags)	Anger, Joy, Sadness, Fear, Disgust, Surprise, Neutral
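
The three head types in the table could be decoded along these lines. This is a pure-Python sketch under stated assumptions: that class and multi-label heads emit raw logits, that a 0.5 sigmoid threshold selects emotion tags, and that regression outputs are already on the listed scale and only need clamping. The function names and the threshold are hypothetical; the card does not document the actual post-processing.

```python
import math

# Label lists and ranges copied from the head table.
VALENCE_LABELS = ["Positive_Uplifting", "Neutral_Complex",
                  "Tension_Action", "Negative_Distressing"]
EMOTION_TAGS = ["Anger", "Joy", "Sadness", "Fear", "Disgust", "Surprise", "Neutral"]
REGRESSION_RANGES = {
    "pacing_intensity": (1.0, 10.0),
    "action_intensity": (0.0, 10.0),
    "scene_tension_raw": (1.0, 10.0),
    "scene_arousal": (0.0, 1.0),
}


def decode_class(logits, labels):
    """Pick the label with the highest logit (argmax)."""
    best = max(range(len(logits)), key=lambda i: logits[i])
    return labels[best]


def decode_tags(logits, labels, threshold=0.5):
    """Keep every tag whose sigmoid probability reaches the (assumed) threshold."""
    return [label for logit, label in zip(logits, labels)
            if 1.0 / (1.0 + math.exp(-logit)) >= threshold]


def decode_regression(value, head):
    """Clamp a raw regression output into the range listed for that head."""
    low, high = REGRESSION_RANGES[head]
    return min(max(value, low), high)


print(decode_class([0.1, -1.2, 2.3, 0.4], VALENCE_LABELS))
# Tension_Action
print(decode_tags([1.5, -2.0, -0.3, 2.2, -1.0, 0.1, -3.0], EMOTION_TAGS))
# ['Anger', 'Fear', 'Surprise']
print(decode_regression(11.4, "pacing_intensity"))
# 10.0
```

The same decode_class helper would apply to the other five classification heads by passing their label lists from the table.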
