ROMAN: Scene Graph Node Relevance Classification

Given a 3D scene graph and a natural-language navigation constraint (e.g., "avoid the kitchen"), classify each node as relevant or non-relevant for cost-function generation used by a PRM path planner.

Models in This Repo

V12 GNN Models (SAGEConv / GCNConv)

Lightweight GNN models with enriched categorical features, trained on train_ready_v4_dedup.jsonl.

| Model | Path | Params | Test F1 | HO F1 |
| --- | --- | --- | --- | --- |
| SAGEConv | v12/SAGE/best.pt | 1.25M | 0.924 | 0.949 |
| GCNConv | v12/GCN/best.pt | 0.80M | 0.921 | 0.946 |

Features (1262 dims): spatial(10) + node_type(3) + floor(4) + material(15) + affordance(78) + parent_room_emb(384) + label_emb(384) + tiled_instr(384)
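
The block widths listed above can be checked against the 1262-dim total, and turned into column ranges for inspecting a feature vector. This is a minimal sketch: it assumes the blocks are concatenated in the order they are listed, which the feature list suggests but does not confirm, and `feature_slices` is a hypothetical helper, not part of the repo.

```python
from itertools import accumulate

# Block names and widths copied from the feature list above.
FEATURE_BLOCKS = [
    ("spatial", 10),
    ("node_type", 3),
    ("floor", 4),
    ("material", 15),
    ("affordance", 78),
    ("parent_room_emb", 384),
    ("label_emb", 384),
    ("tiled_instr", 384),
]


def feature_slices(blocks):
    """Map each block name to its (start, stop) column range.

    ASSUMPTION: blocks are concatenated in the listed order.
    """
    widths = [width for _, width in blocks]
    stops = list(accumulate(widths))
    starts = [0] + stops[:-1]
    return {name: (a, b) for (name, _), a, b in zip(blocks, starts, stops)}


SLICES = feature_slices(FEATURE_BLOCKS)
TOTAL_DIMS = SLICES["tiled_instr"][1]  # stop index of the last block
```

Under that ordering assumption, `SLICES["label_emb"]` gives the columns holding the label embedding, and `TOTAL_DIMS` matches the `in_channels` value the GNN expects.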

V11-Token ModernBERT Models

ModernBERT-base backbone with per-token classification (Provence-style text serialization).
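
With per-token classification, each node in the serialized graph spans several tokens, so token-level outputs have to be reduced to one decision per node. The card does not say how this is done. The sketch below assumes mean pooling over each node's tokens followed by a 0.5 threshold; `pool_token_probs` and the token-to-node mapping are hypothetical names for illustration.

```python
from collections import defaultdict


def pool_token_probs(token_node_ids, token_probs, threshold=0.5):
    """Average per-token relevance probabilities within each node's span.

    token_node_ids[i] is the node that token i belongs to, or None for
    tokens outside any node (instruction text, separators).
    ASSUMPTION: mean pooling; the actual reduction is not documented.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for node_id, prob in zip(token_node_ids, token_probs):
        if node_id is None:
            continue  # skip tokens that do not belong to a node
        sums[node_id] += prob
        counts[node_id] += 1
    return {n: (sums[n] / counts[n]) > threshold for n in sums}


# Toy input: two instruction tokens, then node 0 (3 tokens) and node 1 (2 tokens).
token_node_ids = [None, None, 0, 0, 0, 1, 1]
token_probs = [0.0, 0.0, 0.9, 0.8, 0.7, 0.2, 0.1]
node_labels = pool_token_probs(token_node_ids, token_probs)
```

Other reductions (max over tokens, or using only the first token of each span) would fit the same interface.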

| Model | Path | Trainable Params | Test F1 | HO F1 |
| --- | --- | --- | --- | --- |
| LoRA r=8 | v11t/lora_r8/model_best.pt | 1.7M | 0.910 | 0.949 |
| LoRA r=16 | v11t/lora_r16/model_best.pt | 3.4M | 0.897 | 0.951 |
| Full fine-tune | v11t/full/model_best.pt | 149.0M | 0.907 | 0.958 |
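
The LoRA trainable-parameter counts double when the rank doubles, which is what the LoRA formula predicts: each adapted weight matrix of shape d_out x d_in gains rank * (d_in + d_out) trainable weights. The sketch below works through that arithmetic. The layer shapes and layer count are assumptions based on the public ModernBERT-base configuration, and which layers the adapters actually target is not stated here.

```python
def lora_trainable_params(layer_shapes, rank, num_layers):
    """Count LoRA weights: A is (rank x d_in), B is (d_out x rank),
    so each adapted matrix adds rank * (d_in + d_out) parameters."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in layer_shapes)
    return per_layer * num_layers


# ASSUMED (d_in, d_out) shapes of the four linear layers in one
# ModernBERT-base block; the adapter targets are not documented.
ASSUMED_SHAPES = [
    (768, 2304),  # attention Wqkv
    (768, 768),   # attention output projection
    (768, 2304),  # MLP input projection (gated, 2 x 1152)
    (1152, 768),  # MLP output projection
]
ASSUMED_NUM_LAYERS = 22

r8 = lora_trainable_params(ASSUMED_SHAPES, 8, ASSUMED_NUM_LAYERS)
r16 = lora_trainable_params(ASSUMED_SHAPES, 16, ASSUMED_NUM_LAYERS)
```

Under these assumptions the counts come to roughly 1.69M for r=8 and 3.38M for r=16, in line with the reported figures. The match does not confirm the actual configuration; a classification head, for instance, would add parameters of its own.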

Dataset

dataset/v4_dedup_enriched_1024/: pre-tokenized HF Dataset (ModernBERT tokenizer, enriched text mode, max_length=1024)

Source: train_ready_v4_dedup.jsonl (7,911 records, 88 Matterport3D scenes, 783K nodes, 7.6% relevant)
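
With only 7.6% of nodes labelled relevant, the classes are heavily imbalanced, which is worth keeping in mind when reading the F1 scores. The arithmetic below uses the rounded figures from the line above. The negative-to-positive ratio is a common starting point for a positive-class weight in binary cross-entropy, but whether loss_v2.py applies any such weighting is not stated.

```python
TOTAL_NODES = 783_000      # "783K nodes" (rounded in the source)
RELEVANT_FRACTION = 0.076  # "7.6% relevant"

relevant_nodes = round(TOTAL_NODES * RELEVANT_FRACTION)
non_relevant_nodes = TOTAL_NODES - relevant_nodes

# Negatives per positive; often used as a positive-class loss weight.
# ASSUMPTION: illustrative only, not taken from the training code.
neg_to_pos_ratio = non_relevant_nodes / relevant_nodes

# F1 of the trivial "predict every node relevant" baseline:
# precision equals the relevant fraction and recall is 1.
all_positive_f1 = 2 * RELEVANT_FRACTION / (RELEVANT_FRACTION + 1)
```

The trivial baseline lands at an F1 of about 0.14, so the reported scores above 0.9 are far from what the class prior alone would give.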

Quick Start

Download

from huggingface_hub import hf_hub_download, snapshot_download

# The repo is gated: accept the access conditions on the model page and
# authenticate first (e.g. `huggingface-cli login`).

# Download the entire repo
snapshot_download("Catkamakura/roman-scene-graph", local_dir="roman-scene-graph")

# Or download specific files

# V12 SAGE model only
hf_hub_download("Catkamakura/roman-scene-graph", "v12/SAGE/best.pt", local_dir=".")

# V11T full fine-tune only
hf_hub_download("Catkamakura/roman-scene-graph", "v11t/full/model_best.pt", local_dir=".")

V12 GNN Inference

import torch
import text_encoders
from SceneGraphDatasetV4E import SceneGraphDatasetV4E
from train_v12_sage_vs_gcn import SceneGraphSAGE, forward_with_tiled_instr

device = "cpu"

# Build the model for the 1262-dim input features and load the checkpoint
model = SceneGraphSAGE(in_channels=1262, hidden_channels_arr=[256]*3, out_channels=64)
# map_location lets a checkpoint saved on GPU load on a CPU-only machine
model.load_state_dict(torch.load("v12/SAGE/best.pt", map_location=device, weights_only=True))
model.eval()

# Encode node features using the pre-computed MiniLM embeddings
te = text_encoders.DictTextEncoder("embeddings/sentence-transformers/all-MiniLM-L6-v2_embeddings_False.pkl")
ds = SceneGraphDatasetV4E("your_input.jsonl", text_encoder=te, include_parent_room=True)
ds.encode_all_node_features()
ds.all_graphs_make_x()

with torch.no_grad():
    graph = ds[0]
    scores = forward_with_tiled_instr(model, graph, device)
    probs = torch.sigmoid(scores)  # relevance probabilities
    relevant = probs > 0.5         # boolean relevance mask
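
The snippet above cuts probabilities at 0.5. With a small positive class, the threshold that maximizes F1 on validation data is not necessarily 0.5, so a quick sweep can be worthwhile. The following is a minimal, dependency-free sketch; `f1_score`, `best_threshold`, and the toy data are hypothetical and not part of the repo, and the card does not say whether 0.5 was tuned.

```python
def f1_score(labels, preds):
    """Binary F1 from parallel lists of booleans."""
    tp = sum(1 for y, p in zip(labels, preds) if y and p)
    fp = sum(1 for y, p in zip(labels, preds) if not y and p)
    fn = sum(1 for y, p in zip(labels, preds) if y and not p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def best_threshold(probs, labels, candidates):
    """Return (threshold, f1) with the highest F1; ties go to the
    earliest candidate."""
    scored = [(f1_score(labels, [p > t for p in probs]), t) for t in candidates]
    best_f1, best_t = max(scored, key=lambda s: s[0])
    return best_t, best_f1


# Toy validation data, made up for illustration.
val_probs = [0.95, 0.60, 0.40, 0.35, 0.10, 0.05]
val_labels = [True, True, True, False, False, False]
threshold, f1 = best_threshold(val_probs, val_labels, [0.3, 0.5, 0.7])
```

In practice the sweep would run over per-node probabilities from a validation split, never the test set.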

V11T ModernBERT Inference

import torch
from datasets import Dataset
from model_modernbert import SceneGraphModernBERT

model = SceneGraphModernBERT(backbone="answerdotai/ModernBERT-base", train_mode="full")
# map_location lets a checkpoint saved on GPU load on a CPU-only machine
model.load_state_dict(torch.load("v11t/full/model_best.pt", map_location="cpu", weights_only=True))
model.eval()

# Pre-tokenized dataset (ModernBERT tokenizer, max_length=1024)
dataset = Dataset.load_from_disk("dataset/v4_dedup_enriched_1024")

Training

# V12 GNN
python train_v12_sage_vs_gcn.py \
    --model both \
    --dataset_path training_data_v2/train_ready_v4_dedup.jsonl \
    --dictFile embeddings/sentence-transformers/all-MiniLM-L6-v2_embeddings_False.pkl

# V11T ModernBERT
./run_v11_token_v4.sh

Code Dependencies

| File | Purpose |
| --- | --- |
| SceneGraphDatasetV4E.py | Enriched dataset loader (floor/material/affordance) |
| train_v12_sage_vs_gcn.py | V12 training + SceneGraphSAGE model |
| model_v2.py | SceneGraphGCNv2 model |
| model_modernbert.py | SceneGraphModernBERT model |
| SceneGraphDatasetBERT.py | BERT dataset loader |
| text_encoders/ | DictTextEncoder for pre-computed embeddings |
| loss_v2.py | Loss functions |
| embeddings/sentence-transformers/all-MiniLM-L6-v2_embeddings_False.pkl | Pre-computed embeddings |