OracleRM ModernBERT Base v2
OracleRM ModernBERT Base v2 is a lightweight reward model for ranking written text. It scores a candidate using two heads:
- style: how strongly the response matches the target literary/stylistic preference.
- faith: how well the response preserves the meaning of the source prompt.
The model is built on top of answerdotai/ModernBERT-base with two scalar classification heads. It is intended for reranking multiple candidate rewrites, not for text generation.
Files expected in this repo
This repo should contain the exported files from the training zip:
config.json
metadata.json
model_state_dict.pt
tokenizer.json / tokenizer files
special_tokens_map.json
tokenizer_config.json
vocab or tokenizer model files, depending on tokenizer export
The model is not saved as a standard AutoModelForSequenceClassification checkpoint. Load it by reconstructing the wrapper class, then loading model_state_dict.pt.
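Before reconstructing the wrapper, it can help to confirm that a downloaded snapshot actually contains the non-tokenizer files listed above. The following is a minimal sketch, not part of the original export tooling: `REQUIRED_FILES` and `missing_files` are hypothetical names, and tokenizer files are skipped because their names depend on the tokenizer export.

```python
import tempfile
from pathlib import Path

# Files named explicitly in the list above; tokenizer file names vary by
# export, so they are deliberately not checked here (assumption).
REQUIRED_FILES = ("config.json", "metadata.json", "model_state_dict.pt")


def missing_files(repo_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from repo_dir."""
    repo_dir = Path(repo_dir)
    return [name for name in required if not (repo_dir / name).is_file()]


# Demo on a throwaway directory that lacks the weights file.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "config.json").write_text("{}")
    (Path(tmp) / "metadata.json").write_text('{"max_length": 1024}')
    demo_missing = missing_files(tmp)

print(demo_missing)
```

In practice the same function could be pointed at the directory returned by `snapshot_download`, and loading could be skipped when the returned list is not empty.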
Input format
The model was trained with this text format:
### Source:
{prompt}
### Rewrite:
{response}
Use an empty source when scoring style only:
### Source:
### Rewrite:
{response}
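Both templates can be produced by one helper. The sketch below mirrors the `format_input` function from the example usage section, including the blank line it places before `### Rewrite:`; how many blank lines the style-only form used during training is an assumption taken from that helper.

```python
def format_input(prompt: str, response: str) -> str:
    """Build the text the reward model expects for one candidate."""
    return f"### Source:\n{prompt}\n\n### Rewrite:\n{response}"


# Source-conditioned input: both the style and faith scores are meaningful.
paired = format_input("The room was silent.", "Silence pooled in the room.")

# Style-only input: leave the source empty and read only the style score.
style_only = format_input("", "Silence pooled in the room.")

print(paired)
print("---")
print(style_only)
```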
Raw scores
The model returns two sigmoid scores:
style = sigmoid(style_logit)
faith = sigmoid(faith_logit)
A simple score is:
score = style * faith
For practical reranking, a softer faith-weighted score often works better:
score = style * (faith ** 0.65)
You may also apply a control penalty for candidates that are too short, too verbose, or unnecessarily hard to read.
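The formulas above fit in one small function. This is a sketch under stated assumptions: `composite_score` and its parameter names are illustrative, the default exponent 0.65 is the value quoted above, and `control_penalty` is simply a multiplier that defaults to 1.0 (no penalty).

```python
def composite_score(style: float, faith: float,
                    faith_exponent: float = 0.65,
                    control_penalty: float = 1.0) -> float:
    """Combine the two sigmoid scores into a single ranking score."""
    if not (0.0 <= style <= 1.0 and 0.0 <= faith <= 1.0):
        raise ValueError("style and faith should be sigmoid outputs in [0, 1]")
    return style * (faith ** faith_exponent) * control_penalty


# Same style score, two faith values. With faith_exponent=1.0 the function
# reduces to the simple product style * faith.
for faith in (0.9, 0.5):
    plain = composite_score(0.8, faith, faith_exponent=1.0)
    soft = composite_score(0.8, faith)
    print(f"faith={faith:.1f} plain={plain:.3f} soft={soft:.3f}")
```

Because faith lies between 0 and 1, raising it to a power below 1 moves it toward 1, so a moderately unfaithful candidate is penalized less than under the plain product.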
Example usage
import json
from pathlib import Path

import torch
import torch.nn as nn
from huggingface_hub import snapshot_download
from transformers import AutoConfig, AutoTokenizer, AutoModel

# Placeholder: replace with the Hub repo ID that hosts the exported files.
HF_REPO_ID = "YOUR_USERNAME/oracle-rm-modernbert-base-v2"
local_dir = Path(snapshot_download(HF_REPO_ID))
device = "cuda" if torch.cuda.is_available() else "cpu"

metadata = json.loads((local_dir / "metadata.json").read_text())
MAX_LENGTH = int(metadata.get("max_length", 1024))

tokenizer = AutoTokenizer.from_pretrained(local_dir)
config = AutoConfig.from_pretrained(local_dir)


class TwoHeadRM(nn.Module):
    """ModernBERT backbone with two scalar heads: style and faith."""

    def __init__(self, config):
        super().__init__()
        # Build the backbone from config only; weights come from the state dict.
        try:
            self.backbone = AutoModel.from_config(config, attn_implementation="sdpa")
        except TypeError:
            # Older transformers versions do not accept attn_implementation.
            self.backbone = AutoModel.from_config(config)
        hidden = self.backbone.config.hidden_size
        self.dropout = nn.Dropout(0.1)
        self.style_head = nn.Linear(hidden, 1)
        self.faith_head = nn.Linear(hidden, 1)

    def forward(self, input_ids, attention_mask):
        out = self.backbone(input_ids=input_ids, attention_mask=attention_mask)
        # Pool with the hidden state of the first token.
        pooled = out.last_hidden_state[:, 0]
        pooled = self.dropout(pooled)
        style_logit = self.style_head(pooled).squeeze(-1)
        faith_logit = self.faith_head(pooled).squeeze(-1)
        return style_logit, faith_logit


def safe_torch_load(path):
    # Prefer weights_only=True; fall back for torch versions without it.
    try:
        return torch.load(path, map_location="cpu", weights_only=True)
    except TypeError:
        return torch.load(path, map_location="cpu")


model = TwoHeadRM(config)
model.load_state_dict(safe_torch_load(local_dir / "model_state_dict.pt"), strict=True)
model.to(device)
model.eval()


def format_input(prompt, response):
    return f"### Source:\n{prompt}\n\n### Rewrite:\n{response}"


@torch.inference_mode()
def score_one(prompt, response):
    text = format_input(prompt, response)
    enc = tokenizer(
        text,
        max_length=MAX_LENGTH,
        padding=True,
        truncation=True,
        return_tensors="pt",
    ).to(device)
    style_logit, faith_logit = model(enc["input_ids"], enc["attention_mask"])
    style = torch.sigmoid(style_logit.float()).item()
    faith = torch.sigmoid(faith_logit.float()).item()
    return {
        "style": style,
        "faith": faith,
        "score": style * (faith ** 0.65),
    }


prompt = "The room was silent and time seemed to move slowly."
candidate = "In that room, time sank to the bottom like sediment."
print(score_one(prompt, candidate))
Intended use
Use this model to:
- rerank candidate rewrites;
- select more literary or stylistically strong generations;
- filter outputs that drift too far from the source meaning;
- compare rewrite candidates during dataset curation.
This model is best used as a reranker after a generator has already produced several candidates.
Not intended for
This model should not be used as a general factuality verifier, safety classifier, plagiarism detector, or universal writing-quality metric. It reflects the style preferences and contrastive examples used during training.
Scoring guidance
For rewrite ranking, sort candidates by:
adjusted_score = style * (faith ** 0.65) * control_penalty
Where control_penalty may include:
- prompt-relative length limits;
- verbosity penalty only outside the allowed length band;
- readability penalty for overly dense or hard-to-read prose.
For style-only ranking, use:
style
For meaning preservation checks, inspect:
faith
Do not rely only on the final composite score. For debugging, always print style, faith, and the final adjusted score separately.
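One possible shape for `control_penalty` is a prompt-relative length band that leaves candidates untouched inside the band and decays smoothly outside it. The sketch below is illustrative only: `length_penalty`, the band limits `min_ratio` and `max_ratio`, and the `slope` value are assumptions rather than settings used by this model, and the readability penalty is left out.

```python
def length_penalty(source: str, rewrite: str,
                   min_ratio: float = 0.5, max_ratio: float = 2.0,
                   slope: float = 0.5) -> float:
    """Return a multiplier in (0, 1]; exactly 1.0 inside the length band."""
    src_words = max(len(source.split()), 1)  # avoid division by zero
    ratio = len(rewrite.split()) / src_words
    if min_ratio <= ratio <= max_ratio:
        return 1.0
    # Distance outside the band, measured in word-count ratio units.
    excess = (min_ratio - ratio) if ratio < min_ratio else (ratio - max_ratio)
    return 1.0 / (1.0 + slope * excess)


def adjusted_score(style: float, faith: float, source: str, rewrite: str) -> float:
    """style * faith**0.65 * control_penalty, with a length-only penalty."""
    return style * (faith ** 0.65) * length_penalty(source, rewrite)


source = "The room was silent and time seemed to move slowly."
for rewrite in ("Silence.", "In that room, time sank to the bottom like sediment."):
    pen = length_penalty(source, rewrite)
    adj = adjusted_score(0.8, 0.7, source, rewrite)
    # Print the components separately, as recommended above.
    print(f"penalty={pen:.3f} adjusted={adj:.3f} | {rewrite}")
```

Word counts from `str.split()` are a crude proxy for length; token counts from the model tokenizer would be a reasonable alternative.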
Limitations
- The model may over-reward ornate or elaborate prose if verbosity is not controlled.
- Very short inputs may produce unstable style judgments.
- Faithfulness is a learned similarity/preference signal, not a factual guarantee.
- The model was trained for English prose-style rewriting and may not transfer well to other languages or technical domains.
- Scores are relative and are most useful when comparing multiple candidates for the same prompt.
Suggested implementation pattern
Generate several rewrite candidates, score each candidate with this RM, then choose the highest adjusted score.
candidates = [
    "The room was quiet and time felt slow.",
    "In that room, time sank to the bottom like sediment.",
]

rows = []
for response in candidates:
    scores = score_one(prompt, response)
    rows.append((scores["score"], scores["style"], scores["faith"], response))

rows.sort(reverse=True)

for score, style, faith, response in rows:
    print(f"score={score:.3f} style={style:.3f} faith={faith:.3f} | {response}")
Model tree for 3rd-Degree-Burn/modernbert-stylefaith-rm-v2
Base model
answerdotai/ModernBERT-base