πŸŒ’ Penumbra β€” Uncertainty Maps for Language Models

Between knowing and guessing, there's a shadow. Penumbra makes it visible.

Penumbra is a QLoRA finetune of Ministral-8B-Instruct-2410 trained to produce a structured uncertainty map alongside every answer.

Where standard models deliver every answer in the same confident tone, Penumbra marks exactly which claims are solid, which are contested, and which are genuinely uncertain.


🎯 What It Does

Standard model:

"The 2008 financial crisis was caused by subprime mortgage lending, deregulation, and CDOs..."

Everything sounds equally certain.

Penumbra:

{
  "answer": "The 2008 financial crisis was caused by...",
  "claims": [
    {
      "claim": "Subprime mortgage lending was a primary trigger",
      "confidence": 0.95,
      "basis": "Overwhelming documented evidence, broad consensus",
      "evidence_quality": "strong",
      "alternative_views": null
    },
    {
      "claim": "Deregulation was a direct cause",
      "confidence": 0.61,
      "basis": "Contested β€” economists debate causality vs correlation",
      "evidence_quality": "moderate",
      "alternative_views": "Some argue deregulation enabled but did not cause the crisis"
    }
  ],
  "overall_confidence": 0.78,
  "least_certain_claim": "Deregulation was a direct cause",
  "epistemic_summary": "Core facts are well-established; causal attributions remain debated."
}

Now you know exactly what to trust and what to verify.
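
Because the map is plain JSON, routing low-confidence claims to review takes only a few lines. A minimal sketch, assuming the map has already been parsed into a dict as in the example above (the 0.7 threshold and the helper name are illustrative, not part of the model output):

def claims_to_verify(uncertainty_map: dict, threshold: float = 0.7) -> list:
    """Return the claims whose stated confidence falls below the threshold."""
    return [c for c in uncertainty_map["claims"] if c["confidence"] < threshold]

for claim in claims_to_verify(uncertainty_map):
    print(f"VERIFY ({claim['confidence']:.2f}): {claim['claim']}")
    if claim["alternative_views"]:
        print(f"  alternative view: {claim['alternative_views']}")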


πŸš€ Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
import json

# Load base + adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Ministral-8B-Instruct-2410",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model = PeftModel.from_pretrained(base_model, "Vaaruni2797/penumbra-ministral-8b")
tokenizer = AutoTokenizer.from_pretrained("Vaaruni2797/penumbra-ministral-8b")

# System prompt β€” required for JSON output
system = """You are an epistemically transparent AI assistant.
When answering questions, respond ONLY with valid JSON in exactly this format:
{
  "answer": "complete natural language answer",
  "claims": [
    {
      "claim": "specific assertion",
      "confidence": 0.85,
      "basis": "why this confidence level",
      "evidence_quality": "strong",
      "alternative_views": null
    }
  ],
  "overall_confidence": 0.85,
  "least_certain_claim": "lowest confidence claim",
  "epistemic_summary": "one sentence about overall certainty"
}"""

question = "Is nuclear energy safe?"
prompt = f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{question} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.1, do_sample=True)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

uncertainty_map = json.loads(response)
print(json.dumps(uncertainty_map, indent=2))
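
The json.loads call above assumes the model returns pure JSON, which it usually does at low temperature. If you occasionally see stray text around the object, a tolerant parse is a small helper worth keeping (purely illustrative, not part of the model card's API):

import re

def parse_uncertainty_map(text: str) -> dict:
    """Parse model output as JSON, falling back to the first {...} block in the text."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        match = re.search(r"\{.*\}", text, re.DOTALL)
        if match is None:
            raise ValueError("No JSON object found in model output")
        return json.loads(match.group(0))

uncertainty_map = parse_uncertainty_map(response)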

🧠 Training Details

| Component | Detail |
|---|---|
| Base model | mistralai/Ministral-8B-Instruct-2410 |
| Method | QLoRA (r=4, alpha=8) |
| Annotator | Mistral Large 3 |
| Training data | TruthfulQA + TriviaQA + FEVER + synthetic examples |
| Epochs | 1 |
| Hardware | RTX 3070 (8 GB) |
| Token accuracy | 82.8% |
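
For reference, the hyperparameters above correspond to a standard peft + bitsandbytes QLoRA setup roughly like the sketch below. The 4-bit quantization flags and target modules are assumptions based on common QLoRA practice, not the exact training script:

import torch
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# 4-bit base-model quantization (typical QLoRA settings; assumed, not confirmed)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA hyperparameters from the table above: rank 4, alpha 8
lora_config = LoraConfig(
    r=4,
    lora_alpha=8,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)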

πŸ“Š Live Demo

Try it at: HuggingFace Spaces β€” Penumbra


Built for the Mistral Hackathon 2026. πŸŒ’
