# Penumbra: Uncertainty Maps for Language Models

> Between knowing and guessing, there's a shadow. Penumbra makes it visible.
Penumbra is a fine-tuned version of Ministral-8B-Instruct-2410 trained to produce a structured uncertainty map alongside every answer.
Unlike standard models, which deliver every claim in the same confident tone, Penumbra labels exactly which claims are solid, contested, or genuinely uncertain.
## What It Does
**Standard model:**

> "The 2008 financial crisis was caused by subprime mortgage lending, deregulation, and CDOs..."

Everything sounds equally certain.
**Penumbra:**

```json
{
  "answer": "The 2008 financial crisis was caused by...",
  "claims": [
    {
      "claim": "Subprime mortgage lending was a primary trigger",
      "confidence": 0.95,
      "basis": "Overwhelming documented evidence, broad consensus",
      "evidence_quality": "strong",
      "alternative_views": null
    },
    {
      "claim": "Deregulation was a direct cause",
      "confidence": 0.61,
      "basis": "Contested -- economists debate causality vs correlation",
      "evidence_quality": "moderate",
      "alternative_views": "Some argue deregulation enabled but did not cause the crisis"
    }
  ],
  "overall_confidence": 0.78,
  "least_certain_claim": "Deregulation was a direct cause",
  "epistemic_summary": "Core facts are well-established; causal attributions remain debated."
}
```
Now you know exactly what to trust and what to verify.
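Because the map is structured data, it is easy to act on programmatically, e.g. to route low-confidence claims to a fact-checking step. A minimal sketch (the `needs_verification` helper and the 0.8 threshold are illustrative, not part of Penumbra's output or API):

```python
def needs_verification(uncertainty_map, threshold=0.8):
    """Return the claims whose confidence falls below the threshold."""
    return [
        c["claim"]
        for c in uncertainty_map["claims"]
        if c["confidence"] < threshold
    ]

# Using the claims from the example map above
example = {
    "claims": [
        {"claim": "Subprime mortgage lending was a primary trigger", "confidence": 0.95},
        {"claim": "Deregulation was a direct cause", "confidence": 0.61},
    ]
}
print(needs_verification(example))  # ['Deregulation was a direct cause']
```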
## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
import json

# Load base model + LoRA adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Ministral-8B-Instruct-2410",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "Vaaruni2797/penumbra-ministral-8b")
tokenizer = AutoTokenizer.from_pretrained("Vaaruni2797/penumbra-ministral-8b")

# System prompt -- required for JSON output
system = """You are an epistemically transparent AI assistant.
When answering questions, respond ONLY with valid JSON in exactly this format:
{
  "answer": "complete natural language answer",
  "claims": [
    {
      "claim": "specific assertion",
      "confidence": 0.85,
      "basis": "why this confidence level",
      "evidence_quality": "strong",
      "alternative_views": null
    }
  ],
  "overall_confidence": 0.85,
  "least_certain_claim": "lowest confidence claim",
  "epistemic_summary": "one sentence about overall certainty"
}"""

question = "Is nuclear energy safe?"
# The tokenizer prepends the <s> BOS token itself, so the prompt string omits it
prompt = f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{question} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.1, do_sample=True)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

uncertainty_map = json.loads(response)
print(json.dumps(uncertainty_map, indent=2))
```
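A bare `json.loads(response)` will raise if the model ever emits stray text around the JSON object, so production code should parse defensively. One possible approach is to extract the outermost `{...}` span before parsing (the `parse_uncertainty_map` helper below is an illustrative sketch, not part of this repository):

```python
import json

def parse_uncertainty_map(text):
    """Extract the outermost JSON object from model output; return None on failure."""
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        return json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None

# Tolerates chatter before/after the JSON object
noisy = 'Here is the map:\n{"answer": "...", "overall_confidence": 0.78}\nDone.'
print(parse_uncertainty_map(noisy)["overall_confidence"])  # 0.78
```

Returning `None` instead of raising lets callers retry generation (e.g. at a different temperature) when the model breaks format.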
## Training Details
| Component | Detail |
|---|---|
| Base model | mistralai/Ministral-8B-Instruct-2410 |
| Method | QLoRA (r=4, alpha=8) |
| Annotator | Mistral Large 3 |
| Training data | TruthfulQA + TriviaQA + FEVER + Synthetic |
| Epochs | 1 |
| Hardware | RTX 3070 8GB |
| Token accuracy | 82.8% |
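At rank r=4 the LoRA adapter is tiny relative to the 8B base, which is what makes training on an 8 GB card feasible. A back-of-the-envelope sketch of why (the 4096 width and the choice of two adapted projections per layer are hypothetical placeholders, not Ministral-8B's actual architecture):

```python
def lora_params(d_in, d_out, r):
    """A LoRA adapter for a (d_out x d_in) linear layer adds two low-rank
    matrices: A of shape (r, d_in) and B of shape (d_out, r)."""
    return r * d_in + d_out * r

# Hypothetical: rank-4 adapters on two 4096x4096 projections per layer
per_layer = lora_params(4096, 4096, r=4) * 2
print(per_layer)  # 65536 trainable parameters per layer -- a rounding error vs 8B
```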
## Live Demo

Try it at: HuggingFace Spaces (Penumbra)
Built for the Mistral Hackathon 2026.