---
title: Bio Over-Refusal Explorer
emoji: 🧬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.9.1
python_version: "3.11"
app_file: app.py
pinned: false
license: cc-by-nc-sa-4.0
short_description: Browse 201 expert-annotated biology queries + 9-model FPR
tags:
  - ai-safety
  - biosafety
  - llm-evaluation
  - over-refusal
  - calibration
---

# Bio Over-Refusal Explorer

Static data browser for the [Bio Over-Refusal Dataset v0.1.0](https://huggingface.co/datasets/jang1563/bio-overrefusal-v0.1): 201 domain-expert-authored biology research queries stratified by sensitivity tier, with 9-model false-positive refusal rates (FPR) and Wilson 95% confidence intervals.

**No model API calls happen at runtime.** This Space loads pre-computed evaluation results from the dataset and lets you browse them by tier, subdomain, and legitimacy. Provider names are reported as observed; numbers should be read as a slice-level calibration signal for this specific biology-research benchmark, not as a global model-quality ranking.
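
The kind of browsing described above amounts to filtering a static table. A minimal sketch, assuming hypothetical field names (`query_id`, `tier`, `subdomain`, `legitimate`) and placeholder rows; the actual dataset schema and the logic in `app.py` may differ:

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class QueryRecord:
    # Hypothetical fields for illustration; not the dataset's real schema.
    query_id: str
    tier: int          # sensitivity tier, 1 to 5
    subdomain: str
    legitimate: bool


def filter_records(
    records: Iterable[QueryRecord],
    tier: Optional[int] = None,
    subdomain: Optional[str] = None,
    legitimate: Optional[bool] = None,
) -> list[QueryRecord]:
    """Keep records matching every filter that is not None."""
    kept = []
    for rec in records:
        if tier is not None and rec.tier != tier:
            continue
        if subdomain is not None and rec.subdomain != subdomain:
            continue
        if legitimate is not None and rec.legitimate != legitimate:
            continue
        kept.append(rec)
    return kept


# Placeholder rows, not taken from the dataset.
SAMPLE = [
    QueryRecord("q1", 1, "placeholder-a", True),
    QueryRecord("q2", 3, "placeholder-a", True),
    QueryRecord("q3", 3, "placeholder-b", False),
]

tier3_legit = filter_records(SAMPLE, tier=3, legitimate=True)
```

Because every filter defaults to `None`, leaving a control unset simply skips that condition, which matches the "browse by any combination" behavior described here.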

## What you can do here

1. **Browse queries**: Filter the 201 queries by tier (1–5), subdomain (10), and legitimacy. Click a row to see the full record (biological reasoning, legitimate contexts, citations, danger-shift contexts).
2. **Compare models**: See the 9-model FPR table with Wilson 95% CIs. Switch between strict and broad FPR.
3. **Per-tier breakdown**: See how each model's FPR varies across the 5 sensitivity tiers.
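
For readers unfamiliar with the Wilson interval used in the FPR table, the sketch below shows the standard Wilson score formula for a binomial proportion. It is an illustration only: the function name and the example counts are assumptions, and the dataset's own evaluation code may compute the interval differently (for example, via a statistics library).

```python
import math


def wilson_interval(refusals: int, total: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval for a proportion refusals / total.

    z = 1.959964 corresponds to a two-sided 95% interval.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= refusals <= total:
        raise ValueError("refusals must be between 0 and total")

    p_hat = refusals / total
    z2 = z * z
    denom = 1.0 + z2 / total
    center = (p_hat + z2 / (2 * total)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / total + z2 / (4 * total * total)) / denom
    # Clamp to [0, 1] to absorb floating-point rounding at the boundaries.
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical counts: 0 refusals out of 10 legitimate queries.
low, high = wilson_interval(0, 10)
```

Unlike the simple normal-approximation interval, the Wilson interval stays inside [0, 1] and gives a non-zero upper bound even when no refusals are observed, which matters for small per-tier sample sizes.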

## Source artifacts

- 📊 Dataset: [jang1563/bio-overrefusal-v0.1](https://huggingface.co/datasets/jang1563/bio-overrefusal-v0.1)
- 💻 Code + reproducibility: [github.com/jang1563/bio-overrefusal-v0.1](https://github.com/jang1563/bio-overrefusal-v0.1)
- 📋 Safety scope: [SAFETY.md](https://github.com/jang1563/bio-overrefusal-v0.1/blob/main/SAFETY.md)

## Position in the safety stack

This dataset is a **calibration measurement**, not a deployed mitigation. It complements rather than replaces capability evaluations (e.g. WMDP, biothreat-eval), constitutional/classifier safeguards, and red-team work. This work is independent and does not represent any provider's internal evaluation pipeline.