Changelog

All notable changes to OBLITERATUS are documented here. Format follows Keep a Changelog.

[0.1.2] - 2026-03-03

Fixed

  • Fixed spaces.GPU AttributeError crash on HuggingFace Spaces — fallback now catches both ImportError and AttributeError so the Space gracefully degrades to CPU mode when ZeroGPU is unavailable
  • Added missing hardware: zero-a10g to HF Space metadata (hf-spaces/README.md) — required for the spaces package to expose the @spaces.GPU decorator
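The fallback described in the first fix can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the name `gpu` and the example function `generate` are hypothetical, and the sketch only assumes that the `spaces` package, when available, exposes a `GPU` decorator.

```python
try:
    import spaces  # HuggingFace ZeroGPU helper; typically only present on Spaces

    gpu = spaces.GPU  # raises AttributeError if the decorator is not exposed
except (ImportError, AttributeError):

    def gpu(func=None, **_kwargs):
        """No-op stand-in (hypothetical) so decorated code still runs on CPU.

        Supports both the bare form `@gpu` and the called form `@gpu(...)`.
        """
        if callable(func):
            return func
        return lambda f: f


@gpu
def generate(prompt: str) -> str:
    # Placeholder body; a real Space would run model inference here.
    return prompt.upper()
```

Catching both exceptions in one `except` clause covers the case where the package is missing entirely and the case where it imports but does not provide the decorator.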

Improved

  • Added mypy type checking to CI pipeline (continue-on-error while baseline is established)
  • Added mypy to dev dependencies
  • Version bump to 0.1.2 across pyproject.toml and __init__.py

[0.1.1] - 2026-03-01

Fixed

  • Fixed all broken imports (missing function exports in telemetry, evaluation, analysis modules)
  • Resolved all ruff lint errors across the codebase
  • Corrected GitHub org name in all documentation and configuration files
  • Updated test count in README to match actual collectible tests
  • Softened overclaim language in documentation and paper

Improved

  • Added test coverage reporting (pytest-cov) to CI pipeline
  • Added USER directive and HEALTHCHECK to Dockerfile for security best practices
  • Synchronized requirements.txt with pyproject.toml dependencies
  • Removed duplicate THEORY_JOURNAL.md from docs
  • Hyperlinked all arXiv references in README
  • Added Pliny the Prompter attribution

[0.1.0] - 2026-02-27

Added

  • 15 analysis modules for mechanistic interpretability of refusal mechanisms
  • Analysis-informed pipeline (informed method) — closed-loop feedback from analysis to abliteration
  • Ouroboros compensation — automatic detection and compensation for self-repair after excision
  • Steering vectors — reversible inference-time guardrail removal (Turner et al. / Rimsky et al.)
  • Community contribution system: --contribute flag and obliteratus aggregate for crowdsourced results
  • 47 curated model presets across 5 compute tiers (CPU to multi-GPU)
  • 10 study presets for reproducible ablation experiments
  • 4 ablation strategies: layer removal, head pruning, FFN ablation, embedding ablation
  • 4 abliteration methods: basic, advanced, aggressive, informed
  • Web dashboard (docs/index.html) with config builder, model browser, results visualizer
  • Gradio playground (app.py) — one-click obliteration + chat in the browser
  • Colab notebook for zero-install usage
  • Evaluation suite: refusal rate, perplexity, coherence, KL divergence, CKA, effective rank
  • lm-eval-harness integration for standardized benchmarking
  • Reproducibility framework with deterministic seeds and full metadata logging
  • Telemetry (opt-in only, anonymized, allowlisted fields)
  • 823 tests across 28 test files (incl. CLI dispatch, shared fixtures)
  • Research paper (paper/main.tex) with geometric theory of refusal removal
  • Dual license: AGPL-3.0 + commercial
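As an illustration of one metric named in the evaluation suite above, KL divergence between two discrete probability distributions can be computed as below. This is a generic textbook sketch, not OBLITERATUS's implementation; the function name `kl_divergence` and the `eps` floor are assumptions made for the example.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) in nats for two discrete distributions.

    `p` and `q` are equal-length sequences of probabilities. Terms where
    p_i == 0 contribute nothing; q_i is floored at `eps` (an assumption of
    this sketch) to avoid division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )


# Example: compare a uniform distribution with a skewed one.
baseline = [0.5, 0.5]
modified = [0.9, 0.1]
divergence = kl_divergence(baseline, modified)
```

A value near zero indicates the two distributions are nearly identical. Larger values indicate greater drift.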

Security

  • trust_remote_code defaults to False — users must explicitly opt in
  • All temporary paths use tempfile.gettempdir() for cross-platform safety
  • Telemetry never collects model names, prompt content, file paths, or PII
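Two of the practices above, allowlisted telemetry fields and temporary paths built from `tempfile.gettempdir()`, can be sketched as follows. The field names in `ALLOWED_FIELDS` and the helpers `scrub` and `scratch_path` are hypothetical; the changelog does not list the project's actual allowlist or function names.

```python
import os
import tempfile

# Hypothetical allowlist: only these keys may ever leave the machine.
ALLOWED_FIELDS = frozenset({"method", "num_layers", "refusal_rate"})


def scrub(record: dict) -> dict:
    """Keep only allowlisted keys; everything else is dropped by default."""
    return {key: value for key, value in record.items() if key in ALLOWED_FIELDS}


def scratch_path(name: str) -> str:
    """Build a temp-file path under the platform's temp directory."""
    return os.path.join(tempfile.gettempdir(), name)


raw = {
    "method": "basic",
    "refusal_rate": 0.12,
    "model_name": "example/model",  # not allowlisted, so it is removed
    "prompt": "hello",              # not allowlisted, so it is removed
}
safe = scrub(raw)
```

An allowlist fails closed: a newly added field is excluded until it is explicitly approved, whereas a blocklist would send it by default.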