Pnyx Habermas - Discourse Legibility Model (v3)

Named after Jürgen Habermas's Theory of Communicative Action (1981). This model makes discourse structure legible by extracting two validity dimensions from text.

Model

  • Base: cross-encoder/nli-deberta-v3-small (141M params)
  • Format: ONNX, FP16 (271 MB)
  • Performance: F1 0.974 (0.977 claim risk, 0.972 argument quality)
  • Inference: ONNX Runtime Web (WASM) for in-browser use

Outputs

Two binary classification heads:

Head               Validity Claim            Description
claim_risk         Wahrheit (Truth)          Are unsupported assertions present?
argument_quality   Richtigkeit (Rightness)   Is reasoning/evidence present?

Apply softmax to each head's logits. The [1] index gives the positive class probability.
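As a sketch, the softmax and positive-class lookup for a single head can be done in plain JavaScript (the logit values here are made up for illustration):

```javascript
// Numerically stable softmax over one head's [negative, positive] logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Index [1] is the positive-class probability.
const probs = softmax([0.3, 1.2]); // illustrative logits
console.log(probs[1].toFixed(2)); // ≈ 0.71
```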

Usage

import * as ort from 'onnxruntime-web';
import { AutoTokenizer } from '@huggingface/transformers';

// Load the tokenizer from the Hub and the ONNX model (WASM backend in the browser).
const tokenizer = await AutoTokenizer.from_pretrained('onblueroses/pnyx-habermas');
const session = await ort.InferenceSession.create('model.onnx');

// Tokenize; transformers.js returns tensors with int64 (BigInt64Array) data and dims.
const { input_ids, attention_mask } = tokenizer(text, {
  padding: true, truncation: true, max_length: 256,
});

// Wrap the tokenizer output in int64 ONNX Runtime tensors and run inference.
const output = await session.run({
  input_ids: new ort.Tensor('int64', input_ids.data, input_ids.dims),
  attention_mask: new ort.Tensor('int64', attention_mask.data, attention_mask.dims),
});
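To turn the raw run output into scores, apply softmax per head and take index [1]. A self-contained sketch operating on plain logit arrays — in the real pipeline these would come from the output tensors (check session.outputNames for the actual head names; the comments below use hypothetical ones):

```javascript
// Numerically stable softmax over one head's [negative, positive] logits.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// Map each head's logits to its positive-class probability.
// e.g. claimRiskLogits = Array.from(output.claim_risk_logits.data)
// (the output name is hypothetical — inspect session.outputNames).
function interpretHeads(claimRiskLogits, argumentQualityLogits) {
  return {
    claim_risk: softmax(claimRiskLogits)[1],             // P(unsupported assertions present)
    argument_quality: softmax(argumentQualityLogits)[1], // P(reasoning/evidence present)
  };
}

// Illustrative logits: low claim risk, high argument quality.
const scores = interpretHeads([1.5, -0.4], [-2.0, 2.0]);
```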

Training

  • 10K balanced samples (2,500 per cell of the 2×2 label grid) + 453 boundary examples (5x oversampled)
  • Focal loss (gamma=2) + label smoothing (0.05)
  • 5 epochs, lr=5e-6, batch size 32
  • Trained on T4 GPU via Modal
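The loss used above can be sketched as binary focal loss over label-smoothed targets. This is an illustrative reconstruction, not the training code (which presumably ran in Python); only the gamma and smoothing values come from the list above:

```javascript
// Binary focal loss with label smoothing (sketch for one prediction).
// gamma=2 down-weights easy, confident examples; smoothing=0.05 softens
// the hard 0/1 targets to 0.025/0.975.
function focalLoss(p, target, gamma = 2, smoothing = 0.05) {
  const y = target * (1 - smoothing) + smoothing / 2; // smoothed target
  return -(
    y * Math.pow(1 - p, gamma) * Math.log(p) +
    (1 - y) * Math.pow(p, gamma) * Math.log(1 - p)
  );
}

// A confident correct prediction is penalized much less than an uncertain one.
const easy = focalLoss(0.9, 1);
const hard = focalLoss(0.5, 1);
```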

Part of Pnyx

This model powers the SEE layer of Pnyx, a listening infrastructure for public discourse built for the Agora Hackathon x TUM.ai E-Lab (April 2026).
