lmprobe: Linear Probe on Qwen2.5-1.5B

Truth probe for negated statements of the form 'The city of X is not in Y'. Near-perfect accuracy (99.7%): negation is properly encoded in truth representations at the 1.5B scale.

Classes

  • 0: false_statement
  • 1: true_statement

Usage

from lmprobe import LinearProbe

probe = LinearProbe.from_hub("latent-lab/neg-cities-truth-qwen2.5-1.5b", trust_classifier=True)
predictions = probe.predict(["your text here"])

Probe Details

  • Base model: Qwen/Qwen2.5-1.5B
  • Model revision: 8faed761d45a263340a0528343f099c05c9a4323
  • Layers: all (0–27, 28 layers)
  • Pooling: last_token
  • Classifier: logistic_regression
  • Task: classification
  • Random state: 42
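The configuration above (last-token pooling over hidden states, followed by a logistic-regression classifier) can be sketched on synthetic activations. The shapes and helper names below are illustrative assumptions, not lmprobe internals; real features would come from Qwen2.5-1.5B layer outputs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(42)  # matches the probe's random_state

# Synthetic stand-in for hidden states: (batch, seq_len, hidden_dim).
hidden_dim = 16
n_per_class = 50

def fake_activations(n, shift):
    # Give the last token a class-dependent shift along one direction,
    # mimicking a truth feature in the residual stream.
    acts = rng.normal(size=(n, 8, hidden_dim))
    acts[:, -1, 0] += shift
    return acts

true_acts = fake_activations(n_per_class, +3.0)   # class 1: true_statement
false_acts = fake_activations(n_per_class, -3.0)  # class 0: false_statement

# Last-token pooling: keep only the final position's hidden state.
X = np.concatenate([false_acts[:, -1, :], true_acts[:, -1, :]])
y = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])

clf = LogisticRegression(random_state=42).fit(X, y)
print(clf.score(X, y))  # linearly separable synthetic data
```

On real activations the same recipe applies per layer; lmprobe reports the pooled result across layers 0-27.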

Evaluation

  Metric      Value
  accuracy    0.9967
  auroc       1.0000
  f1          0.9967
  precision   0.9934
  recall      1.0000
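As a sanity check on the table, F1 is the harmonic mean of precision and recall, so the reported F1 follows from the other two numbers (a quick arithmetic check, not part of lmprobe):

```python
precision, recall = 0.9934, 1.0000
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # 0.9967, matching the reported value
```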

Training Data

  • Positive examples: 598
  • Negative examples: 598
  • Positive hash: sha256:d56c622bb238b4fc7fe6af316ea83bda26ddbafa8b2abd69d12339578e3ddce3
  • Negative hash: sha256:1e025516c05fc715dd18c40041035caee2e30fe91596e7e04422963e5b56f46a
  • Evaluation samples: 300
  • Evaluation hash: sha256:d9cce3adc1ba4e9c7401399afb3e403c6dd3f9fca232d6fbb927c63cd2f079e4
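The hashes above can be used to check that local dataset copies match what the probe was trained on. The filenames below are placeholders, since the card does not specify the on-disk layout:

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 and return 'sha256:<hex>'."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return "sha256:" + h.hexdigest()

# Hypothetical local files; compare against the hashes listed above.
expected = {
    "positive.jsonl": "sha256:d56c622bb238b4fc7fe6af316ea83bda26ddbafa8b2abd69d12339578e3ddce3",
    "negative.jsonl": "sha256:1e025516c05fc715dd18c40041035caee2e30fe91596e7e04422963e5b56f46a",
}
for name, digest in expected.items():
    try:
        status = "OK" if sha256_of(name) == digest else "MISMATCH"
    except FileNotFoundError:
        status = "missing (placeholder path)"
    print(name, status)
```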

Reproducibility

  • lmprobe version: 0.5.8
  • Python: 3.12.3
  • PyTorch: 2.10.0+cu128
  • scikit-learn: 1.8.0
  • transformers: 5.3.0
