PyTorch
Transformers
English
confidence-cartography
interpretability
causal-lm
confidence-calibration
mandela-effect
false-belief-detection
teacher-forcing
rho-eval
alignment
rho-guided-sft
contrastive-loss
calibration-repair
behavioral-audit
steering-vectors
mechanistic-interpretability
fidelity-bench
pythia
llama
mistral
qwen
gpt2
Eval Results (legacy)