Eval request: p-e-w/gpt-oss-20b-heretic-ara-v4 (new uncensoring technique)
https://huggingface.co/p-e-w/gpt-oss-20b-heretic-ara-v4
This is the latest iteration of ARA development (https://github.com/p-e-w/heretic/pull/211), which will become the default abliteration engine in a future version of Heretic. It combines:
- Arbitrary-Rank Ablation (abliteration through matrix optimization)
- Row-norm preservation (inspired by MPOA)
- TPE optimization targeting the PIQA benchmark score rather than the Kullback–Leibler divergence; I have experimentally found PIQA to correlate more strongly with intelligence benchmarks
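To illustrate the idea behind the first two components, here is a minimal sketch of rank-k directional ablation with row-norm preservation. This is not Heretic's actual implementation; the function name, the NumPy formulation, and the use of QR to orthonormalize the ablation directions are all my assumptions for illustration.

```python
import numpy as np

def ablate_with_row_norm_preservation(W, directions):
    """Illustrative sketch (not Heretic's code): remove a rank-k subspace
    from a weight matrix, then rescale each row to its original norm.

    W:          (n, d) weight matrix
    directions: (k, d) ablation directions (e.g. refusal directions)
    """
    orig_norms = np.linalg.norm(W, axis=1, keepdims=True)

    # Orthonormalize the k directions so we can project cleanly.
    Q, _ = np.linalg.qr(directions.T)  # Q: (d, k), orthonormal columns

    # Remove the component of every row that lies in span(Q).
    W_ablated = W - (W @ Q) @ Q.T

    # Row-norm preservation: rescale each row back to its original norm.
    # (Scaling a row by a scalar keeps it orthogonal to the ablated subspace.)
    new_norms = np.linalg.norm(W_ablated, axis=1, keepdims=True)
    return W_ablated * (orig_norms / np.maximum(new_norms, 1e-12))
```

Projecting out the subspace shrinks row norms, which can degrade the model; rescaling restores the original per-row magnitudes while keeping the ablated directions removed.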
Hey @DontPlanToEnd, results on this would be very valuable to the community.
These techniques are under very active development and testing.
✅ p-e-w is legit
I would also like to see this evaluated.
Hmmm. For some reason this model is giving me garbled outputs like:

```
, t all - - ,,,? s.
P,= H..? The for? ad
e I time,, due end, – we, ., - do to, or,, be as,??, ,b...,,
,
(., maybe ',. do
:; present? etc, ., T,: ,, a :, A... and try ? but , try. or.
O,, ok, and :
from,,,, -, (: ..,? ... ( p.), to, ( .2., ,
... a [ na. , (."
```
OooBaby, I love it when AI talks dirty to me like that 😂
Thanks for alerting me to this. It appears that the model has been corrupted by Transformers on upload. I can't even load the model at all with the latest Transformers version (shape mismatch error). 😠 😠 😠
This is super unfortunate, because the model worked perfectly during testing. This has never happened before with Heretic AFAIK.
Closing this until I figure out what the problem is. Apologies for wasting your time.