Eval request: p-e-w/gpt-oss-20b-heretic-ara-v4 (new uncensoring technique)

#634
by p-e-w - opened

https://huggingface.co/p-e-w/gpt-oss-20b-heretic-ara-v4

This is the latest iteration of ARA development (https://github.com/p-e-w/heretic/pull/211), which will become the default abliteration engine in a future version of Heretic. It combines:

  • Arbitrary-Rank Ablation (abliteration through matrix optimization)
  • Row-norm preservation (inspired by MPOA)
  • TPE optimization for the PIQA benchmark score rather than the Kullback–Leibler divergence, which I have experimentally found to correlate more strongly with intelligence benchmarks
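For readers unfamiliar with the first two bullets, here is a minimal, hedged sketch of directional ablation with row-norm preservation on a toy matrix. This is illustrative only, not Heretic's actual ARA code (which optimizes over arbitrary-rank matrices rather than a single direction); `ablate_direction` and the toy values are made up for this example.

```python
# Illustrative sketch: rank-1 directional ablation with row-norm
# preservation. NOT Heretic's implementation; names and values are
# invented for demonstration.
import math

def ablate_direction(W, r):
    """Remove each row's component along unit direction r, then
    rescale the row back to its original L2 norm (the row-norm
    preservation idea inspired by MPOA)."""
    out = []
    for row in W:
        orig_norm = math.sqrt(sum(x * x for x in row))
        dot = sum(x * y for x, y in zip(row, r))          # projection onto r
        proj = [x - dot * y for x, y in zip(row, r)]      # subtract it
        new_norm = math.sqrt(sum(x * x for x in proj))
        if new_norm > 0:                                  # restore row norm
            proj = [x * orig_norm / new_norm for x in proj]
        out.append(proj)
    return out

# Toy example: ablate the x-axis direction from a 2x2 weight matrix.
W = [[3.0, 4.0], [1.0, 0.0]]
r = [1.0, 0.0]  # unit-norm "refusal" direction (illustrative)
W2 = ablate_direction(W, r)
# W2[0] == [0.0, 5.0]: orthogonal to r, but same norm as [3.0, 4.0]
```

The key point the sketch shows: after ablation, every row is orthogonal to the ablated direction, yet keeps its original magnitude, which is meant to limit collateral damage to the model's capabilities.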

Hey @DontPlanToEnd, results on this would be very valuable to the community.
These new techniques are under very active development and testing.
p-e-w is legit

I also would like to see this evaluated.

Hmmm. For some reason this model is giving me garbled outputs like:

, t all - - ,,,? s.

 P,= H..? The for? ad

 e I time,, due end, – we, ., - do to, or,, be as,??, ,b...,,  
, 

 (., maybe ',. do

 :; present? etc,  ., T,: ,, a :, A... and try ? but , try. or.

 O,, ok, and :

 from,,,, -, (: ..,? ... ( p.),  to, ( .2., ,
 ... a [ na. , (."

OooBaby, I love it when AI talks dirty to me like that 😂

@DontPlanToEnd

Thanks for alerting me to this. It appears that the model was corrupted by Transformers on upload. I can't even load it with the latest Transformers version (shape mismatch error). 😠 😠 😠
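One way to debug a shape mismatch like this without loading the full model is to read the tensor shapes straight out of the safetensors header and compare them against the config. A hedged sketch (the file path is illustrative; `safetensors_shapes` is a helper invented here, relying only on the documented safetensors on-disk layout: an 8-byte little-endian header length followed by a JSON header):

```python
# Illustrative helper: list tensor shapes from a .safetensors file
# without loading any weights. Useful for spotting which tensor's
# shape disagrees with the model config.
import json
import struct

def safetensors_shapes(path):
    """Return {tensor_name: shape} parsed from the safetensors JSON
    header (first 8 bytes = little-endian header length)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    return {name: info["shape"]
            for name, info in header.items()
            if name != "__metadata__"}

# Usage (illustrative filename):
# for name, shape in sorted(safetensors_shapes(
#         "model-00001-of-00002.safetensors").items()):
#     print(name, shape)
```

Comparing the output for the uploaded shards against a local copy that loads correctly would pinpoint which tensors were altered in transit.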

Which is super unfortunate because the model worked perfectly during testing. This has never happened before with Heretic AFAIK.

Closing this until I figure out what the problem is. Apologies for wasting your time.

p-e-w changed discussion status to closed
