ARA version for 9B/27B/35B-A3B?
p-e-w recently introduced the new Arbitrary Rank Ablation (ARA) method to Heretic: https://www.reddit.com/r/LocalLLaMA/comments/1rnic0a/heretic_has_finally_defeated_gptoss_with_a_new/
I've heard concerns about your abliterated Qwen3.5s (maybe it was a different model, I forget) having trouble with long context. Considering ARA dropped gpt-oss-20b from 74/100 refusals to 3/100 at the same KL divergence, I can only imagine how it would do on Qwen3.5.
> p-e-w recently introduced the new Arbitrary Rank Ablation (ARA) method to Heretic: https://www.reddit.com/r/LocalLLaMA/comments/1rnic0a/heretic_has_finally_defeated_gptoss_with_a_new/
I am aware.
> I've heard concerns about your abliterated Qwen3.5s (maybe it was a different one I forgot) having trouble with long context.
From what people reported, it was related to NSFW content. The issue arose on models that reached 0/100 refusals with MPOA and SOMA. After receiving a few reports of the same thing, I tested the same models but with refusals at 2-4/100 and found no issue, so I decided to delete the Qwen3.5 models that had 0/100.
> Considering it dropped gpt-oss-20b from 74/100 to 3/100 refusals at the same KLD, I can only imagine how it would do on Qwen3.5
I already did the last few models that I released with ARA; see:
https://huggingface.co/llmfan46/gemma-3-12b-it-heretic
https://huggingface.co/llmfan46/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled-heretic
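For anyone wondering what the "arbitrary rank" part means mechanically, here is a rough sketch of the general idea behind this family of ablation methods (this is an illustration, not p-e-w's actual Heretic/ARA code, and the function name `ablate_subspace` is made up for the example): classic abliteration projects a single "refusal direction" out of a weight matrix, while a rank-r variant removes a whole r-dimensional subspace at once.

```python
import numpy as np

def ablate_subspace(W: np.ndarray, directions: np.ndarray) -> np.ndarray:
    """Remove a rank-r 'refusal' subspace from a weight matrix.

    Illustrative sketch only, not Heretic's implementation.
    W          : (d_out, d_in) weight matrix writing into the residual stream
    directions : (r, d_out) rows spanning the subspace to ablate
    """
    # Orthonormalize the refusal directions; the columns of Q span the subspace.
    Q, _ = np.linalg.qr(directions.T)      # (d_out, r)
    P = Q @ Q.T                            # orthogonal projector onto the subspace
    # Subtract the component of W's output that lies in the subspace.
    return W - P @ W

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))
dirs = rng.standard_normal((2, 8))         # rank-2 ablation; rank 1 = classic abliteration
W_abl = ablate_subspace(W, dirs)

# The ablated weights now write nothing into the removed subspace:
Q, _ = np.linalg.qr(dirs.T)
print(np.abs(Q.T @ W_abl).max())           # ~0 up to float rounding
```

With `directions` a single row this reduces to the usual single-direction abliteration; allowing r > 1 (or picking r per layer) is the kind of generalization the "arbitrary rank" name suggests.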
Interesting, you should definitely do more with ARA, especially for the other base Qwen3.5s, because 21/100 refusals on the Opus 4.6 distill is quite a bit more than I would've hoped for lol
More refusals with very slightly worse KL divergence: https://huggingface.co/llmfan46/Qwen3.5-27B-heretic-v3
Stick with llmfan46/Qwen3.5-27B-heretic-v2.
There was a post made on the v3 comments section that caught my interest.
( https://huggingface.co/llmfan46/Qwen3.5-27B-heretic-v3/discussions/1 )
"My experience and testing so far actually indicates that this model is quite perceptive and remains creatively capable even compared to the original, without being overly tripped up by self-disclaimers; and actually gains performance on diagnostic eq_bench whereas v2 loses about 15 points on the diagnostic eq_bench (which is rather rough). They seem great. In terms of getting Qwen 3.5 27B out of their own way, looks like a good place to start."
The assertion seems to be that v3 is better for multi-turn emotional intelligence, in spite of having slightly higher KL divergence. I'm curious about your thoughts on the matter.
Yes, I did notice similar patterns based on the UGI leaderboard, which you can find here: https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
Some models with higher refusals and higher KL divergence can score/perform better than models with lower refusals and lower KL divergence. One of my theories is that different ablation methods work differently and some are more aggressive than others: the more refusals you remove, the more you risk getting into "over-ablation" territory and removing things that hurt the model's capabilities. Maybe something that is not really a refusal gets wrongly identified as one and ablated, and as you go further below 10/100 refusals, the risk of that happening grows?
For example, according to the UGI Leaderboard, the difference in quality between the 0/100-refusals version (https://huggingface.co/llmfan46/GLM-4.7-Flash-ultra-uncensored-heretic) and the 2/100-refusals version (https://huggingface.co/llmfan46/GLM-4.7-Flash-ultimate-uncensored-heretic) is quite big. The KL divergence difference between the two is only 0.0051, which is basically nothing, and yet the 0/100 version performs noticeably worse than the 2/100 version.
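For reference on what a number like 0.0051 measures: the KL divergence reported by ablation tools compares the original model's next-token distribution against the ablated model's, averaged over held-out prompts. A minimal sketch of that computation from raw logits (illustrative only; the logits themselves and any batching/prompt pipeline are assumed, and this is not Heretic's actual code):

```python
import numpy as np

def kl_divergence(logits_orig: np.ndarray, logits_abl: np.ndarray) -> float:
    """Mean KL(P_orig || P_abl) over positions, from raw logits.

    Both arrays: (num_positions, vocab_size). Illustrative sketch.
    """
    def log_softmax(x):
        # Subtract the max first for numerical stability.
        x = x - x.max(axis=-1, keepdims=True)
        return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

    logp = log_softmax(logits_orig)
    logq = log_softmax(logits_abl)
    p = np.exp(logp)
    # KL per position, then average across positions.
    return float((p * (logp - logq)).sum(axis=-1).mean())

# Identical logits give zero divergence; a small perturbation gives a small positive value.
rng = np.random.default_rng(1)
base = rng.standard_normal((16, 100))
print(kl_divergence(base, base))        # 0.0
print(kl_divergence(base, base + 0.01 * rng.standard_normal((16, 100))) > 0)  # True
```

The intuition behind the thread's observation is that this average can stay tiny even when the ablation has damaged a few specific behaviors, since KL over all tokens and positions dilutes localized damage.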
The UGI (Uncensored General Intelligence) Leaderboard confirms that v3 is better than v2: https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard
I think I am gonna do an ARA v3 for Qwen3.5-35B-A3B; it could have some interesting results.