ARA technique
Hello, llmfan46, can you create a v2 version of Qwen3.5-9B? Because the current one has high KL divergence, I would like to see an improved version if it doesn't require a lot of time.
Hello, llmfan46, can you create a v2 version of Qwen3.5-9B?
Hello,
Yes I can.
Because the current one has high KL divergence, I would like to see an improved version if it doesn't require a lot of time.
As it turns out, KL divergence is not an accurate measurement of model quality. At this point, KL divergence should just be used as a pointer, not as a measure of quality. This is exemplified by the UGI Leaderboard, where some Heretic versions have higher KL divergence yet perform better than models with considerably lower KL divergence.
So basically, lowest KL divergence ≠ best model.
So if you want a v2 just to get an ARA version, I can do that. But if you want a v2 because you think a lower KL divergence will automatically make the model better than the one that is currently uploaded, then that might not necessarily be the case.
Let me know.
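For intuition, the metric being discussed is, at each token position, the standard KL(P_original || Q_abliterated) over the next-token distribution, averaged over evaluation prompts. A toy sketch (the three-token vocabulary and probabilities here are made up purely for illustration):

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for two discrete next-token distributions, in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Two hypothetical abliterated models compared against the original:
original = [0.7, 0.2, 0.1]
model_a  = [0.6, 0.3, 0.1]
model_b  = [0.5, 0.3, 0.2]

print(kl_divergence(original, model_a) < kl_divergence(original, model_b))  # -> True
```

A lower number only says model_a's output distribution drifted less from the original; it says nothing about whether the drift that did happen hurt or helped actual task performance, which is the point made above.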
Sorry for answering so late. I'm actually trying to create my own Qwen3.5-9B Heretic version, so I don't need a version from anyone else.
I was only able to get MPOA working. Can you tell me how to make the MPOA + SOMA or ARA techniques work, please? Maybe something needs to be changed in config.toml. Also, if possible, can you tell me how many trials you are using?
I was only able to get MPOA working. Can you tell me how to make the MPOA + SOMA or ARA techniques work, please?
If you want SOMA you need to clone the fork: https://github.com/kabachuha/heretic/tree/som
For ARA you need to grab the ARA branch: https://github.com/p-e-w/heretic/tree/ara
Also, if possible, can you tell me how many trials you are using?
This is completely variable and changes from model to model and between abliteration techniques and settings.
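As for config.toml: if you want to pin the trial counts rather than rely on the defaults, the two parameter names that come up later in this thread are n_start_trials and n_trials. A hedged sketch (the exact keys, sections, and defaults may differ between Heretic versions and branches, so check your own config.toml):

```toml
# Hypothetical config.toml fragment; verify against your Heretic version.
n_start_trials = 30  # random exploration trials before the optimizer narrows in
n_trials = 180       # total trials for the whole run
```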
Sorry for bothering, but I can't use ARA. It just gives me an error every time I try to use it to decensor Qwen3.5 models. I use only CPU and RAM. Can you help me if you have free time?
Hmm, the issue is that support for ARA with the Qwen3.5 MoE architecture is kind of spotty right now. As a matter of fact, I tried to do Qwen3.5 35B A3B with ARA the other day since someone requested it, but despite applying multiple patches to get the model to actually run with ARA, I couldn't get good results and there were plenty of issues. So I am not sure right now that running ARA on the Qwen3.5 MoE architecture will work as intended and yield usable results. I was able to do Qwen3.5 MoE with MPOA and MPOA+SOMA, though, but MoE with ARA is not really usable right now. ARA works with Qwen3.5 27B, but that one is not MoE, it's a dense model.
I can't do Qwen3.5-9B with ARA, and it's not MoE as far as I know. Same with Qwen3.5-0.8B.
I tried a lot of ways to do it, but nothing works.
Qwen3.5-9B uses a hybrid architecture combining Gated Delta Networks and Gated Attention in an 8×(3×DeltaNet→FFN→1×Attention→FFN) pattern.
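That layout, expanded into a flat per-layer list, looks like this (a sketch based purely on the pattern stated above; the layer names are illustrative, not the actual module names):

```python
def qwen35_9b_layer_pattern(blocks=8):
    """Expand the 8x(3xDeltaNet->FFN -> 1xAttention->FFN) pattern into a flat list."""
    layers = []
    for _ in range(blocks):
        for _ in range(3):
            layers += ["deltanet", "ffn"]  # three DeltaNet sublayers, each with an FFN
        layers += ["attention", "ffn"]     # then one attention sublayer with an FFN
    return layers

pattern = qwen35_9b_layer_pattern()
print(len(pattern), pattern.count("deltanet"), pattern.count("attention"))  # -> 64 24 8
```

So only 8 of the 32 mixing sublayers are standard attention, which is presumably why tooling written against ordinary transformer layouts can trip over it.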
I haven't tested 9B on ARA, but I know that 27B works on ARA.
If you cannot do it, I can try to do an ARA version of 9B, if ARA actually runs on it.
Let me know.
Yeah, I would like you to test it. If you are successful, I would like to know the cmd commands you used to do it and what you have installed.
The cmd command is just "heretic modelfolderlocation", that's it, plus the Heretic ARA branch.
Yeah, I tried that, but it constantly gives me an error. I tried updating transformers, heretic, Python, etc., but nothing fixed the error.
Well, like I said, Qwen3.5 is a new architecture, support is kind of spotty right now, and ARA is still being worked on.
So I just tried, and Qwen3.5 9B works with ARA with no issues, so I'll be doing the Heretication then.
Can you tell me how you did it? I really don't understand how to use it.
My main folder is on disk D; inside this folder are heretic, Qwen3.5-9B, and config.toml.
I run heretic "Qwen3.5-9B" through cmd in the main folder and run into an error.
I want to know because I want to do the same with Llama, Mistral, etc. architecture models.
No, it should be "heretic D:\Qwen3.5-9B".
I'm getting this error when trying to run it. Do you know how it can be fixed?
D:\heretic-ara>heretic Qwen3.5-9B
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.2.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic
No GPU or other accelerator detected. Operations will be slow.
Loading model Qwen3.5-9B...
Trying dtype auto... Failed (The checkpoint you are trying to load has model type qwen3_5 but Transformers does not recognize this
architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of
date.
You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the
checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get
the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git)
Trying dtype float16... Failed (The checkpoint you are trying to load has model type qwen3_5 but Transformers does not recognize this
architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of
date.
You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the
checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get
the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git)
Trying dtype bfloat16... Failed (The checkpoint you are trying to load has model type qwen3_5 but Transformers does not recognize this
architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of
date.
You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the
checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get
the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git)
Trying dtype float32... Failed (The checkpoint you are trying to load has model type qwen3_5 but Transformers does not recognize this
architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of
date.
You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the
checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get
the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in _run_module_as_main:198 │
│ in _run_code:88 │
│ │
│ in <module>:5                                                                                     │
│ │
│ 2 from heretic.main import main │
│ 3 if __name__ == '__main__':                                                                      │
│ 4 │ sys.argv[0] = sys.argv[0].removesuffix('.exe') │
│ ❱ 5 │ sys.exit(main()) │
│ 6 │
│ │
│ D:\heretic-ara\heretic-repo\src\heretic\main.py:977 in main │
│ │
│ 974 │ install() │
│ 975 │ │
│ 976 │ try: │
│ ❱ 977 │ │ run() │
│ 978 │ except BaseException as error: │
│ 979 │ │ # Transformers appears to handle KeyboardInterrupt (or BaseException) │
│ 980 │ │ # internally in some places, which can re-raise a different error in the handler │
│ │
│ D:\heretic-ara\heretic-repo\src\heretic\main.py:313 in run │
│ │
│ 310 │ │ elif choice is None or choice == "": │
│ 311 │ │ │ return │
│ 312 │ │
│ ❱ 313 │ model = Model(settings) │
│ 314 │ print() │
│ 315 │ print_memory_usage() │
│ 316 │
│ │
│ D:\heretic-ara\heretic-repo\src\heretic\model.py:164 in __init__                                  │
│ │
│ 161 │ │ │ break │
│ 162 │ │ │
│ 163 │ │ if self.model is None: │
│ ❱ 164 │ │ │ raise Exception("Failed to load model with all configured dtypes.") │
│ 165 │ │ │
│ 166 │ │ if not settings.use_ara: │
│ 167 │ │ │ self._apply_lora() │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
Exception: Failed to load model with all configured dtypes.
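For what it's worth, the loop visible in the log (trying auto, float16, bfloat16, and float32 in turn before giving up) boils down to this pattern; a sketch of the idea, not Heretic's actual code:

```python
def load_with_dtype_fallback(load_fn, dtypes=("auto", "float16", "bfloat16", "float32")):
    """Try each dtype in turn; `load_fn` stands in for the real model loader (hypothetical)."""
    for dtype in dtypes:
        try:
            return load_fn(dtype)
        except Exception:
            continue  # fall through to the next dtype
    raise Exception("Failed to load model with all configured dtypes.")
```

In this case every attempt failed for the same reason (the installed Transformers not recognizing the qwen3_5 model type), so no dtype could ever succeed.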
I have no idea. Why don't you ask the developer of Heretic?
Okay, I will do that right now. I thought I had gotten something wrong with my file.
By the way, it doesn't let me use heretic Qwen3.5-9B if the model isn't in the same folder as the heretic folder.
What?
I asked Grok to help me. It gave me this:
cd /d D:\heretic-ara
:: this deletes the old folder completely
rmdir /s /q heretic-repo
git clone https://github.com/p-e-w/heretic.git heretic-repo
cd heretic-repo
python -m pip uninstall -y heretic-llm transformers huggingface-hub accelerate
python -m pip install git+https://github.com/huggingface/transformers.git
python -m pip install --upgrade accelerate huggingface-hub
python -m pip install -e . --no-deps
After that it finally works, but I don't know if it's the ARA or the main branch now.
In the logs, does it say "Arbitrary-Rank Ablation" in every trial? If so, it's ARA.
I ran it with ARA and got two great results:
KL divergence: 0.0240
Refusals: 6/100
KL divergence: 0.0241
Refusals: 4/100
Unfortunately I can not upload them, since I reached my Hugging Face storage limit, and in order to get more upload storage I need to pay for Hugging Face Pro. Do you think you might be able to donate 10 bucks to at least cover one month of Hugging Face Pro? If you do, I will be able to pay for one month of Hugging Face Pro and get an increase to 10TB of upload storage, which would allow me to upload the ARA model's safetensors and GGUFs.
🚨⚠️ I HAVE REACHED HUGGING FACE'S FREE STORAGE LIMIT ⚠️🚨
I can no longer upload new models unless I can cover the cost of additional storage.
I host 70+ free models as an independent contributor and this work is unpaid.
Without your support, no more new models can be uploaded.
🎉 Patreon (Monthly) | ☕ Ko-fi (One-time)
Every contribution goes directly toward Hugging Face storage fees to keep models free for everyone.
Yeah, I'm running it as well right now. I'm on trial 8; 360 trials would take me over 120 hours, hah, but I plan to do only 180. I suppose I will have to purchase a GPU soon. I'm doing Qwen3.5-0.8B now, but I made a critical mistake, as I see that the ARA time required for one trial doesn't scale down with model size; anyway, I will let it run.
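The arithmetic behind that estimate, using the numbers from the message above:

```python
# If 360 trials would take a bit over 120 hours, each trial is ~20 minutes,
# so cutting the run to 180 trials lands around 60 hours.
def total_hours(minutes_per_trial, n_trials):
    return minutes_per_trial * n_trials / 60

print(total_hours(20, 360), total_hours(20, 180))  # -> 120.0 60.0
```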
About support: I would have supported you, but in my country the dollar is very expensive, which means I would have to put in ~4x more effort to pay for the subscription. And the second reason: both Patreon and Ko-fi are blocked in my country, and creating a foreign card takes too much time, sorry.
Patreon and Ko-Fi are blocked in Austria?
Anyway the issue is that I ran out of storage to upload models, see proof here:
Apparently according to this:
https://huggingface.co/docs/hub/storage-limits
It says:
Hugging Face PRO: up to 10TB included* + add-on ✅ (grants available for impactful work†)
† In some cases, additional storage grants are available for high-impact open-source work where a paid plan genuinely cannot cover the need. Contact us with evidence of community impact (likes, downloads, citations).
So I emailed Hugging Face asking for a storage grant and mentioned that my uploads have generated over 200,000 downloads, that I have 161 subscribers, etc. I got a reply that basically told me to pay for Hugging Face PRO to get more storage, and that I could delete models I already uploaded to make room. But I am not going to delete models, because I uploaded them for a reason, and even if I deleted some I would eventually run out of storage again anyway, since it's limited to 8.7TB. So I sent another email asking again, but they didn't reply.
The issue is that Hugging Face PRO would cost me $108 per year, which would come out of my own pocket for work that is already unpaid and that I spend hours upon hours doing every day. I still want to release models; the issue is that I need Hugging Face PRO to do that, and it would be nice if people would at least help me cover the storage fees so that I can upload more models. If you can not spare $9 per month, which I understand, it would still help if you chipped in and covered even half of the cost of a Hugging Face PRO membership; instead of spending $108 per year I would spend $54 per year, which would definitely help!
Right now I have a Qwen3.5 9B model done with ARA, with safetensors and GGUFs ready to go as soon as I can get more storage room to upload via Hugging Face PRO. It seems to be a very good model according to the data:
Ultra Uncensored Heretic v2: refusals 4/100, KL divergence 0.0241
MMLU test results with batch size 64. (MMLU = Massive Multitask Language Understanding: ~14,000 multiple-choice questions across 57 subjects such as math, history, law, and medicine.)
Original:
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7861 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.7039 | ± | 0.0063 |
| - formal_logic | 1 | none | 0 | acc | ↑ | 0.6587 | ± | 0.0424 |
| - high_school_european_history | 1 | none | 0 | acc | ↑ | 0.8667 | ± | 0.0265 |
| - high_school_us_history | 1 | none | 0 | acc | ↑ | 0.9069 | ± | 0.0204 |
| - high_school_world_history | 1 | none | 0 | acc | ↑ | 0.9030 | ± | 0.0193 |
| - international_law | 1 | none | 0 | acc | ↑ | 0.9008 | ± | 0.0273 |
| - jurisprudence | 1 | none | 0 | acc | ↑ | 0.8426 | ± | 0.0352 |
| - logical_fallacies | 1 | none | 0 | acc | ↑ | 0.8466 | ± | 0.0283 |
| - moral_disputes | 1 | none | 0 | acc | ↑ | 0.8064 | ± | 0.0213 |
| - moral_scenarios | 1 | none | 0 | acc | ↑ | 0.5307 | ± | 0.0167 |
| - philosophy | 1 | none | 0 | acc | ↑ | 0.8071 | ± | 0.0224 |
| - prehistory | 1 | none | 0 | acc | ↑ | 0.8364 | ± | 0.0206 |
| - professional_law | 1 | none | 0 | acc | ↑ | 0.6030 | ± | 0.0125 |
| - world_religions | 1 | none | 0 | acc | ↑ | 0.8655 | ± | 0.0262 |
| - other | 2 | none | | acc | ↑ | 0.8297 | ± | 0.0064 |
| - business_ethics | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - clinical_knowledge | 1 | none | 0 | acc | ↑ | 0.8566 | ± | 0.0216 |
| - college_medicine | 1 | none | 0 | acc | ↑ | 0.8150 | ± | 0.0296 |
| - global_facts | 1 | none | 0 | acc | ↑ | 0.5200 | ± | 0.0502 |
| - human_aging | 1 | none | 0 | acc | ↑ | 0.7892 | ± | 0.0274 |
| - management | 1 | none | 0 | acc | ↑ | 0.8641 | ± | 0.0339 |
| - marketing | 1 | none | 0 | acc | ↑ | 0.9573 | ± | 0.0133 |
| - medical_genetics | 1 | none | 0 | acc | ↑ | 0.9100 | ± | 0.0288 |
| - miscellaneous | 1 | none | 0 | acc | ↑ | 0.9017 | ± | 0.0106 |
| - nutrition | 1 | none | 0 | acc | ↑ | 0.8660 | ± | 0.0195 |
| - professional_accounting | 1 | none | 0 | acc | ↑ | 0.6525 | ± | 0.0284 |
| - professional_medicine | 1 | none | 0 | acc | ↑ | 0.9044 | ± | 0.0179 |
| - virology | 1 | none | 0 | acc | ↑ | 0.5663 | ± | 0.0386 |
| - social sciences | 2 | none | | acc | ↑ | 0.8690 | ± | 0.0060 |
| - econometrics | 1 | none | 0 | acc | ↑ | 0.7368 | ± | 0.0414 |
| - high_school_geography | 1 | none | 0 | acc | ↑ | 0.9242 | ± | 0.0189 |
| - high_school_government_and_politics | 1 | none | 0 | acc | ↑ | 0.9637 | ± | 0.0135 |
| - high_school_macroeconomics | 1 | none | 0 | acc | ↑ | 0.8538 | ± | 0.0179 |
| - high_school_microeconomics | 1 | none | 0 | acc | ↑ | 0.9286 | ± | 0.0167 |
| - high_school_psychology | 1 | none | 0 | acc | ↑ | 0.9303 | ± | 0.0109 |
| - human_sexuality | 1 | none | 0 | acc | ↑ | 0.8626 | ± | 0.0302 |
| - professional_psychology | 1 | none | 0 | acc | ↑ | 0.8317 | ± | 0.0151 |
| - public_relations | 1 | none | 0 | acc | ↑ | 0.7455 | ± | 0.0417 |
| - security_studies | 1 | none | 0 | acc | ↑ | 0.7673 | ± | 0.0270 |
| - sociology | 1 | none | 0 | acc | ↑ | 0.8856 | ± | 0.0225 |
| - us_foreign_policy | 1 | none | 0 | acc | ↑ | 0.9000 | ± | 0.0302 |
| - stem | 2 | none | | acc | ↑ | 0.7846 | ± | 0.0070 |
| - abstract_algebra | 1 | none | 0 | acc | ↑ | 0.6700 | ± | 0.0473 |
| - anatomy | 1 | none | 0 | acc | ↑ | 0.7778 | ± | 0.0359 |
| - astronomy | 1 | none | 0 | acc | ↑ | 0.9276 | ± | 0.0211 |
| - college_biology | 1 | none | 0 | acc | ↑ | 0.9375 | ± | 0.0202 |
| - college_chemistry | 1 | none | 0 | acc | ↑ | 0.5900 | ± | 0.0494 |
| - college_computer_science | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - college_mathematics | 1 | none | 0 | acc | ↑ | 0.6400 | ± | 0.0482 |
| - college_physics | 1 | none | 0 | acc | ↑ | 0.6569 | ± | 0.0472 |
| - computer_security | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - conceptual_physics | 1 | none | 0 | acc | ↑ | 0.8979 | ± | 0.0198 |
| - electrical_engineering | 1 | none | 0 | acc | ↑ | 0.8276 | ± | 0.0315 |
| - elementary_mathematics | 1 | none | 0 | acc | ↑ | 0.8095 | ± | 0.0202 |
| - high_school_biology | 1 | none | 0 | acc | ↑ | 0.9355 | ± | 0.0140 |
| - high_school_chemistry | 1 | none | 0 | acc | ↑ | 0.7734 | ± | 0.0295 |
| - high_school_computer_science | 1 | none | 0 | acc | ↑ | 0.8800 | ± | 0.0327 |
| - high_school_mathematics | 1 | none | 0 | acc | ↑ | 0.5333 | ± | 0.0304 |
| - high_school_physics | 1 | none | 0 | acc | ↑ | 0.7152 | ± | 0.0368 |
| - high_school_statistics | 1 | none | 0 | acc | ↑ | 0.7870 | ± | 0.0279 |
| - machine_learning | 1 | none | 0 | acc | ↑ | 0.6786 | ± | 0.0443 |
| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7861 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.7039 | ± | 0.0063 |
| - other | 2 | none | | acc | ↑ | 0.8297 | ± | 0.0064 |
| - social sciences | 2 | none | | acc | ↑ | 0.8690 | ± | 0.0060 |
| - stem | 2 | none | | acc | ↑ | 0.7846 | ± | 0.0070 |
Heretic:
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7841 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.7027 | ± | 0.0063 |
| - formal_logic | 1 | none | 0 | acc | ↑ | 0.6508 | ± | 0.0426 |
| - high_school_european_history | 1 | none | 0 | acc | ↑ | 0.8848 | ± | 0.0249 |
| - high_school_us_history | 1 | none | 0 | acc | ↑ | 0.8873 | ± | 0.0222 |
| - high_school_world_history | 1 | none | 0 | acc | ↑ | 0.9072 | ± | 0.0189 |
| - international_law | 1 | none | 0 | acc | ↑ | 0.9008 | ± | 0.0273 |
| - jurisprudence | 1 | none | 0 | acc | ↑ | 0.8426 | ± | 0.0352 |
| - logical_fallacies | 1 | none | 0 | acc | ↑ | 0.8344 | ± | 0.0292 |
| - moral_disputes | 1 | none | 0 | acc | ↑ | 0.8208 | ± | 0.0206 |
| - moral_scenarios | 1 | none | 0 | acc | ↑ | 0.5140 | ± | 0.0167 |
| - philosophy | 1 | none | 0 | acc | ↑ | 0.8135 | ± | 0.0221 |
| - prehistory | 1 | none | 0 | acc | ↑ | 0.8457 | ± | 0.0201 |
| - professional_law | 1 | none | 0 | acc | ↑ | 0.6037 | ± | 0.0125 |
| - world_religions | 1 | none | 0 | acc | ↑ | 0.8713 | ± | 0.0257 |
| - other | 2 | none | | acc | ↑ | 0.8259 | ± | 0.0065 |
| - business_ethics | 1 | none | 0 | acc | ↑ | 0.8400 | ± | 0.0368 |
| - clinical_knowledge | 1 | none | 0 | acc | ↑ | 0.8491 | ± | 0.0220 |
| - college_medicine | 1 | none | 0 | acc | ↑ | 0.7977 | ± | 0.0306 |
| - global_facts | 1 | none | 0 | acc | ↑ | 0.4900 | ± | 0.0502 |
| - human_aging | 1 | none | 0 | acc | ↑ | 0.7892 | ± | 0.0274 |
| - management | 1 | none | 0 | acc | ↑ | 0.8544 | ± | 0.0349 |
| - marketing | 1 | none | 0 | acc | ↑ | 0.9487 | ± | 0.0145 |
| - medical_genetics | 1 | none | 0 | acc | ↑ | 0.9000 | ± | 0.0302 |
| - miscellaneous | 1 | none | 0 | acc | ↑ | 0.8966 | ± | 0.0109 |
| - nutrition | 1 | none | 0 | acc | ↑ | 0.8627 | ± | 0.0197 |
| - professional_accounting | 1 | none | 0 | acc | ↑ | 0.6702 | ± | 0.0280 |
| - professional_medicine | 1 | none | 0 | acc | ↑ | 0.9044 | ± | 0.0179 |
| - virology | 1 | none | 0 | acc | ↑ | 0.5602 | ± | 0.0386 |
| - social sciences | 2 | none | | acc | ↑ | 0.8658 | ± | 0.0060 |
| - econometrics | 1 | none | 0 | acc | ↑ | 0.7281 | ± | 0.0419 |
| - high_school_geography | 1 | none | 0 | acc | ↑ | 0.9242 | ± | 0.0189 |
| - high_school_government_and_politics | 1 | none | 0 | acc | ↑ | 0.9637 | ± | 0.0135 |
| - high_school_macroeconomics | 1 | none | 0 | acc | ↑ | 0.8590 | ± | 0.0176 |
| - high_school_microeconomics | 1 | none | 0 | acc | ↑ | 0.9328 | ± | 0.0163 |
| - high_school_psychology | 1 | none | 0 | acc | ↑ | 0.9248 | ± | 0.0113 |
| - human_sexuality | 1 | none | 0 | acc | ↑ | 0.8550 | ± | 0.0309 |
| - professional_psychology | 1 | none | 0 | acc | ↑ | 0.8301 | ± | 0.0152 |
| - public_relations | 1 | none | 0 | acc | ↑ | 0.7273 | ± | 0.0427 |
| - security_studies | 1 | none | 0 | acc | ↑ | 0.7469 | ± | 0.0278 |
| - sociology | 1 | none | 0 | acc | ↑ | 0.8905 | ± | 0.0221 |
| - us_foreign_policy | 1 | none | 0 | acc | ↑ | 0.8900 | ± | 0.0314 |
| - stem | 2 | none | | acc | ↑ | 0.7846 | ± | 0.0070 |
| - abstract_algebra | 1 | none | 0 | acc | ↑ | 0.6700 | ± | 0.0473 |
| - anatomy | 1 | none | 0 | acc | ↑ | 0.7926 | ± | 0.0350 |
| - astronomy | 1 | none | 0 | acc | ↑ | 0.9276 | ± | 0.0211 |
| - college_biology | 1 | none | 0 | acc | ↑ | 0.9306 | ± | 0.0213 |
| - college_chemistry | 1 | none | 0 | acc | ↑ | 0.6100 | ± | 0.0490 |
| - college_computer_science | 1 | none | 0 | acc | ↑ | 0.8100 | ± | 0.0394 |
| - college_mathematics | 1 | none | 0 | acc | ↑ | 0.6300 | ± | 0.0485 |
| - college_physics | 1 | none | 0 | acc | ↑ | 0.6176 | ± | 0.0484 |
| - computer_security | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - conceptual_physics | 1 | none | 0 | acc | ↑ | 0.8936 | ± | 0.0202 |
| - electrical_engineering | 1 | none | 0 | acc | ↑ | 0.8276 | ± | 0.0315 |
| - elementary_mathematics | 1 | none | 0 | acc | ↑ | 0.8042 | ± | 0.0204 |
| - high_school_biology | 1 | none | 0 | acc | ↑ | 0.9323 | ± | 0.0143 |
| - high_school_chemistry | 1 | none | 0 | acc | ↑ | 0.7783 | ± | 0.0292 |
| - high_school_computer_science | 1 | none | 0 | acc | ↑ | 0.8700 | ± | 0.0338 |
| - high_school_mathematics | 1 | none | 0 | acc | ↑ | 0.5407 | ± | 0.0304 |
| - high_school_physics | 1 | none | 0 | acc | ↑ | 0.7285 | ± | 0.0363 |
| - high_school_statistics | 1 | none | 0 | acc | ↑ | 0.7963 | ± | 0.0275 |
| - machine_learning | 1 | none | 0 | acc | ↑ | 0.6964 | ± | 0.0436 |
| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7841 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.7027 | ± | 0.0063 |
| - other | 2 | none | | acc | ↑ | 0.8259 | ± | 0.0065 |
| - social sciences | 2 | none | | acc | ↑ | 0.8658 | ± | 0.0060 |
| - stem | 2 | none | | acc | ↑ | 0.7846 | ± | 0.0070 |
As you can see, it has low refusals and low KL divergence, with MMLU results very close to the original.
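A quick sanity check on that claim: the drop in overall MMLU accuracy (0.7861 to 0.7841) is small compared to the reported standard errors, so the two scores are statistically indistinguishable (a back-of-the-envelope check, treating the two runs as independent):

```python
import math

# Overall mmlu rows from the tables above
orig_acc, orig_se = 0.7861, 0.0033
heretic_acc, heretic_se = 0.7841, 0.0033

diff = orig_acc - heretic_acc
combined_se = math.sqrt(orig_se**2 + heretic_se**2)
print(abs(diff) < 2 * combined_se)  # -> True: well within two combined standard errors
```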
Patreon and Ko-Fi are blocked in Austria?
Oh okay. Actually, in real life I'm from Russia; I just like Austria.
Do you have any advice that could make the Heretic process faster? Also, thank you for the Heretic Qwen3.5-9B; I want to use it for finetuning. I will try doing it myself first, but I think I will use yours if mine takes even longer. And I want to know how many trials you usually use (n_start_trials and n_trials), if possible.
Do you think my current trial is good? Is it still within n_start_trials? Here it is:
Running trial 9 of 360...
- Parameters:
- start_layer_index = 7
- end_layer_index = 23
- preserve_good_behavior_weight = 0.1983
- steer_bad_behavior_weight = 0.0030
- overcorrect_relative_weight = 0.9453
- neighbor_count = 9
- Reloading model...
Loading weights: 100%|████████████████████████████████████████████████████████████| 473/473 [00:00<00:00, 14666.05it/s]
- Abliterating (Arbitrary-Rank Ablation)...
- Evaluating...
- Running PIQA benchmark...
100%|████████████████████████████████████████████████████████████████████████████| 1838/1838 [00:01<00:00, 1218.07it/s]
Running loglikelihood requests: 100%|██████████████████████████████████████████████| 3676/3676 [09:11<00:00, 6.67it/s]
fatal: not a git repository (or any of the parent directories): .git
- PIQA acc_norm: 0.6937
- Counting model refusals...
- Refusals: 4/100
- Running PIQA benchmark...
I didn't set n_start_trials in my config.toml.
Do you have any advice that could make the Heretic process faster?
Pretty sure you already know the answer to that: as you mentioned a few times, you have no GPU. That's why it's so slow; for speed you need to have everything in VRAM.
Also, thank you for the Heretic Qwen3.5-9B; I want to use it for finetuning. I will try doing it myself first, but I think I will use yours if mine takes even longer.
I didn't upload it. I can not upload it because I ran out of storage (see the screenshot in my previous message showing that my currently allotted storage is maxed out). That's why I need to subscribe to Hugging Face PRO: to get more storage so I can upload it. The problem is that I need donations to help me pay for the subscription.
Hello, sorry for bothering again. How are you making GGUF versions of your models? I'm trying to create a GGUF version of my Heretic Qwen3.5-0.8B and failing, mostly on the MMPROJ. Are you using the basic llama.cpp converter?
It's failing on mmproj most likely because Heretic does not copy the required files over. Copy and paste video_preprocessor_config.json and preprocessor_config.json into your Heretic output folder and it should work then.
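A small helper for that copy step, in case it's easier than doing it by hand (the file names come from the message above; the paths are placeholders for your own folders):

```python
import shutil
from pathlib import Path

def copy_preprocessor_configs(src_dir, dst_dir):
    """Copy the preprocessor configs from the original model folder into the
    Heretic output folder, skipping any file the source model doesn't have."""
    copied = []
    for name in ("preprocessor_config.json", "video_preprocessor_config.json"):
        src = Path(src_dir) / name
        if src.exists():
            shutil.copy2(src, Path(dst_dir) / name)
            copied.append(name)
    return copied

# e.g. copy_preprocessor_configs("D:/Qwen3.5-0.8B", "D:/Qwen3.5-0.8B-Heretic")
```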
It refuses to let me do it; here is the error:
INFO:hf-to-gguf:Loading model: Qwen3.5-0.8B-Heretic
ERROR:hf-to-gguf:Model Qwen3_5ForConditionalGeneration is not supported
If you know how to fix it, I would like to know.
P.S. Switching from the ik_llama.cpp converter to the llama.cpp one made it work; everything goes smoothly now.
Good. The error is telling you what the issue is, too: the Qwen3_5 architecture is not (yet?) supported in ik_llama.cpp.


