ARA technique
Hello, llmfan46, can you create a v2 version of Qwen3.5-9B? Because the current one has high KL divergence, I would like to see an improved version if it doesn't require a lot of time.
Hello, llmfan46, can you create a v2 version of Qwen3.5-9B?
Hello,
Yes I can.
Because the current one has high KL divergence, I would like to see an improved version if it doesn't require a lot of time.
As it turns out, KL divergence is not an accurate measurement of model quality. At this point, KL divergence should just be used as a pointer, not as a measure of quality. This is exemplified by the UGI Leaderboard, where some Heretic versions have higher KL divergence yet perform better than models with considerably lower KL divergence.
So basically, lowest KL divergence ≠ best model.
So if you want a v2 just to get an ARA version, I can do that. But if you want a v2 because you think a lower KL divergence will automatically make the model better than the one that is currently uploaded, then that might not necessarily be the case.
Let me know.
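For intuition, the metric being discussed is, at each token position, the standard KL(P_original || Q_abliterated) over the next-token distribution, averaged over evaluation prompts. A toy sketch (the three-token vocabulary and probabilities here are made up purely for illustration):

```python
import math

def kl_divergence(p, q):
    """KL(P || Q) for two discrete next-token distributions, in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Two hypothetical abliterated models compared against the original:
original = [0.7, 0.2, 0.1]
model_a  = [0.6, 0.3, 0.1]
model_b  = [0.5, 0.3, 0.2]

print(kl_divergence(original, model_a) < kl_divergence(original, model_b))  # -> True
```

A lower number only says model_a's output distribution drifted less from the original; it says nothing about whether the drift that did happen hurt or helped actual task performance, which is the point made above.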
Sorry for answering so late. I'm actually trying to create my own Qwen3.5-9B Heretic version, so I don't need a version from anyone else.
I was only able to get MPOA working. Can you tell me how to make the MPOA + SOMA or ARA techniques work, please? Maybe something needs to be changed in config.toml. Also, if possible, can you tell me how many trials you are using?
I was only able to get MPOA working. Can you tell me how to make the MPOA + SOMA or ARA techniques work, please?
If you want SOMA you need to clone the fork: https://github.com/kabachuha/heretic/tree/som
For ARA you need to grab the ARA branch: https://github.com/p-e-w/heretic/tree/ara
Also, if possible, can you tell me how many trials you are using?
This is completely variable and changes from model to model and between abliteration techniques and settings.
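As for config.toml: if you want to pin the trial counts rather than rely on the defaults, the two parameter names that come up later in this thread are n_start_trials and n_trials. A hedged sketch (the exact keys, sections, and defaults may differ between Heretic versions and branches, so check your own config.toml):

```toml
# Hypothetical config.toml fragment; verify against your Heretic version.
n_start_trials = 30  # random exploration trials before the optimizer narrows in
n_trials = 180       # total trials for the whole run
```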
Sorry for bothering, but I can't use ARA. It just gives me an error every time I try to use it to decensor Qwen3.5 models. I use only CPU and RAM. Can you help me if you have free time?
Hmm, the issue is that support for ARA with the Qwen3.5 MoE architecture is kind of spotty right now. As a matter of fact, I tried to do Qwen3.5 35B A3B with ARA the other day since someone requested it, but despite applying multiple patches to get the model to actually run with ARA, I couldn't get good results and there were plenty of issues. So I am not sure right now that running ARA on the Qwen3.5 MoE architecture will work as intended and yield usable results. I was able to do Qwen3.5 MoE with MPOA and MPOA+SOMA, though, but MoE with ARA is not really usable right now. ARA works with Qwen3.5 27B, but that one is not MoE, it's a dense model.
I can't do Qwen3.5-9B with ARA, and it's not MoE as far as I know. Same with Qwen3.5-0.8B.
I tried a lot of ways to do it, but nothing works.
Qwen3.5-9B uses a hybrid architecture combining Gated Delta Networks and Gated Attention in an 8×(3×DeltaNet→FFN→1×Attention→FFN) pattern.
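That layout, expanded into a flat per-layer list, looks like this (a sketch based purely on the pattern stated above; the layer names are illustrative, not the actual module names):

```python
def qwen35_9b_layer_pattern(blocks=8):
    """Expand the 8x(3xDeltaNet->FFN -> 1xAttention->FFN) pattern into a flat list."""
    layers = []
    for _ in range(blocks):
        for _ in range(3):
            layers += ["deltanet", "ffn"]  # three DeltaNet sublayers, each with an FFN
        layers += ["attention", "ffn"]     # then one attention sublayer with an FFN
    return layers

pattern = qwen35_9b_layer_pattern()
print(len(pattern), pattern.count("deltanet"), pattern.count("attention"))  # -> 64 24 8
```

So only 8 of the 32 mixing sublayers are standard attention, which is presumably why tooling written against ordinary transformer layouts can trip over it.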
I haven't tested 9B on ARA, but I know that 27B works on ARA.
If you cannot do it, I can try to do an ARA version of 9B, if ARA actually runs on it.
Let me know.
Yeah, I would like you to test it. If you are successful, I would like to know the cmd commands you used to do it and what you have installed.
The cmd command is just "heretic modelfolderlocation", that's it, plus the Heretic ARA branch.
Yeah, I tried that, but it constantly gives me an error. I tried updating transformers, heretic, Python, etc., but nothing fixed the error.
Well, like I said, Qwen3.5 is a new architecture, support is kind of spotty right now, and ARA is still being worked on.
So I just tried, and Qwen3.5 9B works with ARA with no issues, so I'll be doing the Heretication then.
Can you tell me how you did it? I really don't understand how to use it.
My main folder is on disk D; inside this folder are heretic, Qwen3.5-9B, and config.toml.
I run heretic "Qwen3.5-9B" through cmd in the main folder and run into an error.
I want to know because I want to do the same with Llama, Mistral, etc. architecture models.
No, it should be "heretic D:\Qwen3.5-9B".
I'm getting this error when trying to run it. Do you know how it can be fixed?
D:\heretic-ara>heretic Qwen3.5-9B
█░█░█▀▀░█▀▄░█▀▀░▀█▀░█░█▀▀ v1.2.0
█▀█░█▀▀░█▀▄░█▀▀░░█░░█░█░░
▀░▀░▀▀▀░▀░▀░▀▀▀░░▀░░▀░▀▀▀ https://github.com/p-e-w/heretic
No GPU or other accelerator detected. Operations will be slow.
Loading model Qwen3.5-9B...
Trying dtype auto... Failed (The checkpoint you are trying to load has model type qwen3_5 but Transformers does not recognize this
architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of
date.
You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the
checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get
the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git)
Trying dtype float16... Failed (The checkpoint you are trying to load has model type qwen3_5 but Transformers does not recognize this
architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of
date.
You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the
checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get
the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git)
Trying dtype bfloat16... Failed (The checkpoint you are trying to load has model type qwen3_5 but Transformers does not recognize this
architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of
date.
You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the
checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get
the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git)
Trying dtype float32... Failed (The checkpoint you are trying to load has model type qwen3_5 but Transformers does not recognize this
architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of
date.
You can update Transformers with the command pip install --upgrade transformers. If this does not work, and the
checkpoint is very new, then there may not be a release version that supports this model yet. In this case, you can get
the most up-to-date code by installing Transformers from source with the command pip install git+https://github.com/huggingface/transformers.git)
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ in _run_module_as_main:198 │
│ in _run_code:88 │
│ │
│ in <module>:5                                                                                     │
│ │
│ 2 from heretic.main import main │
│ 3 if __name__ == '__main__':                                                                      │
│ 4 │ sys.argv[0] = sys.argv[0].removesuffix('.exe') │
│ ❱ 5 │ sys.exit(main()) │
│ 6 │
│ │
│ D:\heretic-ara\heretic-repo\src\heretic\main.py:977 in main │
│ │
│ 974 │ install() │
│ 975 │ │
│ 976 │ try: │
│ ❱ 977 │ │ run() │
│ 978 │ except BaseException as error: │
│ 979 │ │ # Transformers appears to handle KeyboardInterrupt (or BaseException) │
│ 980 │ │ # internally in some places, which can re-raise a different error in the handler │
│ │
│ D:\heretic-ara\heretic-repo\src\heretic\main.py:313 in run │
│ │
│ 310 │ │ elif choice is None or choice == "": │
│ 311 │ │ │ return │
│ 312 │ │
│ ❱ 313 │ model = Model(settings) │
│ 314 │ print() │
│ 315 │ print_memory_usage() │
│ 316 │
│ │
│ D:\heretic-ara\heretic-repo\src\heretic\model.py:164 in __init__                                  │
│ │
│ 161 │ │ │ break │
│ 162 │ │ │
│ 163 │ │ if self.model is None: │
│ ❱ 164 │ │ │ raise Exception("Failed to load model with all configured dtypes.") │
│ 165 │ │ │
│ 166 │ │ if not settings.use_ara: │
│ 167 │ │ │ self._apply_lora() │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
Exception: Failed to load model with all configured dtypes.
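For what it's worth, the loop visible in the log (trying auto, float16, bfloat16, and float32 in turn before giving up) boils down to this pattern; a sketch of the idea, not Heretic's actual code:

```python
def load_with_dtype_fallback(load_fn, dtypes=("auto", "float16", "bfloat16", "float32")):
    """Try each dtype in turn; `load_fn` stands in for the real model loader (hypothetical)."""
    for dtype in dtypes:
        try:
            return load_fn(dtype)
        except Exception:
            continue  # fall through to the next dtype
    raise Exception("Failed to load model with all configured dtypes.")
```

In this case every attempt failed for the same reason (the installed Transformers not recognizing the qwen3_5 model type), so no dtype could ever succeed.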
I have no idea. Why don't you ask the developer of Heretic?
Okay, I will do that right now. I thought I had gotten something wrong with my file.
By the way, it doesn't let me use heretic Qwen3.5-9B if the model isn't in the same folder as the heretic folder.
What?
I asked Grok to help me. It gave me this:
cd /d D:\heretic-ara
:: this deletes the old folder completely
rmdir /s /q heretic-repo
git clone https://github.com/p-e-w/heretic.git heretic-repo
cd heretic-repo
python -m pip uninstall -y heretic-llm transformers huggingface-hub accelerate
python -m pip install git+https://github.com/huggingface/transformers.git
python -m pip install --upgrade accelerate huggingface-hub
python -m pip install -e . --no-deps
After that it finally works, but I don't know if it's the ARA or the main branch now.
In the logs, does it say "Arbitrary-Rank Ablation" in every trial? If so, it's ARA.
I ran it with ARA and got two great results:
KL divergence: 0.0240
Refusals: 6/100
KL divergence: 0.0241
Refusals: 4/100
Unfortunately I can not upload them, since I reached my Hugging Face storage limit, and in order to get more upload storage I need to pay for Hugging Face Pro. Do you think you might be able to donate 10 bucks to at least cover one month of Hugging Face Pro? If you do, I will be able to pay for one month of Hugging Face Pro and get an increase to 10TB of upload storage, which would allow me to upload the ARA model's safetensors and GGUFs.
🚨⚠️ I HAVE REACHED HUGGING FACE'S FREE STORAGE LIMIT ⚠️🚨
I can no longer upload new models unless I can cover the cost of additional storage.
I host 70+ free models as an independent contributor and this work is unpaid.
Without your support, no more new models can be uploaded.
🎉 Patreon (Monthly) | ☕ Ko-fi (One-time)
Every contribution goes directly toward Hugging Face storage fees to keep models free for everyone.
Yeah, I'm running it as well right now. I'm on trial 8; 360 trials would take me over 120 hours, hah, but I plan to do only 180. I suppose I will have to purchase a GPU soon. I'm doing Qwen3.5-0.8B now, but I made a critical mistake, as I see that the ARA time required for one trial doesn't scale down with model size; anyway, I will let it run.
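The arithmetic behind that estimate, using the numbers from the message above:

```python
# If 360 trials would take a bit over 120 hours, each trial is ~20 minutes,
# so cutting the run to 180 trials lands around 60 hours.
def total_hours(minutes_per_trial, n_trials):
    return minutes_per_trial * n_trials / 60

print(total_hours(20, 360), total_hours(20, 180))  # -> 120.0 60.0
```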
About support: I would have supported you, but in my country the dollar is very expensive, which means I would have to put in ~4x more effort to pay for the subscription. And the second reason: both Patreon and Ko-fi are blocked in my country, and creating a foreign card takes too much time, sorry.
Patreon and Ko-Fi are blocked in Austria?
Anyway the issue is that I ran out of storage to upload models, see proof here:
Apparently according to this:
https://huggingface.co/docs/hub/storage-limits
It says:
Hugging Face PRO: up to 10TB included* + add-on ✅ (grants available for impactful work†)
† In some cases, additional storage grants are available for high-impact open-source work where a paid plan genuinely cannot cover the need. Contact us with evidence of community impact (likes, downloads, citations).
So I emailed Hugging Face asking for a storage grant and mentioned that my uploads have generated over 200,000 downloads, that I have 161 subscribers, etc. I got a reply that basically told me to pay for Hugging Face PRO to get more storage, and that I could delete models I already uploaded to make room. But I am not going to delete models, because I uploaded them for a reason, and even if I deleted some I would eventually run out of storage again anyway, since it's limited to 8.7TB. So I sent another email asking again, but they didn't reply.
The issue is that Hugging Face PRO would cost me $108 per year, which would come out of my own pocket for work that is already unpaid and that I spend hours upon hours doing every day. I still want to release models; the issue is that I need Hugging Face PRO to do that, and it would be nice if people would at least help me cover the storage fees so that I can upload more models. If you can not spare $9 per month, which I understand, it would still help if you chipped in and covered even half of the cost of a Hugging Face PRO membership; instead of spending $108 per year I would spend $54 per year, which would definitely help!
Right now I have a Qwen3.5 9B model done with ARA, with safetensors and GGUFs ready to go as soon as I can get more storage room to upload via Hugging Face PRO. It seems to be a very good model according to the data:
Ultra Uncensored Heretic v2: refusals 4/100, KL divergence 0.0241
MMLU test results with batch size 64. (MMLU = Massive Multitask Language Understanding: ~14,000 multiple-choice questions across 57 subjects such as math, history, law, and medicine.)
Original:
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7861 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.7039 | ± | 0.0063 |
| - formal_logic | 1 | none | 0 | acc | ↑ | 0.6587 | ± | 0.0424 |
| - high_school_european_history | 1 | none | 0 | acc | ↑ | 0.8667 | ± | 0.0265 |
| - high_school_us_history | 1 | none | 0 | acc | ↑ | 0.9069 | ± | 0.0204 |
| - high_school_world_history | 1 | none | 0 | acc | ↑ | 0.9030 | ± | 0.0193 |
| - international_law | 1 | none | 0 | acc | ↑ | 0.9008 | ± | 0.0273 |
| - jurisprudence | 1 | none | 0 | acc | ↑ | 0.8426 | ± | 0.0352 |
| - logical_fallacies | 1 | none | 0 | acc | ↑ | 0.8466 | ± | 0.0283 |
| - moral_disputes | 1 | none | 0 | acc | ↑ | 0.8064 | ± | 0.0213 |
| - moral_scenarios | 1 | none | 0 | acc | ↑ | 0.5307 | ± | 0.0167 |
| - philosophy | 1 | none | 0 | acc | ↑ | 0.8071 | ± | 0.0224 |
| - prehistory | 1 | none | 0 | acc | ↑ | 0.8364 | ± | 0.0206 |
| - professional_law | 1 | none | 0 | acc | ↑ | 0.6030 | ± | 0.0125 |
| - world_religions | 1 | none | 0 | acc | ↑ | 0.8655 | ± | 0.0262 |
| - other | 2 | none | | acc | ↑ | 0.8297 | ± | 0.0064 |
| - business_ethics | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - clinical_knowledge | 1 | none | 0 | acc | ↑ | 0.8566 | ± | 0.0216 |
| - college_medicine | 1 | none | 0 | acc | ↑ | 0.8150 | ± | 0.0296 |
| - global_facts | 1 | none | 0 | acc | ↑ | 0.5200 | ± | 0.0502 |
| - human_aging | 1 | none | 0 | acc | ↑ | 0.7892 | ± | 0.0274 |
| - management | 1 | none | 0 | acc | ↑ | 0.8641 | ± | 0.0339 |
| - marketing | 1 | none | 0 | acc | ↑ | 0.9573 | ± | 0.0133 |
| - medical_genetics | 1 | none | 0 | acc | ↑ | 0.9100 | ± | 0.0288 |
| - miscellaneous | 1 | none | 0 | acc | ↑ | 0.9017 | ± | 0.0106 |
| - nutrition | 1 | none | 0 | acc | ↑ | 0.8660 | ± | 0.0195 |
| - professional_accounting | 1 | none | 0 | acc | ↑ | 0.6525 | ± | 0.0284 |
| - professional_medicine | 1 | none | 0 | acc | ↑ | 0.9044 | ± | 0.0179 |
| - virology | 1 | none | 0 | acc | ↑ | 0.5663 | ± | 0.0386 |
| - social sciences | 2 | none | | acc | ↑ | 0.8690 | ± | 0.0060 |
| - econometrics | 1 | none | 0 | acc | ↑ | 0.7368 | ± | 0.0414 |
| - high_school_geography | 1 | none | 0 | acc | ↑ | 0.9242 | ± | 0.0189 |
| - high_school_government_and_politics | 1 | none | 0 | acc | ↑ | 0.9637 | ± | 0.0135 |
| - high_school_macroeconomics | 1 | none | 0 | acc | ↑ | 0.8538 | ± | 0.0179 |
| - high_school_microeconomics | 1 | none | 0 | acc | ↑ | 0.9286 | ± | 0.0167 |
| - high_school_psychology | 1 | none | 0 | acc | ↑ | 0.9303 | ± | 0.0109 |
| - human_sexuality | 1 | none | 0 | acc | ↑ | 0.8626 | ± | 0.0302 |
| - professional_psychology | 1 | none | 0 | acc | ↑ | 0.8317 | ± | 0.0151 |
| - public_relations | 1 | none | 0 | acc | ↑ | 0.7455 | ± | 0.0417 |
| - security_studies | 1 | none | 0 | acc | ↑ | 0.7673 | ± | 0.0270 |
| - sociology | 1 | none | 0 | acc | ↑ | 0.8856 | ± | 0.0225 |
| - us_foreign_policy | 1 | none | 0 | acc | ↑ | 0.9000 | ± | 0.0302 |
| - stem | 2 | none | | acc | ↑ | 0.7846 | ± | 0.0070 |
| - abstract_algebra | 1 | none | 0 | acc | ↑ | 0.6700 | ± | 0.0473 |
| - anatomy | 1 | none | 0 | acc | ↑ | 0.7778 | ± | 0.0359 |
| - astronomy | 1 | none | 0 | acc | ↑ | 0.9276 | ± | 0.0211 |
| - college_biology | 1 | none | 0 | acc | ↑ | 0.9375 | ± | 0.0202 |
| - college_chemistry | 1 | none | 0 | acc | ↑ | 0.5900 | ± | 0.0494 |
| - college_computer_science | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - college_mathematics | 1 | none | 0 | acc | ↑ | 0.6400 | ± | 0.0482 |
| - college_physics | 1 | none | 0 | acc | ↑ | 0.6569 | ± | 0.0472 |
| - computer_security | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - conceptual_physics | 1 | none | 0 | acc | ↑ | 0.8979 | ± | 0.0198 |
| - electrical_engineering | 1 | none | 0 | acc | ↑ | 0.8276 | ± | 0.0315 |
| - elementary_mathematics | 1 | none | 0 | acc | ↑ | 0.8095 | ± | 0.0202 |
| - high_school_biology | 1 | none | 0 | acc | ↑ | 0.9355 | ± | 0.0140 |
| - high_school_chemistry | 1 | none | 0 | acc | ↑ | 0.7734 | ± | 0.0295 |
| - high_school_computer_science | 1 | none | 0 | acc | ↑ | 0.8800 | ± | 0.0327 |
| - high_school_mathematics | 1 | none | 0 | acc | ↑ | 0.5333 | ± | 0.0304 |
| - high_school_physics | 1 | none | 0 | acc | ↑ | 0.7152 | ± | 0.0368 |
| - high_school_statistics | 1 | none | 0 | acc | ↑ | 0.7870 | ± | 0.0279 |
| - machine_learning | 1 | none | 0 | acc | ↑ | 0.6786 | ± | 0.0443 |
| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7861 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.7039 | ± | 0.0063 |
| - other | 2 | none | | acc | ↑ | 0.8297 | ± | 0.0064 |
| - social sciences | 2 | none | | acc | ↑ | 0.8690 | ± | 0.0060 |
| - stem | 2 | none | | acc | ↑ | 0.7846 | ± | 0.0070 |
Heretic:
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7841 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.7027 | ± | 0.0063 |
| - formal_logic | 1 | none | 0 | acc | ↑ | 0.6508 | ± | 0.0426 |
| - high_school_european_history | 1 | none | 0 | acc | ↑ | 0.8848 | ± | 0.0249 |
| - high_school_us_history | 1 | none | 0 | acc | ↑ | 0.8873 | ± | 0.0222 |
| - high_school_world_history | 1 | none | 0 | acc | ↑ | 0.9072 | ± | 0.0189 |
| - international_law | 1 | none | 0 | acc | ↑ | 0.9008 | ± | 0.0273 |
| - jurisprudence | 1 | none | 0 | acc | ↑ | 0.8426 | ± | 0.0352 |
| - logical_fallacies | 1 | none | 0 | acc | ↑ | 0.8344 | ± | 0.0292 |
| - moral_disputes | 1 | none | 0 | acc | ↑ | 0.8208 | ± | 0.0206 |
| - moral_scenarios | 1 | none | 0 | acc | ↑ | 0.5140 | ± | 0.0167 |
| - philosophy | 1 | none | 0 | acc | ↑ | 0.8135 | ± | 0.0221 |
| - prehistory | 1 | none | 0 | acc | ↑ | 0.8457 | ± | 0.0201 |
| - professional_law | 1 | none | 0 | acc | ↑ | 0.6037 | ± | 0.0125 |
| - world_religions | 1 | none | 0 | acc | ↑ | 0.8713 | ± | 0.0257 |
| - other | 2 | none | | acc | ↑ | 0.8259 | ± | 0.0065 |
| - business_ethics | 1 | none | 0 | acc | ↑ | 0.8400 | ± | 0.0368 |
| - clinical_knowledge | 1 | none | 0 | acc | ↑ | 0.8491 | ± | 0.0220 |
| - college_medicine | 1 | none | 0 | acc | ↑ | 0.7977 | ± | 0.0306 |
| - global_facts | 1 | none | 0 | acc | ↑ | 0.4900 | ± | 0.0502 |
| - human_aging | 1 | none | 0 | acc | ↑ | 0.7892 | ± | 0.0274 |
| - management | 1 | none | 0 | acc | ↑ | 0.8544 | ± | 0.0349 |
| - marketing | 1 | none | 0 | acc | ↑ | 0.9487 | ± | 0.0145 |
| - medical_genetics | 1 | none | 0 | acc | ↑ | 0.9000 | ± | 0.0302 |
| - miscellaneous | 1 | none | 0 | acc | ↑ | 0.8966 | ± | 0.0109 |
| - nutrition | 1 | none | 0 | acc | ↑ | 0.8627 | ± | 0.0197 |
| - professional_accounting | 1 | none | 0 | acc | ↑ | 0.6702 | ± | 0.0280 |
| - professional_medicine | 1 | none | 0 | acc | ↑ | 0.9044 | ± | 0.0179 |
| - virology | 1 | none | 0 | acc | ↑ | 0.5602 | ± | 0.0386 |
| - social sciences | 2 | none | | acc | ↑ | 0.8658 | ± | 0.0060 |
| - econometrics | 1 | none | 0 | acc | ↑ | 0.7281 | ± | 0.0419 |
| - high_school_geography | 1 | none | 0 | acc | ↑ | 0.9242 | ± | 0.0189 |
| - high_school_government_and_politics | 1 | none | 0 | acc | ↑ | 0.9637 | ± | 0.0135 |
| - high_school_macroeconomics | 1 | none | 0 | acc | ↑ | 0.8590 | ± | 0.0176 |
| - high_school_microeconomics | 1 | none | 0 | acc | ↑ | 0.9328 | ± | 0.0163 |
| - high_school_psychology | 1 | none | 0 | acc | ↑ | 0.9248 | ± | 0.0113 |
| - human_sexuality | 1 | none | 0 | acc | ↑ | 0.8550 | ± | 0.0309 |
| - professional_psychology | 1 | none | 0 | acc | ↑ | 0.8301 | ± | 0.0152 |
| - public_relations | 1 | none | 0 | acc | ↑ | 0.7273 | ± | 0.0427 |
| - security_studies | 1 | none | 0 | acc | ↑ | 0.7469 | ± | 0.0278 |
| - sociology | 1 | none | 0 | acc | ↑ | 0.8905 | ± | 0.0221 |
| - us_foreign_policy | 1 | none | 0 | acc | ↑ | 0.8900 | ± | 0.0314 |
| - stem | 2 | none | | acc | ↑ | 0.7846 | ± | 0.0070 |
| - abstract_algebra | 1 | none | 0 | acc | ↑ | 0.6700 | ± | 0.0473 |
| - anatomy | 1 | none | 0 | acc | ↑ | 0.7926 | ± | 0.0350 |
| - astronomy | 1 | none | 0 | acc | ↑ | 0.9276 | ± | 0.0211 |
| - college_biology | 1 | none | 0 | acc | ↑ | 0.9306 | ± | 0.0213 |
| - college_chemistry | 1 | none | 0 | acc | ↑ | 0.6100 | ± | 0.0490 |
| - college_computer_science | 1 | none | 0 | acc | ↑ | 0.8100 | ± | 0.0394 |
| - college_mathematics | 1 | none | 0 | acc | ↑ | 0.6300 | ± | 0.0485 |
| - college_physics | 1 | none | 0 | acc | ↑ | 0.6176 | ± | 0.0484 |
| - computer_security | 1 | none | 0 | acc | ↑ | 0.8300 | ± | 0.0378 |
| - conceptual_physics | 1 | none | 0 | acc | ↑ | 0.8936 | ± | 0.0202 |
| - electrical_engineering | 1 | none | 0 | acc | ↑ | 0.8276 | ± | 0.0315 |
| - elementary_mathematics | 1 | none | 0 | acc | ↑ | 0.8042 | ± | 0.0204 |
| - high_school_biology | 1 | none | 0 | acc | ↑ | 0.9323 | ± | 0.0143 |
| - high_school_chemistry | 1 | none | 0 | acc | ↑ | 0.7783 | ± | 0.0292 |
| - high_school_computer_science | 1 | none | 0 | acc | ↑ | 0.8700 | ± | 0.0338 |
| - high_school_mathematics | 1 | none | 0 | acc | ↑ | 0.5407 | ± | 0.0304 |
| - high_school_physics | 1 | none | 0 | acc | ↑ | 0.7285 | ± | 0.0363 |
| - high_school_statistics | 1 | none | 0 | acc | ↑ | 0.7963 | ± | 0.0275 |
| - machine_learning | 1 | none | 0 | acc | ↑ | 0.6964 | ± | 0.0436 |
| Groups | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| mmlu | 2 | none | | acc | ↑ | 0.7841 | ± | 0.0033 |
| - humanities | 2 | none | | acc | ↑ | 0.7027 | ± | 0.0063 |
| - other | 2 | none | | acc | ↑ | 0.8259 | ± | 0.0065 |
| - social sciences | 2 | none | | acc | ↑ | 0.8658 | ± | 0.0060 |
| - stem | 2 | none | | acc | ↑ | 0.7846 | ± | 0.0070 |
As you can see, it has low refusals and low KL divergence, with MMLU results very close to the original.
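A quick sanity check on that claim: the drop in overall MMLU accuracy (0.7861 to 0.7841) is small compared to the reported standard errors, so the two scores are statistically indistinguishable (a back-of-the-envelope check, treating the two runs as independent):

```python
import math

# Overall mmlu rows from the tables above
orig_acc, orig_se = 0.7861, 0.0033
heretic_acc, heretic_se = 0.7841, 0.0033

diff = orig_acc - heretic_acc
combined_se = math.sqrt(orig_se**2 + heretic_se**2)
print(abs(diff) < 2 * combined_se)  # -> True: well within two combined standard errors
```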
Patreon and Ko-Fi are blocked in Austria?
Oh okay. Actually, in real life I'm from Russia; I just like Austria.
Do you have any advice that could make the Heretic process faster? Also, thank you for the Heretic Qwen3.5-9B; I want to use it for finetuning. I will try doing it myself first, but I think I will use yours if mine takes even longer. And I want to know how many trials you usually use (n_start_trials and n_trials), if possible.
Do you think my current trial is good? Is it still within n_start_trials? Here it is:
Running trial 9 of 360...
- Parameters:
- start_layer_index = 7
- end_layer_index = 23
- preserve_good_behavior_weight = 0.1983
- steer_bad_behavior_weight = 0.0030
- overcorrect_relative_weight = 0.9453
- neighbor_count = 9
- Reloading model...
Loading weights: 100%|████████████████████████████████████████████████████████████| 473/473 [00:00<00:00, 14666.05it/s]
- Abliterating (Arbitrary-Rank Ablation)...
- Evaluating...
- Running PIQA benchmark...
100%|████████████████████████████████████████████████████████████████████████████| 1838/1838 [00:01<00:00, 1218.07it/s]
Running loglikelihood requests: 100%|██████████████████████████████████████████████| 3676/3676 [09:11<00:00, 6.67it/s]
fatal: not a git repository (or any of the parent directories): .git
- PIQA acc_norm: 0.6937
- Counting model refusals...
- Refusals: 4/100
- Running PIQA benchmark...
I didn't set n_start_trials in my config.toml.
Do you have any advice that could make the Heretic process faster?
Pretty sure you already know the answer to that: as you mentioned a few times, you have no GPU. That's why it's so slow; for speed you need to have everything in VRAM.
Also, thank you for the Heretic Qwen3.5-9B; I want to use it for finetuning. I will try doing it myself first, but I think I will use yours if mine takes even longer.
I didn't upload it. I can not upload it because I ran out of storage (see the screenshot in my previous message showing that my currently allotted storage is maxed out). That's why I need to subscribe to Hugging Face PRO: to get more storage so I can upload it. The problem is that I need donations to help me pay for the subscription.
Hello, sorry for bothering again. How are you making GGUF versions of your models? I'm trying to create a GGUF version of my Heretic Qwen3.5-0.8B and failing, mostly on the MMPROJ. Are you using the basic llama.cpp converter?
It's failing on mmproj most likely because Heretic does not copy the required files over. Copy and paste video_preprocessor_config.json and preprocessor_config.json into your Heretic output folder and it should work then.
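A small helper for that copy step, in case it's easier than doing it by hand (the file names come from the message above; the paths are placeholders for your own folders):

```python
import shutil
from pathlib import Path

def copy_preprocessor_configs(src_dir, dst_dir):
    """Copy the preprocessor configs from the original model folder into the
    Heretic output folder, skipping any file the source model doesn't have."""
    copied = []
    for name in ("preprocessor_config.json", "video_preprocessor_config.json"):
        src = Path(src_dir) / name
        if src.exists():
            shutil.copy2(src, Path(dst_dir) / name)
            copied.append(name)
    return copied

# e.g. copy_preprocessor_configs("D:/Qwen3.5-0.8B", "D:/Qwen3.5-0.8B-Heretic")
```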
It refuses to let me do it; here is the error:
INFO:hf-to-gguf:Loading model: Qwen3.5-0.8B-Heretic
ERROR:hf-to-gguf:Model Qwen3_5ForConditionalGeneration is not supported
If you know how to fix it, I would like to know.
P.S. Switching from the ik_llama.cpp converter to the llama.cpp one made it work; everything goes smoothly now.
Good. The error is telling you what the issue is, too: the Qwen3_5 architecture is not (yet?) supported in ik_llama.cpp.


