deep-ignorance-unfiltered_unlearned_wt_dist

This model was created by fine-tuning EleutherAI/deep-ignorance-unfiltered with the Weight Distortion unlearning algorithm, which is based on Siddiqui et al. (2025). Unlearning aims to remove specific knowledge from a pretrained language model while preserving its general capabilities.
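
As a rough illustration of the idea, and not the training code used for this model, the sketch below assumes that Weight Distortion perturbs every parameter with zero-mean Gaussian noise before fine-tuning. That reading is an assumption; this card does not spell out the procedure. The function name `distort_weights` and the toy parameter dictionary are hypothetical, plain Python lists stand in for tensors, and the defaults reuse the noise std (0.01) and seed (42) listed under Hyperparameters.

```python
import random


def distort_weights(weights, noise_std=0.01, seed=42):
    """Return a copy of `weights` with i.i.d. Gaussian noise added to every value.

    Assumption: this is only one plausible reading of "weight distortion";
    it is not taken from the model's actual training code.
    """
    rng = random.Random(seed)
    return {
        name: [w + rng.gauss(0.0, noise_std) for w in values]
        for name, values in weights.items()
    }


# Toy stand-in for a model's parameters (hypothetical names and values).
toy_weights = {"layer.weight": [0.5, -0.25, 0.125], "layer.bias": [0.0]}
distorted = distort_weights(toy_weights)
```

Under this assumption, the perturbed weights would serve as the starting point for the fine-tuning epochs rather than as the final model.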

Hyperparameters

| Parameter | Value |
| --- | --- |
| Base model | EleutherAI/deep-ignorance-unfiltered |
| Unlearning method | Weight Distortion |
| Learning rate | 3e-05 |
| Epochs | 3 |
| Batch size | 32 |
| Max sequence length | 2048 |
| Optimizer | adamw |
| Gradient clipping | 1.0 |
| Gradient accumulation steps | 1 |
| Seed | 42 |
| W&B / run name | wt_dist__ep3_lr3e-05_bs32_wn0.01_mle2048_mli1024 |
| Noise std | 0.01 |
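
The run name appears to encode several of the hyperparameters above. The sketch below splits it into a method name and numeric fields so the two can be cross-checked. The helper `parse_run_name` is hypothetical, and the reading of the abbreviations is an assumption: `ep`, `lr`, `bs`, and `wn` match the listed epochs, learning rate, batch size, and noise std, and `mle` matches the max sequence length, while `mli` (1024) has no counterpart in the table and is left uninterpreted.

```python
import re

RUN_NAME = "wt_dist__ep3_lr3e-05_bs32_wn0.01_mle2048_mli1024"


def parse_run_name(run_name):
    """Split '<method>__<key><number>_...' into a method name and a dict of floats."""
    method, _, suffix = run_name.partition("__")
    params = {}
    for token in suffix.split("_"):
        # Each token is a lowercase key immediately followed by a number.
        match = re.fullmatch(r"([a-z]+)([0-9].*)", token)
        if match is None:
            raise ValueError(f"unrecognised token: {token!r}")
        params[match.group(1)] = float(match.group(2))
    return method, params


method, params = parse_run_name(RUN_NAME)
```

Here `params["lr"]` and `params["wn"]` can be compared directly against the learning rate and noise std rows.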

Evaluation Results

| Benchmark | Metric | Value | Stderr |
| --- | --- | --- | --- |
| mmlu | acc | 0.3310 | 0.0039 |
| mmlu_abstract_algebra | acc | 0.3100 | 0.0465 |
| mmlu_anatomy | acc | 0.4148 | 0.0426 |
| mmlu_astronomy | acc | 0.3684 | 0.0393 |
| mmlu_business_ethics | acc | 0.3300 | 0.0473 |
| mmlu_clinical_knowledge | acc | 0.3283 | 0.0289 |
| mmlu_college_biology | acc | 0.3472 | 0.0398 |
| mmlu_college_chemistry | acc | 0.2400 | 0.0429 |
| mmlu_college_computer_science | acc | 0.2900 | 0.0456 |
| mmlu_college_mathematics | acc | 0.2400 | 0.0429 |
| mmlu_college_medicine | acc | 0.3064 | 0.0351 |
| mmlu_college_physics | acc | 0.2059 | 0.0402 |
| mmlu_computer_security | acc | 0.4900 | 0.0502 |
| mmlu_conceptual_physics | acc | 0.2511 | 0.0283 |
| mmlu_econometrics | acc | 0.2193 | 0.0389 |
| mmlu_electrical_engineering | acc | 0.3448 | 0.0396 |
| mmlu_elementary_mathematics | acc | 0.2646 | 0.0227 |
| mmlu_formal_logic | acc | 0.1746 | 0.0340 |
| mmlu_global_facts | acc | 0.3500 | 0.0479 |
| mmlu_high_school_biology | acc | 0.3419 | 0.0270 |
| mmlu_high_school_chemistry | acc | 0.2709 | 0.0313 |
| mmlu_high_school_computer_science | acc | 0.3900 | 0.0490 |
| mmlu_high_school_european_history | acc | 0.3455 | 0.0371 |
| mmlu_high_school_geography | acc | 0.3384 | 0.0337 |
| mmlu_high_school_government_and_politics | acc | 0.3886 | 0.0352 |
| mmlu_high_school_macroeconomics | acc | 0.2462 | 0.0218 |
| mmlu_high_school_mathematics | acc | 0.2556 | 0.0266 |
| mmlu_high_school_microeconomics | acc | 0.2437 | 0.0279 |
| mmlu_high_school_physics | acc | 0.2649 | 0.0360 |
| mmlu_high_school_psychology | acc | 0.3963 | 0.0210 |
| mmlu_high_school_statistics | acc | 0.2454 | 0.0293 |
| mmlu_high_school_us_history | acc | 0.3284 | 0.0330 |
| mmlu_high_school_world_history | acc | 0.3544 | 0.0311 |
| mmlu_human_aging | acc | 0.3857 | 0.0327 |
| mmlu_human_sexuality | acc | 0.4122 | 0.0432 |
| mmlu_humanities | acc | 0.3216 | 0.0067 |
| mmlu_international_law | acc | 0.5207 | 0.0456 |
| mmlu_jurisprudence | acc | 0.3796 | 0.0469 |
| mmlu_logical_fallacies | acc | 0.3620 | 0.0378 |
| mmlu_machine_learning | acc | 0.2679 | 0.0420 |
| mmlu_management | acc | 0.4272 | 0.0490 |
| mmlu_marketing | acc | 0.4274 | 0.0324 |
| mmlu_medical_genetics | acc | 0.3300 | 0.0473 |
| mmlu_miscellaneous | acc | 0.4419 | 0.0178 |
| mmlu_moral_disputes | acc | 0.3902 | 0.0263 |
| mmlu_moral_scenarios | acc | 0.2380 | 0.0142 |
| mmlu_nutrition | acc | 0.3497 | 0.0273 |
| mmlu_other | acc | 0.3614 | 0.0085 |
| mmlu_philosophy | acc | 0.4148 | 0.0280 |
| mmlu_prehistory | acc | 0.3642 | 0.0268 |
| mmlu_professional_accounting | acc | 0.2979 | 0.0273 |
| mmlu_professional_law | acc | 0.2940 | 0.0116 |
| mmlu_professional_medicine | acc | 0.2206 | 0.0252 |
| mmlu_professional_psychology | acc | 0.3497 | 0.0193 |
| mmlu_public_relations | acc | 0.3636 | 0.0461 |
| mmlu_security_studies | acc | 0.2735 | 0.0285 |
| mmlu_social_sciences | acc | 0.3481 | 0.0085 |
| mmlu_sociology | acc | 0.5274 | 0.0353 |
| mmlu_stem | acc | 0.2984 | 0.0081 |
| mmlu_us_foreign_policy | acc | 0.5300 | 0.0502 |
| mmlu_virology | acc | 0.3313 | 0.0366 |
| mmlu_world_religions | acc | 0.4327 | 0.0380 |
| wikitext | bits_per_byte | 0.7535 | N/A |
| wikitext | byte_perplexity | 1.6859 | N/A |
| wikitext | word_perplexity | 16.3301 | N/A |
| wmdp_bio_categorized_mcqa | acc | 0.3786 | 0.0134 |
| wmdp_bio_cloze_verified | acc_norm | 0.3086 | 0.0141 |
| wmdp_bio_robust | acc | 0.3249 | 0.0159 |
| wmdp_bio_robust_bioweapons_and_bioterrorism | acc | 0.2579 | 0.0318 |
| wmdp_bio_robust_dual_use_virology | acc | 0.2857 | 0.0869 |
| wmdp_bio_robust_enhanced_potential_pandemic_pathogens | acc | 0.3824 | 0.0484 |
| wmdp_bio_robust_expanding_access_to_threat_vectors | acc | 0.3333 | 0.1054 |
| wmdp_bio_robust_reverse_genetics_and_easy_editing | acc | 0.3871 | 0.0358 |
| wmdp_bio_robust_rewritten | acc | 0.2622 | 0.0089 |
| wmdp_bio_robust_rewritten_gibberish | acc | 0.2602 | 0.0154 |
| wmdp_bio_robust_rewritten_nonsensical_biology | acc | 0.2602 | 0.0154 |
| wmdp_bio_robust_rewritten_real_words_sciency | acc | 0.2663 | 0.0155 |
| wmdp_bio_robust_viral_vector_research | acc | 0.3138 | 0.0252 |
| wmdp_bio_shortcut | acc | 0.4938 | 0.0249 |
| wmdp_bio_shortcut_bioweapons_and_bioterrorism | acc | 0.4255 | 0.0729 |
| wmdp_bio_shortcut_dual_use_virology | acc | 0.4211 | 0.1164 |
| wmdp_bio_shortcut_enhanced_potential_pandemic_pathogens | acc | 0.4717 | 0.0692 |
| wmdp_bio_shortcut_expanding_access_to_threat_vectors | acc | 0.6667 | 0.1667 |
| wmdp_bio_shortcut_reverse_genetics_and_easy_editing | acc | 0.5176 | 0.0545 |
| wmdp_bio_shortcut_viral_vector_research | acc | 0.5052 | 0.0362 |
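
The accuracy and standard-error pairs above can be turned into rough uncertainty ranges. The sketch below is only an aid for reading the table and rests on several assumptions: a normal approximation (z = 1.96 for a 95% interval), a chance level of 0.25 that holds only if a benchmark is four-option multiple choice, and the usual relation that byte perplexity equals 2 raised to bits per byte. All function names are hypothetical.

```python
def confidence_interval(acc, stderr, z=1.96):
    """Normal-approximation interval around an accuracy (assumes z * stderr is adequate)."""
    return acc - z * stderr, acc + z * stderr


def z_vs_chance(acc, stderr, chance=0.25):
    """Number of standard errors between an accuracy and an assumed chance level."""
    return (acc - chance) / stderr


def byte_perplexity(bits_per_byte):
    """Convert bits per byte to byte perplexity, assuming perplexity = 2 ** bits."""
    return 2.0 ** bits_per_byte


# Values copied from the table above.
mmlu_low, mmlu_high = confidence_interval(0.3310, 0.0039)
rewritten_z = z_vs_chance(0.2622, 0.0089)
wikitext_ppl = byte_perplexity(0.7535)
```

With these inputs, the wmdp_bio_robust_rewritten accuracy sits about 1.37 standard errors above the assumed 0.25 chance level, and the converted wikitext value agrees with the reported byte_perplexity of 1.6859 to four decimal places.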

Model size: 7B params (Safetensors). Tensor type: BF16.