deep-ignorance-unfiltered_unlearned_wt_dist_reg

This model was created by fine-tuning EleutherAI/deep-ignorance-unfiltered with the Weight Distortion (Regularized) unlearning algorithm, which is based on Siddiqui et al. (2025). Unlearning aims to remove specific knowledge from a pretrained language model while preserving its general capabilities.
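
A minimal loading sketch, assuming the checkpoint is published under the repository ID girishgupta/deep-ignorance-unfiltered_unlearned_wt_dist_reg, that the repository includes tokenizer files, and that the `transformers` and `torch` packages are installed. The helper names `load_model` and `generate` are illustrative and not part of this model card.

```python
# Repository ID as shown on the model page; adjust if the checkpoint lives elsewhere.
MODEL_ID = "girishgupta/deep-ignorance-unfiltered_unlearned_wt_dist_reg"


def load_model(model_id=MODEL_ID):
    """Load tokenizer and model in BF16 (the tensor type listed for this checkpoint)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def generate(prompt, max_new_tokens=32):
    """Hypothetical helper: greedy continuation of a prompt."""
    tokenizer, model = load_model()
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the weights on first use):
# print(generate("The capital of France is"))
```

Loading a 7B-parameter model in BF16 needs roughly 14 GB of memory for the weights alone, so a GPU or a machine with ample RAM is advisable.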

Hyperparameters

Parameter Value
Base model EleutherAI/deep-ignorance-unfiltered
Unlearning method Weight Distortion (Regularized)
Learning rate 3e-05
Epochs 3
Batch size 32
Max sequence length 512
Optimizer adamw
Gradient clipping 1.0
Gradient accumulation steps 1
Seed 42
W&B / run name wt_dist_reg__ep3_lr3e-05_bs32_wr1.0_mle512_mli10000
Regularizer lambda 1.0
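
The run name appears to encode several of the hyperparameters above. The sketch below splits it into key/value tokens. The mapping of `ep`, `lr`, `bs`, and `mle` to epochs, learning rate, batch size, and max sequence length is an inference from the matching values. The meaning of `wr` and `mli` is not stated in this card, so they are left uninterpreted. `parse_run_name` is a hypothetical helper.

```python
import re

RUN_NAME = "wt_dist_reg__ep3_lr3e-05_bs32_wr1.0_mle512_mli10000"


def parse_run_name(run_name):
    """Split '<method>__<key><value>_<key><value>...' into (method, {key: number})."""
    method, _, suffix = run_name.partition("__")
    fields = {}
    for token in suffix.split("_"):
        # Each token is a lowercase key followed by a numeric value, e.g. 'lr3e-05'.
        match = re.fullmatch(r"([a-z]+)([0-9].*)", token)
        if match is None:
            raise ValueError(f"unrecognised token: {token!r}")
        key, raw = match.groups()
        fields[key] = int(raw) if raw.isdigit() else float(raw)
    return method, fields


method, fields = parse_run_name(RUN_NAME)
print(method)
print(fields)
```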

Evaluation Results

Benchmark Value
mmlu / acc 0.2597
mmlu / acc_stderr 0.0037
mmlu_abstract_algebra / acc 0.1900
mmlu_abstract_algebra / acc_stderr 0.0394
mmlu_anatomy / acc 0.2074
mmlu_anatomy / acc_stderr 0.0350
mmlu_astronomy / acc 0.3289
mmlu_astronomy / acc_stderr 0.0382
mmlu_business_ethics / acc 0.2200
mmlu_business_ethics / acc_stderr 0.0416
mmlu_clinical_knowledge / acc 0.2906
mmlu_clinical_knowledge / acc_stderr 0.0279
mmlu_college_biology / acc 0.2500
mmlu_college_biology / acc_stderr 0.0362
mmlu_college_chemistry / acc 0.3700
mmlu_college_chemistry / acc_stderr 0.0485
mmlu_college_computer_science / acc 0.3000
mmlu_college_computer_science / acc_stderr 0.0461
mmlu_college_mathematics / acc 0.2200
mmlu_college_mathematics / acc_stderr 0.0416
mmlu_college_medicine / acc 0.3295
mmlu_college_medicine / acc_stderr 0.0358
mmlu_college_physics / acc 0.2745
mmlu_college_physics / acc_stderr 0.0444
mmlu_computer_security / acc 0.2200
mmlu_computer_security / acc_stderr 0.0416
mmlu_conceptual_physics / acc 0.2000
mmlu_conceptual_physics / acc_stderr 0.0261
mmlu_econometrics / acc 0.2281
mmlu_econometrics / acc_stderr 0.0395
mmlu_electrical_engineering / acc 0.2897
mmlu_electrical_engineering / acc_stderr 0.0378
mmlu_elementary_mathematics / acc 0.2407
mmlu_elementary_mathematics / acc_stderr 0.0220
mmlu_formal_logic / acc 0.3889
mmlu_formal_logic / acc_stderr 0.0436
mmlu_global_facts / acc 0.2000
mmlu_global_facts / acc_stderr 0.0402
mmlu_high_school_biology / acc 0.2806
mmlu_high_school_biology / acc_stderr 0.0256
mmlu_high_school_chemistry / acc 0.2906
mmlu_high_school_chemistry / acc_stderr 0.0319
mmlu_high_school_computer_science / acc 0.2500
mmlu_high_school_computer_science / acc_stderr 0.0435
mmlu_high_school_european_history / acc 0.2485
mmlu_high_school_european_history / acc_stderr 0.0337
mmlu_high_school_geography / acc 0.2374
mmlu_high_school_geography / acc_stderr 0.0303
mmlu_high_school_government_and_politics / acc 0.3368
mmlu_high_school_government_and_politics / acc_stderr 0.0341
mmlu_high_school_macroeconomics / acc 0.2974
mmlu_high_school_macroeconomics / acc_stderr 0.0232
mmlu_high_school_mathematics / acc 0.2370
mmlu_high_school_mathematics / acc_stderr 0.0259
mmlu_high_school_microeconomics / acc 0.2437
mmlu_high_school_microeconomics / acc_stderr 0.0279
mmlu_high_school_physics / acc 0.2583
mmlu_high_school_physics / acc_stderr 0.0357
mmlu_high_school_psychology / acc 0.3431
mmlu_high_school_psychology / acc_stderr 0.0204
mmlu_high_school_statistics / acc 0.3380
mmlu_high_school_statistics / acc_stderr 0.0323
mmlu_high_school_us_history / acc 0.2990
mmlu_high_school_us_history / acc_stderr 0.0321
mmlu_high_school_world_history / acc 0.2068
mmlu_high_school_world_history / acc_stderr 0.0264
mmlu_human_aging / acc 0.1031
mmlu_human_aging / acc_stderr 0.0204
mmlu_human_sexuality / acc 0.2824
mmlu_human_sexuality / acc_stderr 0.0395
mmlu_humanities / acc 0.2508
mmlu_humanities / acc_stderr 0.0063
mmlu_international_law / acc 0.1488
mmlu_international_law / acc_stderr 0.0325
mmlu_jurisprudence / acc 0.2593
mmlu_jurisprudence / acc_stderr 0.0424
mmlu_logical_fallacies / acc 0.2209
mmlu_logical_fallacies / acc_stderr 0.0326
mmlu_machine_learning / acc 0.2232
mmlu_machine_learning / acc_stderr 0.0395
mmlu_management / acc 0.3495
mmlu_management / acc_stderr 0.0472
mmlu_marketing / acc 0.2521
mmlu_marketing / acc_stderr 0.0284
mmlu_medical_genetics / acc 0.2700
mmlu_medical_genetics / acc_stderr 0.0446
mmlu_miscellaneous / acc 0.2299
mmlu_miscellaneous / acc_stderr 0.0150
mmlu_moral_disputes / acc 0.2312
mmlu_moral_disputes / acc_stderr 0.0227
mmlu_moral_scenarios / acc 0.2737
mmlu_moral_scenarios / acc_stderr 0.0149
mmlu_nutrition / acc 0.2810
mmlu_nutrition / acc_stderr 0.0257
mmlu_other / acc 0.2559
mmlu_other / acc_stderr 0.0077
mmlu_philosophy / acc 0.2444
mmlu_philosophy / acc_stderr 0.0244
mmlu_prehistory / acc 0.2037
mmlu_prehistory / acc_stderr 0.0224
mmlu_professional_accounting / acc 0.2199
mmlu_professional_accounting / acc_stderr 0.0247
mmlu_professional_law / acc 0.2555
mmlu_professional_law / acc_stderr 0.0111
mmlu_professional_medicine / acc 0.4007
mmlu_professional_medicine / acc_stderr 0.0298
mmlu_professional_psychology / acc 0.2222
mmlu_professional_psychology / acc_stderr 0.0168
mmlu_public_relations / acc 0.2364
mmlu_public_relations / acc_stderr 0.0407
mmlu_security_studies / acc 0.3102
mmlu_security_studies / acc_stderr 0.0296
mmlu_social_sciences / acc 0.2756
mmlu_social_sciences / acc_stderr 0.0080
mmlu_sociology / acc 0.2338
mmlu_sociology / acc_stderr 0.0299
mmlu_stem / acc 0.2613
mmlu_stem / acc_stderr 0.0078
mmlu_us_foreign_policy / acc 0.2700
mmlu_us_foreign_policy / acc_stderr 0.0446
mmlu_virology / acc 0.2229
mmlu_virology / acc_stderr 0.0324
mmlu_world_religions / acc 0.2281
mmlu_world_religions / acc_stderr 0.0322
wikitext / bits_per_byte 1.7813
wikitext / bits_per_byte_stderr N/A
wikitext / byte_perplexity 3.4373
wikitext / byte_perplexity_stderr N/A
wikitext / word_perplexity 736.9357
wikitext / word_perplexity_stderr N/A
wmdp_bio_categorized_mcqa / acc 0.2270
wmdp_bio_categorized_mcqa / acc_stderr 0.0118
wmdp_bio_cloze_verified / acc_norm 0.2184
wmdp_bio_cloze_verified / acc_norm_stderr 0.0126
wmdp_bio_robust / acc 0.2350
wmdp_bio_robust / acc_stderr 0.0144
wmdp_bio_robust_bioweapons_and_bioterrorism / acc 0.2368
wmdp_bio_robust_bioweapons_and_bioterrorism / acc_stderr 0.0309
wmdp_bio_robust_dual_use_virology / acc 0.2500
wmdp_bio_robust_dual_use_virology / acc_stderr 0.0833
wmdp_bio_robust_enhanced_potential_pandemic_pathogens / acc 0.2353
wmdp_bio_robust_enhanced_potential_pandemic_pathogens / acc_stderr 0.0422
wmdp_bio_robust_expanding_access_to_threat_vectors / acc 0.1905
wmdp_bio_robust_expanding_access_to_threat_vectors / acc_stderr 0.0878
wmdp_bio_robust_reverse_genetics_and_easy_editing / acc 0.1882
wmdp_bio_robust_reverse_genetics_and_easy_editing / acc_stderr 0.0287
wmdp_bio_robust_rewritten / acc 0.2351
wmdp_bio_robust_rewritten / acc_stderr 0.0086
wmdp_bio_robust_rewritten_gibberish / acc 0.2256
wmdp_bio_robust_rewritten_gibberish / acc_stderr 0.0147
wmdp_bio_robust_rewritten_nonsensical_biology / acc 0.2232
wmdp_bio_robust_rewritten_nonsensical_biology / acc_stderr 0.0146
wmdp_bio_robust_rewritten_real_words_sciency / acc 0.2565
wmdp_bio_robust_rewritten_real_words_sciency / acc_stderr 0.0153
wmdp_bio_robust_viral_vector_research / acc 0.2610
wmdp_bio_robust_viral_vector_research / acc_stderr 0.0238
wmdp_bio_shortcut / acc 0.2099
wmdp_bio_shortcut / acc_stderr 0.0204
wmdp_bio_shortcut_bioweapons_and_bioterrorism / acc 0.2128
wmdp_bio_shortcut_bioweapons_and_bioterrorism / acc_stderr 0.0603
wmdp_bio_shortcut_dual_use_virology / acc 0.2105
wmdp_bio_shortcut_dual_use_virology / acc_stderr 0.0961
wmdp_bio_shortcut_enhanced_potential_pandemic_pathogens / acc 0.1887
wmdp_bio_shortcut_enhanced_potential_pandemic_pathogens / acc_stderr 0.0543
wmdp_bio_shortcut_expanding_access_to_threat_vectors / acc 0.1111
wmdp_bio_shortcut_expanding_access_to_threat_vectors / acc_stderr 0.1111
wmdp_bio_shortcut_reverse_genetics_and_easy_editing / acc 0.2471
wmdp_bio_shortcut_reverse_genetics_and_easy_editing / acc_stderr 0.0471
wmdp_bio_shortcut_viral_vector_research / acc 0.2031
wmdp_bio_shortcut_viral_vector_research / acc_stderr 0.0291
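
To help read the table, the sketch below compares a few reported accuracies with the 0.25 chance level of four-option multiple choice, expressed in standard errors, and checks that the two byte-level wikitext metrics agree (byte perplexity is 2 raised to bits per byte). MMLU questions have four options. The same is assumed here for the WMDP tasks. The values are copied from the table and `z_vs_chance` is an illustrative helper.

```python
CHANCE = 0.25  # four answer options per question (assumed for the WMDP tasks)

# (accuracy, standard error) copied from the results table
RESULTS = {
    "mmlu": (0.2597, 0.0037),
    "wmdp_bio_robust": (0.2350, 0.0144),
    "wmdp_bio_shortcut": (0.2099, 0.0204),
}


def z_vs_chance(acc, stderr, chance=CHANCE):
    """Distance of an accuracy from chance, measured in standard errors."""
    return (acc - chance) / stderr


for name, (acc, stderr) in RESULTS.items():
    print(f"{name}: z = {z_vs_chance(acc, stderr):+.2f}")

# Consistency check between the two byte-level wikitext metrics.
bits_per_byte = 1.7813
byte_perplexity = 2 ** bits_per_byte
print(f"2 ** bits_per_byte = {byte_perplexity:.4f} (table reports 3.4373)")
```

Under these assumptions, all three accuracies lie within about three standard errors of chance.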

Model size 7B params
Tensor type BF16
Weights format Safetensors