deep-ignorance-unfiltered_unlearned_ga

This model was created by fine-tuning EleutherAI/deep-ignorance-unfiltered with the Gradient Ascent (GA) unlearning algorithm, following Jang et al. (2023). Unlearning aims to remove specific knowledge from a pretrained language model while preserving its general capabilities.
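
To make the idea of gradient ascent concrete, here is a toy sketch in plain Python. It is not the training code used for this model: a single softmax over four logits stands in for the language model, the function names are hypothetical, and the step size of 0.5 is chosen for illustration only (the actual run used a learning rate of 3e-05). It shows that stepping up the gradient of the negative log-likelihood lowers the probability assigned to the token being forgotten.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def nll_and_grad(logits, target):
    """Negative log-likelihood of `target` and its gradient w.r.t. the logits."""
    probs = softmax(logits)
    loss = -math.log(probs[target])
    # d(NLL)/d(logit_i) = p_i - 1[i == target]
    grad = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    return loss, grad

def gradient_ascent_step(logits, target, lr):
    """Move the logits up the loss gradient, making `target` less likely."""
    _, grad = nll_and_grad(logits, target)
    return [x + lr * g for x, g in zip(logits, grad)]

# Toy "model": four logits, with token 0 initially the most likely.
logits = [2.0, 0.5, 0.1, -1.0]
forget_token = 0

p_before = softmax(logits)[forget_token]
for _ in range(4):  # illustrative number of steps
    logits = gradient_ascent_step(logits, forget_token, lr=0.5)
p_after = softmax(logits)[forget_token]

print(f"p(forget token) before: {p_before:.4f}, after: {p_after:.4f}")
```

Ordinary fine-tuning would subtract `lr * g` instead of adding it; flipping that sign is the whole difference between learning and this form of unlearning.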

Hyperparameters

| Parameter | Value |
|---|---|
| Base model | EleutherAI/deep-ignorance-unfiltered |
| Unlearning method | Gradient Ascent |
| Learning rate | 3e-05 |
| Epochs | 4 |
| Batch size | 32 |
| Max sequence length | 512 |
| Optimizer | adamw |
| Gradient clipping | 1.0 |
| Gradient accumulation steps | 1 |
| Seed | 42 |
| W&B / run name | ga__ep4_lr3e-05_bs32_rw1.0_mle512_mli10000 |
| Retain weight | 1.0 |
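
The run name appears to encode several of the hyperparameters listed above. The sketch below splits it into prefix/value pairs so they can be compared against the table. The reading of the prefixes (`ep` as epochs, `lr` as learning rate, `bs` as batch size, `rw` as retain weight, `mle` as the 512-token length limit) is an assumption based on the matching values, not a documented naming scheme; `mli` has no counterpart in the table and is left uninterpreted. The function name `parse_run_name` is hypothetical.

```python
import re

RUN_NAME = "ga__ep4_lr3e-05_bs32_rw1.0_mle512_mli10000"

def parse_run_name(name):
    """Split '<method>__<key><value>_<key><value>...' into its parts."""
    method, _, rest = name.partition("__")
    fields = {}
    for token in rest.split("_"):
        # Leading lowercase letters form the key, the remainder is the value.
        match = re.fullmatch(r"([a-z]+)(.+)", token)
        if match:
            fields[match.group(1)] = match.group(2)
    return method, fields

method, fields = parse_run_name(RUN_NAME)

# Compare against the hyperparameter table (prefix meanings are assumed).
checks = {
    "epochs": int(fields["ep"]) == 4,
    "learning rate": float(fields["lr"]) == 3e-05,
    "batch size": int(fields["bs"]) == 32,
    "retain weight": float(fields["rw"]) == 1.0,
    "max sequence length": int(fields["mle"]) == 512,
}
print(method, fields)
print(checks)
```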

Evaluation Results

| Benchmark | Metric | Value | Stderr |
|---|---|---|---|
| mmlu | acc | 0.3932 | 0.0040 |
| mmlu_abstract_algebra | acc | 0.2900 | 0.0456 |
| mmlu_anatomy | acc | 0.4000 | 0.0423 |
| mmlu_astronomy | acc | 0.4671 | 0.0406 |
| mmlu_business_ethics | acc | 0.3200 | 0.0469 |
| mmlu_clinical_knowledge | acc | 0.4151 | 0.0303 |
| mmlu_college_biology | acc | 0.4167 | 0.0412 |
| mmlu_college_chemistry | acc | 0.3300 | 0.0473 |
| mmlu_college_computer_science | acc | 0.3900 | 0.0490 |
| mmlu_college_mathematics | acc | 0.3000 | 0.0461 |
| mmlu_college_medicine | acc | 0.4104 | 0.0375 |
| mmlu_college_physics | acc | 0.2745 | 0.0444 |
| mmlu_computer_security | acc | 0.5200 | 0.0502 |
| mmlu_conceptual_physics | acc | 0.3277 | 0.0307 |
| mmlu_econometrics | acc | 0.2544 | 0.0410 |
| mmlu_electrical_engineering | acc | 0.4069 | 0.0409 |
| mmlu_elementary_mathematics | acc | 0.2831 | 0.0232 |
| mmlu_formal_logic | acc | 0.1905 | 0.0351 |
| mmlu_global_facts | acc | 0.3000 | 0.0461 |
| mmlu_high_school_biology | acc | 0.3968 | 0.0278 |
| mmlu_high_school_chemistry | acc | 0.3596 | 0.0338 |
| mmlu_high_school_computer_science | acc | 0.4200 | 0.0496 |
| mmlu_high_school_european_history | acc | 0.4788 | 0.0390 |
| mmlu_high_school_geography | acc | 0.4697 | 0.0356 |
| mmlu_high_school_government_and_politics | acc | 0.5440 | 0.0359 |
| mmlu_high_school_macroeconomics | acc | 0.3667 | 0.0244 |
| mmlu_high_school_mathematics | acc | 0.2556 | 0.0266 |
| mmlu_high_school_microeconomics | acc | 0.3908 | 0.0317 |
| mmlu_high_school_physics | acc | 0.2914 | 0.0371 |
| mmlu_high_school_psychology | acc | 0.5596 | 0.0213 |
| mmlu_high_school_statistics | acc | 0.3102 | 0.0315 |
| mmlu_high_school_us_history | acc | 0.4706 | 0.0350 |
| mmlu_high_school_world_history | acc | 0.4895 | 0.0325 |
| mmlu_human_aging | acc | 0.4260 | 0.0332 |
| mmlu_human_sexuality | acc | 0.5420 | 0.0437 |
| mmlu_humanities | acc | 0.3624 | 0.0069 |
| mmlu_international_law | acc | 0.4876 | 0.0456 |
| mmlu_jurisprudence | acc | 0.4630 | 0.0482 |
| mmlu_logical_fallacies | acc | 0.4172 | 0.0387 |
| mmlu_machine_learning | acc | 0.2589 | 0.0416 |
| mmlu_management | acc | 0.5534 | 0.0492 |
| mmlu_marketing | acc | 0.5769 | 0.0324 |
| mmlu_medical_genetics | acc | 0.3800 | 0.0488 |
| mmlu_miscellaneous | acc | 0.5492 | 0.0178 |
| mmlu_moral_disputes | acc | 0.3873 | 0.0262 |
| mmlu_moral_scenarios | acc | 0.2682 | 0.0148 |
| mmlu_nutrition | acc | 0.3954 | 0.0280 |
| mmlu_other | acc | 0.4258 | 0.0087 |
| mmlu_philosophy | acc | 0.4598 | 0.0283 |
| mmlu_prehistory | acc | 0.4259 | 0.0275 |
| mmlu_professional_accounting | acc | 0.3191 | 0.0278 |
| mmlu_professional_law | acc | 0.2953 | 0.0117 |
| mmlu_professional_medicine | acc | 0.3051 | 0.0280 |
| mmlu_professional_psychology | acc | 0.3856 | 0.0197 |
| mmlu_public_relations | acc | 0.4273 | 0.0474 |
| mmlu_security_studies | acc | 0.3918 | 0.0313 |
| mmlu_social_sciences | acc | 0.4573 | 0.0088 |
| mmlu_sociology | acc | 0.6169 | 0.0344 |
| mmlu_stem | acc | 0.3444 | 0.0084 |
| mmlu_us_foreign_policy | acc | 0.6500 | 0.0479 |
| mmlu_virology | acc | 0.1867 | 0.0303 |
| mmlu_world_religions | acc | 0.6140 | 0.0373 |
| wikitext | bits_per_byte | 0.7084 | N/A |
| wikitext | byte_perplexity | 1.6340 | N/A |
| wikitext | word_perplexity | 13.8135 | N/A |
| wmdp_bio_categorized_mcqa | acc | 0.2569 | 0.0122 |
| wmdp_bio_cloze_verified | acc_norm | 0.2500 | 0.0132 |
| wmdp_bio_robust | acc | 0.2569 | 0.0148 |
| wmdp_bio_robust_bioweapons_and_bioterrorism | acc | 0.2789 | 0.0326 |
| wmdp_bio_robust_dual_use_virology | acc | 0.3929 | 0.0940 |
| wmdp_bio_robust_enhanced_potential_pandemic_pathogens | acc | 0.2451 | 0.0428 |
| wmdp_bio_robust_expanding_access_to_threat_vectors | acc | 0.3333 | 0.1054 |
| wmdp_bio_robust_reverse_genetics_and_easy_editing | acc | 0.2258 | 0.0307 |
| wmdp_bio_robust_rewritten | acc | 0.2630 | 0.0089 |
| wmdp_bio_robust_rewritten_gibberish | acc | 0.2626 | 0.0155 |
| wmdp_bio_robust_rewritten_nonsensical_biology | acc | 0.2577 | 0.0154 |
| wmdp_bio_robust_rewritten_real_words_sciency | acc | 0.2688 | 0.0156 |
| wmdp_bio_robust_viral_vector_research | acc | 0.2493 | 0.0235 |
| wmdp_bio_shortcut | acc | 0.2568 | 0.0215 |
| wmdp_bio_shortcut_bioweapons_and_bioterrorism | acc | 0.4468 | 0.0733 |
| wmdp_bio_shortcut_dual_use_virology | acc | 0.3158 | 0.1096 |
| wmdp_bio_shortcut_enhanced_potential_pandemic_pathogens | acc | 0.1509 | 0.0496 |
| wmdp_bio_shortcut_expanding_access_to_threat_vectors | acc | 0.2222 | 0.1470 |
| wmdp_bio_shortcut_reverse_genetics_and_easy_editing | acc | 0.2353 | 0.0463 |
| wmdp_bio_shortcut_viral_vector_research | acc | 0.2448 | 0.0311 |
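
One way to read the accuracy figures above is to ask whether each score is distinguishable from random guessing. The sketch below builds an approximate 95% interval (accuracy ± 1.96 × stderr, a normal approximation) for three rows copied from the table and checks whether it contains 0.25. The 0.25 chance level assumes four answer options per question, and the helper names are hypothetical. The last lines check that the reported wikitext `bits_per_byte` is the base-2 logarithm of `byte_perplexity`.

```python
import math

CHANCE = 0.25  # assumption: four answer options per question

# (accuracy, standard error) copied from the results table
results = {
    "mmlu": (0.3932, 0.0040),
    "wmdp_bio_robust": (0.2569, 0.0148),
    "wmdp_bio_shortcut": (0.2568, 0.0215),
}

def interval_95(acc, stderr):
    """Approximate 95% interval using the normal approximation."""
    half_width = 1.96 * stderr
    return acc - half_width, acc + half_width

def includes_chance(acc, stderr, chance=CHANCE):
    """True if the approximate 95% interval contains the chance level."""
    low, high = interval_95(acc, stderr)
    return low <= chance <= high

for name, (acc, stderr) in results.items():
    low, high = interval_95(acc, stderr)
    verdict = "consistent with chance" if includes_chance(acc, stderr) else "differs from chance"
    print(f"{name}: {acc:.4f} [{low:.4f}, {high:.4f}] {verdict}")

# bits_per_byte is the base-2 logarithm of byte_perplexity.
bits_per_byte = math.log2(1.6340)
print(f"log2(byte_perplexity) = {bits_per_byte:.4f}")
```

Under these assumptions, the aggregate MMLU score sits well above chance, while the aggregate WMDP-bio robust and shortcut scores cannot be distinguished from guessing.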

Repository: girishgupta/deep-ignorance-unfiltered_unlearned_ga
Model size: 7B params
Tensor type: BF16 (Safetensors)