deep-ignorance-unfiltered_unlearned_ga_simple

This model was created by fine-tuning EleutherAI/deep-ignorance-unfiltered with the Gradient Ascent (simple) unlearning algorithm, which is based on Jang et al. (2023). Gradient ascent trains the model to increase, rather than decrease, its language-modeling loss on the data to be forgotten. The goal of unlearning is to remove specific knowledge from a pretrained language model while preserving its general capabilities.

Hyperparameters

| Parameter | Value |
|---|---|
| Base model | EleutherAI/deep-ignorance-unfiltered |
| Unlearning method | Gradient Ascent (simple) |
| Learning rate | 1e-05 |
| Epochs | 2 |
| Batch size | 32 |
| Max sequence length | 512 |
| Optimizer | adamw |
| Gradient clipping | 1.0 |
| Gradient accumulation steps | 1 |
| Seed | 42 |
| W&B / run name | ga_simple__ep2_lr1e-05_bs32_mle512_mli10000 |
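
To illustrate how the learning rate and gradient clipping enter a gradient-ascent update, here is a minimal sketch on a toy model. It is not the training code used for this model. The four-token "model", the function names, and the plain gradient step (in place of AdamW) are assumptions made for illustration, and the learning rate is raised to 0.1 so that the effect is visible within a few steps.

```python
import math

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def nll_and_grad(logits, target):
    """Negative log-likelihood of `target` and its gradient w.r.t. the logits."""
    probs = softmax(logits)
    loss = -math.log(probs[target])
    grad = [p - (1.0 if i == target else 0.0) for i, p in enumerate(probs)]
    return loss, grad

def clip_by_norm(grad, max_norm):
    """Rescale `grad` so that its L2 norm does not exceed `max_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / (norm + 1e-12))
    return [g * scale for g in grad]

def gradient_ascent_step(logits, target, lr, max_norm=1.0):
    """One unlearning step: move along the gradient to raise the loss."""
    loss, grad = nll_and_grad(logits, target)
    grad = clip_by_norm(grad, max_norm)
    # Ordinary training would subtract lr * grad; ascent adds it instead.
    new_logits = [x + lr * g for x, g in zip(logits, grad)]
    return new_logits, loss

# Toy setup (hypothetical): token 0 stands in for the content to forget.
logits = [2.0, 0.5, 0.1, -1.0]
forget_token = 0
p_before = softmax(logits)[forget_token]
for _ in range(50):
    logits, _ = gradient_ascent_step(logits, forget_token, lr=0.1)
p_after = softmax(logits)[forget_token]
print(f"p(forget token): {p_before:.3f} -> {p_after:.3f}")
```

The only difference from a standard descent step is the sign of the update, which is why the probability assigned to the forget token falls as the steps accumulate.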

Evaluation Results

| Benchmark | Metric | Value | Stderr |
|---|---|---|---|
| mmlu | acc | 0.2690 | 0.0037 |
| mmlu_abstract_algebra | acc | 0.2100 | 0.0409 |
| mmlu_anatomy | acc | 0.2296 | 0.0363 |
| mmlu_astronomy | acc | 0.3355 | 0.0384 |
| mmlu_business_ethics | acc | 0.2100 | 0.0409 |
| mmlu_clinical_knowledge | acc | 0.2981 | 0.0282 |
| mmlu_college_biology | acc | 0.2639 | 0.0369 |
| mmlu_college_chemistry | acc | 0.4100 | 0.0494 |
| mmlu_college_computer_science | acc | 0.3300 | 0.0473 |
| mmlu_college_mathematics | acc | 0.3100 | 0.0465 |
| mmlu_college_medicine | acc | 0.3353 | 0.0360 |
| mmlu_college_physics | acc | 0.3725 | 0.0481 |
| mmlu_computer_security | acc | 0.1800 | 0.0386 |
| mmlu_conceptual_physics | acc | 0.2085 | 0.0266 |
| mmlu_econometrics | acc | 0.2368 | 0.0400 |
| mmlu_electrical_engineering | acc | 0.2414 | 0.0357 |
| mmlu_elementary_mathematics | acc | 0.2672 | 0.0228 |
| mmlu_formal_logic | acc | 0.3651 | 0.0431 |
| mmlu_global_facts | acc | 0.1800 | 0.0386 |
| mmlu_high_school_biology | acc | 0.3161 | 0.0265 |
| mmlu_high_school_chemistry | acc | 0.2808 | 0.0316 |
| mmlu_high_school_computer_science | acc | 0.1900 | 0.0394 |
| mmlu_high_school_european_history | acc | 0.2545 | 0.0340 |
| mmlu_high_school_geography | acc | 0.3535 | 0.0341 |
| mmlu_high_school_government_and_politics | acc | 0.3679 | 0.0348 |
| mmlu_high_school_macroeconomics | acc | 0.3641 | 0.0244 |
| mmlu_high_school_mathematics | acc | 0.2630 | 0.0268 |
| mmlu_high_school_microeconomics | acc | 0.3487 | 0.0310 |
| mmlu_high_school_physics | acc | 0.3311 | 0.0384 |
| mmlu_high_school_psychology | acc | 0.3486 | 0.0204 |
| mmlu_high_school_statistics | acc | 0.4722 | 0.0340 |
| mmlu_high_school_us_history | acc | 0.2549 | 0.0306 |
| mmlu_high_school_world_history | acc | 0.2068 | 0.0264 |
| mmlu_human_aging | acc | 0.1076 | 0.0208 |
| mmlu_human_sexuality | acc | 0.2824 | 0.0395 |
| mmlu_humanities | acc | 0.2421 | 0.0062 |
| mmlu_international_law | acc | 0.1405 | 0.0317 |
| mmlu_jurisprudence | acc | 0.2130 | 0.0396 |
| mmlu_logical_fallacies | acc | 0.2331 | 0.0332 |
| mmlu_machine_learning | acc | 0.1607 | 0.0349 |
| mmlu_management | acc | 0.3786 | 0.0480 |
| mmlu_marketing | acc | 0.1966 | 0.0260 |
| mmlu_medical_genetics | acc | 0.2400 | 0.0429 |
| mmlu_miscellaneous | acc | 0.2043 | 0.0144 |
| mmlu_moral_disputes | acc | 0.2139 | 0.0221 |
| mmlu_moral_scenarios | acc | 0.2726 | 0.0149 |
| mmlu_nutrition | acc | 0.2941 | 0.0261 |
| mmlu_other | acc | 0.2514 | 0.0076 |
| mmlu_philosophy | acc | 0.2412 | 0.0243 |
| mmlu_prehistory | acc | 0.2253 | 0.0232 |
| mmlu_professional_accounting | acc | 0.2411 | 0.0255 |
| mmlu_professional_law | acc | 0.2451 | 0.0110 |
| mmlu_professional_medicine | acc | 0.4485 | 0.0302 |
| mmlu_professional_psychology | acc | 0.2173 | 0.0167 |
| mmlu_public_relations | acc | 0.2273 | 0.0401 |
| mmlu_security_studies | acc | 0.4000 | 0.0314 |
| mmlu_social_sciences | acc | 0.3107 | 0.0083 |
| mmlu_sociology | acc | 0.2687 | 0.0313 |
| mmlu_stem | acc | 0.2861 | 0.0080 |
| mmlu_us_foreign_policy | acc | 0.2600 | 0.0441 |
| mmlu_virology | acc | 0.1928 | 0.0307 |
| mmlu_world_religions | acc | 0.1754 | 0.0292 |
| wikitext | bits_per_byte | 51.4821 | N/A |
| wikitext | byte_perplexity | 3145313785746122.5000 | N/A |
| wikitext | word_perplexity | 74643697960270225399041231728812797146111553767075028512846217119922632833265303552.0000 | N/A |
| wmdp_bio_categorized_mcqa | acc | 0.2404 | 0.0120 |
| wmdp_bio_cloze_verified | acc_norm | 0.2463 | 0.0131 |
| wmdp_bio_robust | acc | 0.2408 | 0.0145 |
| wmdp_bio_robust_bioweapons_and_bioterrorism | acc | 0.2474 | 0.0314 |
| wmdp_bio_robust_dual_use_virology | acc | 0.3214 | 0.0899 |
| wmdp_bio_robust_enhanced_potential_pandemic_pathogens | acc | 0.2549 | 0.0434 |
| wmdp_bio_robust_expanding_access_to_threat_vectors | acc | 0.2857 | 0.1010 |
| wmdp_bio_robust_reverse_genetics_and_easy_editing | acc | 0.2097 | 0.0299 |
| wmdp_bio_robust_rewritten | acc | 0.2429 | 0.0087 |
| wmdp_bio_robust_rewritten_gibberish | acc | 0.2429 | 0.0151 |
| wmdp_bio_robust_rewritten_nonsensical_biology | acc | 0.2429 | 0.0151 |
| wmdp_bio_robust_rewritten_real_words_sciency | acc | 0.2429 | 0.0151 |
| wmdp_bio_robust_viral_vector_research | acc | 0.2405 | 0.0232 |
| wmdp_bio_shortcut | acc | 0.2395 | 0.0213 |
| wmdp_bio_shortcut_bioweapons_and_bioterrorism | acc | 0.2553 | 0.0643 |
| wmdp_bio_shortcut_dual_use_virology | acc | 0.3684 | 0.1137 |
| wmdp_bio_shortcut_enhanced_potential_pandemic_pathogens | acc | 0.2264 | 0.0580 |
| wmdp_bio_shortcut_expanding_access_to_threat_vectors | acc | 0.2222 | 0.1470 |
| wmdp_bio_shortcut_reverse_genetics_and_easy_editing | acc | 0.2353 | 0.0463 |
| wmdp_bio_shortcut_viral_vector_research | acc | 0.2292 | 0.0304 |
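
Two quick checks can help when reading the numbers above. The sketch below first confirms that the reported wikitext byte perplexity equals 2 raised to the reported bits per byte, then expresses an accuracy as a distance from chance in units of its standard error. The chance level of 0.25 is an assumption (four answer options per question), and the helper name `z_vs_chance` is hypothetical.

```python
import math

# Values copied from the results table.
bits_per_byte = 51.4821
byte_perplexity = 3145313785746122.5

# Byte perplexity is 2 raised to the bits-per-byte figure.
recomputed = 2.0 ** bits_per_byte
rel_err = abs(recomputed - byte_perplexity) / byte_perplexity

def z_vs_chance(acc, stderr, chance=0.25):
    """Number of standard errors between `acc` and the assumed chance level."""
    return (acc - chance) / stderr

z_mmlu = z_vs_chance(0.2690, 0.0037)         # aggregate mmlu row
z_wmdp_robust = z_vs_chance(0.2408, 0.0145)  # wmdp_bio_robust row

print(f"relative error of 2**bits_per_byte: {rel_err:.1e}")
print(f"mmlu vs. chance: {z_mmlu:+.2f} standard errors")
print(f"wmdp_bio_robust vs. chance: {z_wmdp_robust:+.2f} standard errors")
```

A magnitude below about 2 means the accuracy is statistically hard to distinguish from guessing under the stated assumption.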

Model size: 7B params (Safetensors, tensor type BF16)