|                           Tasks                            |Version|Filter|n-shot|    Metric     |   |    Value    |   |Stderr|
|------------------------------------------------------------|------:|------|-----:|---------------|---|------------:|---|------|
|blimp                                                       |      2|none  |     0|acc            |↑  |       0.5177|±  |0.0017|
| - blimp_adjunct_island                                     |      1|none  |     0|acc            |↑  |       0.7430|±  |0.0138|
| - blimp_anaphor_gender_agreement                           |      1|none  |     0|acc            |↑  |       0.2600|±  |0.0139|
| - blimp_anaphor_number_agreement                           |      1|none  |     0|acc            |↑  |       0.4650|±  |0.0158|
| - blimp_animate_subject_passive                            |      1|none  |     0|acc            |↑  |       0.5740|±  |0.0156|
| - blimp_animate_subject_trans                              |      1|none  |     0|acc            |↑  |       0.6820|±  |0.0147|
| - blimp_causative                                          |      1|none  |     0|acc            |↑  |       0.4270|±  |0.0156|
| - blimp_complex_NP_island                                  |      1|none  |     0|acc            |↑  |       0.4380|±  |0.0157|
| - blimp_coordinate_structure_constraint_complex_left_branch|      1|none  |     0|acc            |↑  |       0.0860|±  |0.0089|
| - blimp_coordinate_structure_constraint_object_extraction  |      1|none  |     0|acc            |↑  |       0.5060|±  |0.0158|
| - blimp_determiner_noun_agreement_1                        |      1|none  |     0|acc            |↑  |       0.5960|±  |0.0155|
| - blimp_determiner_noun_agreement_2                        |      1|none  |     0|acc            |↑  |       0.5470|±  |0.0157|
| - blimp_determiner_noun_agreement_irregular_1              |      1|none  |     0|acc            |↑  |       0.5110|±  |0.0158|
| - blimp_determiner_noun_agreement_irregular_2              |      1|none  |     0|acc            |↑  |       0.5840|±  |0.0156|
| - blimp_determiner_noun_agreement_with_adj_2               |      1|none  |     0|acc            |↑  |       0.4880|±  |0.0158|
| - blimp_determiner_noun_agreement_with_adj_irregular_1     |      1|none  |     0|acc            |↑  |       0.4500|±  |0.0157|
| - blimp_determiner_noun_agreement_with_adj_irregular_2     |      1|none  |     0|acc            |↑  |       0.5310|±  |0.0158|
| - blimp_determiner_noun_agreement_with_adjective_1         |      1|none  |     0|acc            |↑  |       0.5190|±  |0.0158|
| - blimp_distractor_agreement_relational_noun               |      1|none  |     0|acc            |↑  |       0.3480|±  |0.0151|
| - blimp_distractor_agreement_relative_clause               |      1|none  |     0|acc            |↑  |       0.3440|±  |0.0150|
| - blimp_drop_argument                                      |      1|none  |     0|acc            |↑  |       0.7320|±  |0.0140|
| - blimp_ellipsis_n_bar_1                                   |      1|none  |     0|acc            |↑  |       0.2240|±  |0.0132|
| - blimp_ellipsis_n_bar_2                                   |      1|none  |     0|acc            |↑  |       0.2920|±  |0.0144|
| - blimp_existential_there_object_raising                   |      1|none  |     0|acc            |↑  |       0.7300|±  |0.0140|
| - blimp_existential_there_quantifiers_1                    |      1|none  |     0|acc            |↑  |       0.7110|±  |0.0143|
| - blimp_existential_there_quantifiers_2                    |      1|none  |     0|acc            |↑  |       0.0400|±  |0.0062|
| - blimp_existential_there_subject_raising                  |      1|none  |     0|acc            |↑  |       0.6460|±  |0.0151|
| - blimp_expletive_it_object_raising                        |      1|none  |     0|acc            |↑  |       0.6440|±  |0.0151|
| - blimp_inchoative                                         |      1|none  |     0|acc            |↑  |       0.3790|±  |0.0153|
| - blimp_intransitive                                       |      1|none  |     0|acc            |↑  |       0.5630|±  |0.0157|
| - blimp_irregular_past_participle_adjectives               |      1|none  |     0|acc            |↑  |       0.4000|±  |0.0155|
| - blimp_irregular_past_participle_verbs                    |      1|none  |     0|acc            |↑  |       0.5430|±  |0.0158|
| - blimp_irregular_plural_subject_verb_agreement_1          |      1|none  |     0|acc            |↑  |       0.4460|±  |0.0157|
| - blimp_irregular_plural_subject_verb_agreement_2          |      1|none  |     0|acc            |↑  |       0.5100|±  |0.0158|
| - blimp_left_branch_island_echo_question                   |      1|none  |     0|acc            |↑  |       0.8390|±  |0.0116|
| - blimp_left_branch_island_simple_question                 |      1|none  |     0|acc            |↑  |       0.1170|±  |0.0102|
| - blimp_matrix_question_npi_licensor_present               |      1|none  |     0|acc            |↑  |       0.0020|±  |0.0014|
| - blimp_npi_present_1                                      |      1|none  |     0|acc            |↑  |       0.5060|±  |0.0158|
| - blimp_npi_present_2                                      |      1|none  |     0|acc            |↑  |       0.5070|±  |0.0158|
| - blimp_only_npi_licensor_present                          |      1|none  |     0|acc            |↑  |       0.1620|±  |0.0117|
| - blimp_only_npi_scope                                     |      1|none  |     0|acc            |↑  |       0.0930|±  |0.0092|
| - blimp_passive_1                                          |      1|none  |     0|acc            |↑  |       0.5950|±  |0.0155|
| - blimp_passive_2                                          |      1|none  |     0|acc            |↑  |       0.6130|±  |0.0154|
| - blimp_principle_A_c_command                              |      1|none  |     0|acc            |↑  |       0.5840|±  |0.0156|
| - blimp_principle_A_case_1                                 |      1|none  |     0|acc            |↑  |       0.9990|±  |0.0010|
| - blimp_principle_A_case_2                                 |      1|none  |     0|acc            |↑  |       0.4280|±  |0.0157|
| - blimp_principle_A_domain_1                               |      1|none  |     0|acc            |↑  |       1.0000|±  |     0|
| - blimp_principle_A_domain_2                               |      1|none  |     0|acc            |↑  |       0.6010|±  |0.0155|
| - blimp_principle_A_domain_3                               |      1|none  |     0|acc            |↑  |       0.5150|±  |0.0158|
| - blimp_principle_A_reconstruction                         |      1|none  |     0|acc            |↑  |       0.1900|±  |0.0124|
| - blimp_regular_plural_subject_verb_agreement_1            |      1|none  |     0|acc            |↑  |       0.6880|±  |0.0147|
| - blimp_regular_plural_subject_verb_agreement_2            |      1|none  |     0|acc            |↑  |       0.5920|±  |0.0155|
| - blimp_sentential_negation_npi_licensor_present           |      1|none  |     0|acc            |↑  |       0.9990|±  |0.0010|
| - blimp_sentential_negation_npi_scope                      |      1|none  |     0|acc            |↑  |       0.5420|±  |0.0158|
| - blimp_sentential_subject_island                          |      1|none  |     0|acc            |↑  |       0.3570|±  |0.0152|
| - blimp_superlative_quantifiers_1                          |      1|none  |     0|acc            |↑  |       0.4970|±  |0.0158|
| - blimp_superlative_quantifiers_2                          |      1|none  |     0|acc            |↑  |       0.6980|±  |0.0145|
| - blimp_tough_vs_raising_1                                 |      1|none  |     0|acc            |↑  |       0.2810|±  |0.0142|
| - blimp_tough_vs_raising_2                                 |      1|none  |     0|acc            |↑  |       0.7660|±  |0.0134|
| - blimp_transitive                                         |      1|none  |     0|acc            |↑  |       0.6110|±  |0.0154|
| - blimp_wh_island                                          |      1|none  |     0|acc            |↑  |       0.2680|±  |0.0140|
| - blimp_wh_questions_object_gap                            |      1|none  |     0|acc            |↑  |       0.7850|±  |0.0130|
| - blimp_wh_questions_subject_gap                           |      1|none  |     0|acc            |↑  |       0.9600|±  |0.0062|
| - blimp_wh_questions_subject_gap_long_distance             |      1|none  |     0|acc            |↑  |       0.9490|±  |0.0070|
| - blimp_wh_vs_that_no_gap                                  |      1|none  |     0|acc            |↑  |       0.9830|±  |0.0041|
| - blimp_wh_vs_that_no_gap_long_distance                    |      1|none  |     0|acc            |↑  |       0.9770|±  |0.0047|
| - blimp_wh_vs_that_with_gap                                |      1|none  |     0|acc            |↑  |       0.0070|±  |0.0026|
| - blimp_wh_vs_that_with_gap_long_distance                  |      1|none  |     0|acc            |↑  |       0.0190|±  |0.0043|
|arc_easy                                                    |      1|none  |     0|acc            |↑  |       0.2639|±  |0.0090|
|                                                            |       |none  |     0|acc_norm       |↑  |       0.2731|±  |0.0091|
|wikitext                                                    |      2|none  |     0|bits_per_byte  |↓  |       4.6536|±  |   N/A|
|                                                            |       |none  |     0|byte_perplexity|↓  |      25.1691|±  |   N/A|
|                                                            |       |none  |     0|word_perplexity|↓  |30979484.4095|±  |   N/A|
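
The two byte-level wikitext metrics above appear to be the same quantity on different scales. The sketch below assumes that `bits_per_byte` is the average negative log2-likelihood per byte and that `byte_perplexity` is its exponentiated form; under that assumption, `byte_perplexity = 2 ** bits_per_byte`. The helper name `bits_to_byte_perplexity` is hypothetical and is not part of any evaluation library.

```python
# Values copied from the wikitext rows of the table above.
REPORTED_BITS_PER_BYTE = 4.6536
REPORTED_BYTE_PERPLEXITY = 25.1691


def bits_to_byte_perplexity(bits_per_byte: float) -> float:
    """Convert bits per byte to byte-level perplexity.

    Assumes bits_per_byte is the mean negative log2-likelihood per byte,
    so the perplexity is 2 raised to that value.
    """
    return 2.0 ** bits_per_byte


recovered = bits_to_byte_perplexity(REPORTED_BITS_PER_BYTE)
difference = abs(recovered - REPORTED_BYTE_PERPLEXITY)

# The table rounds to four decimals, so only approximate agreement is expected.
print(f"recovered byte perplexity: {recovered:.4f}")
print(f"absolute difference from table: {difference:.4f}")
```

If the assumption holds, the recovered value should match the tabulated byte perplexity up to rounding, which makes this a quick consistency check when copying results between reports.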

|Groups|Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------|------:|------|-----:|------|---|-----:|---|-----:|
|blimp |      2|none  |     0|acc   |↑  |0.5177|±  |0.0017|
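
The per-subtask `Stderr` values can be sanity-checked from the accuracies alone. This is a minimal sketch under two assumptions that the table itself does not state: each BLiMP subtask contains 1,000 sentence pairs, and the standard error is the sample standard deviation of the per-item 0/1 scores divided by the square root of the item count. The function name `binary_accuracy_stderr` is hypothetical.

```python
import math

N_PAIRS = 1000  # assumption: number of sentence pairs per BLiMP subtask


def binary_accuracy_stderr(acc: float, n: int) -> float:
    """Standard error of a mean of n binary (0/1) scores.

    Uses the sample variance acc * (1 - acc) * n / (n - 1), divided by n,
    which simplifies to acc * (1 - acc) / (n - 1).
    """
    return math.sqrt(acc * (1.0 - acc) / (n - 1))


# (accuracy, reported stderr) pairs copied from the table above.
reported = {
    "blimp_adjunct_island": (0.7430, 0.0138),
    "blimp_existential_there_quantifiers_2": (0.0400, 0.0062),
    "blimp_principle_A_case_1": (0.9990, 0.0010),
    "blimp_principle_A_domain_1": (1.0000, 0.0),
}

for task, (acc, stderr) in reported.items():
    estimate = binary_accuracy_stderr(acc, N_PAIRS)
    print(f"{task}: estimated {estimate:.4f}, reported {stderr:.4f}")
```

Under these assumptions, the estimates should agree with the reported column to four decimals. A mismatch on other rows would suggest a different item count or a different error estimator.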