|                           Tasks                            |Version|Filter|n-shot|    Metric     |   | Value  |   |Stderr|
|------------------------------------------------------------|------:|------|-----:|---------------|---|-------:|---|------|
|arc_easy                                                    |      1|none  |     0|acc            |↑  |  0.3439|±  |0.0097|
|                                                            |       |none  |     0|acc_norm       |↑  |  0.3346|±  |0.0097|
|blimp                                                       |      2|none  |      |acc            |↑  |  0.6349|±  |0.0016|
| - blimp_adjunct_island                                     |      1|none  |     0|acc            |↑  |  0.7990|±  |0.0127|
| - blimp_anaphor_gender_agreement                           |      1|none  |     0|acc            |↑  |  0.3290|±  |0.0149|
| - blimp_anaphor_number_agreement                           |      1|none  |     0|acc            |↑  |  0.6330|±  |0.0152|
| - blimp_animate_subject_passive                            |      1|none  |     0|acc            |↑  |  0.6550|±  |0.0150|
| - blimp_animate_subject_trans                              |      1|none  |     0|acc            |↑  |  0.8180|±  |0.0122|
| - blimp_causative                                          |      1|none  |     0|acc            |↑  |  0.4900|±  |0.0158|
| - blimp_complex_NP_island                                  |      1|none  |     0|acc            |↑  |  0.5300|±  |0.0158|
| - blimp_coordinate_structure_constraint_complex_left_branch|      1|none  |     0|acc            |↑  |  0.2990|±  |0.0145|
| - blimp_coordinate_structure_constraint_object_extraction  |      1|none  |     0|acc            |↑  |  0.7550|±  |0.0136|
| - blimp_determiner_noun_agreement_1                        |      1|none  |     0|acc            |↑  |  0.7910|±  |0.0129|
| - blimp_determiner_noun_agreement_2                        |      1|none  |     0|acc            |↑  |  0.8640|±  |0.0108|
| - blimp_determiner_noun_agreement_irregular_1              |      1|none  |     0|acc            |↑  |  0.7020|±  |0.0145|
| - blimp_determiner_noun_agreement_irregular_2              |      1|none  |     0|acc            |↑  |  0.8460|±  |0.0114|
| - blimp_determiner_noun_agreement_with_adj_2               |      1|none  |     0|acc            |↑  |  0.7370|±  |0.0139|
| - blimp_determiner_noun_agreement_with_adj_irregular_1     |      1|none  |     0|acc            |↑  |  0.5780|±  |0.0156|
| - blimp_determiner_noun_agreement_with_adj_irregular_2     |      1|none  |     0|acc            |↑  |  0.7300|±  |0.0140|
| - blimp_determiner_noun_agreement_with_adjective_1         |      1|none  |     0|acc            |↑  |  0.7060|±  |0.0144|
| - blimp_distractor_agreement_relational_noun               |      1|none  |     0|acc            |↑  |  0.2630|±  |0.0139|
| - blimp_distractor_agreement_relative_clause               |      1|none  |     0|acc            |↑  |  0.2060|±  |0.0128|
| - blimp_drop_argument                                      |      1|none  |     0|acc            |↑  |  0.7110|±  |0.0143|
| - blimp_ellipsis_n_bar_1                                   |      1|none  |     0|acc            |↑  |  0.5800|±  |0.0156|
| - blimp_ellipsis_n_bar_2                                   |      1|none  |     0|acc            |↑  |  0.7490|±  |0.0137|
| - blimp_existential_there_object_raising                   |      1|none  |     0|acc            |↑  |  0.7470|±  |0.0138|
| - blimp_existential_there_quantifiers_1                    |      1|none  |     0|acc            |↑  |  0.8450|±  |0.0115|
| - blimp_existential_there_quantifiers_2                    |      1|none  |     0|acc            |↑  |  0.2720|±  |0.0141|
| - blimp_existential_there_subject_raising                  |      1|none  |     0|acc            |↑  |  0.6560|±  |0.0150|
| - blimp_expletive_it_object_raising                        |      1|none  |     0|acc            |↑  |  0.6820|±  |0.0147|
| - blimp_inchoative                                         |      1|none  |     0|acc            |↑  |  0.4210|±  |0.0156|
| - blimp_intransitive                                       |      1|none  |     0|acc            |↑  |  0.5750|±  |0.0156|
| - blimp_irregular_past_participle_adjectives               |      1|none  |     0|acc            |↑  |  0.9240|±  |0.0084|
| - blimp_irregular_past_participle_verbs                    |      1|none  |     0|acc            |↑  |  0.6800|±  |0.0148|
| - blimp_irregular_plural_subject_verb_agreement_1          |      1|none  |     0|acc            |↑  |  0.7100|±  |0.0144|
| - blimp_irregular_plural_subject_verb_agreement_2          |      1|none  |     0|acc            |↑  |  0.8520|±  |0.0112|
| - blimp_left_branch_island_echo_question                   |      1|none  |     0|acc            |↑  |  0.8390|±  |0.0116|
| - blimp_left_branch_island_simple_question                 |      1|none  |     0|acc            |↑  |  0.3810|±  |0.0154|
| - blimp_matrix_question_npi_licensor_present               |      1|none  |     0|acc            |↑  |  0.0060|±  |0.0024|
| - blimp_npi_present_1                                      |      1|none  |     0|acc            |↑  |  0.5420|±  |0.0158|
| - blimp_npi_present_2                                      |      1|none  |     0|acc            |↑  |  0.5250|±  |0.0158|
| - blimp_only_npi_licensor_present                          |      1|none  |     0|acc            |↑  |  0.3710|±  |0.0153|
| - blimp_only_npi_scope                                     |      1|none  |     0|acc            |↑  |  0.4090|±  |0.0156|
| - blimp_passive_1                                          |      1|none  |     0|acc            |↑  |  0.7980|±  |0.0127|
| - blimp_passive_2                                          |      1|none  |     0|acc            |↑  |  0.7770|±  |0.0132|
| - blimp_principle_A_c_command                              |      1|none  |     0|acc            |↑  |  0.6410|±  |0.0152|
| - blimp_principle_A_case_1                                 |      1|none  |     0|acc            |↑  |  1.0000|±  |     0|
| - blimp_principle_A_case_2                                 |      1|none  |     0|acc            |↑  |  0.7200|±  |0.0142|
| - blimp_principle_A_domain_1                               |      1|none  |     0|acc            |↑  |  0.7350|±  |0.0140|
| - blimp_principle_A_domain_2                               |      1|none  |     0|acc            |↑  |  0.6190|±  |0.0154|
| - blimp_principle_A_domain_3                               |      1|none  |     0|acc            |↑  |  0.5460|±  |0.0158|
| - blimp_principle_A_reconstruction                         |      1|none  |     0|acc            |↑  |  0.4780|±  |0.0158|
| - blimp_regular_plural_subject_verb_agreement_1            |      1|none  |     0|acc            |↑  |  0.7920|±  |0.0128|
| - blimp_regular_plural_subject_verb_agreement_2            |      1|none  |     0|acc            |↑  |  0.7970|±  |0.0127|
| - blimp_sentential_negation_npi_licensor_present           |      1|none  |     0|acc            |↑  |  1.0000|±  |     0|
| - blimp_sentential_negation_npi_scope                      |      1|none  |     0|acc            |↑  |  0.6700|±  |0.0149|
| - blimp_sentential_subject_island                          |      1|none  |     0|acc            |↑  |  0.4350|±  |0.0157|
| - blimp_superlative_quantifiers_1                          |      1|none  |     0|acc            |↑  |  0.9270|±  |0.0082|
| - blimp_superlative_quantifiers_2                          |      1|none  |     0|acc            |↑  |  0.5460|±  |0.0158|
| - blimp_tough_vs_raising_1                                 |      1|none  |     0|acc            |↑  |  0.4410|±  |0.0157|
| - blimp_tough_vs_raising_2                                 |      1|none  |     0|acc            |↑  |  0.6850|±  |0.0147|
| - blimp_transitive                                         |      1|none  |     0|acc            |↑  |  0.7490|±  |0.0137|
| - blimp_wh_island                                          |      1|none  |     0|acc            |↑  |  0.4360|±  |0.0157|
| - blimp_wh_questions_object_gap                            |      1|none  |     0|acc            |↑  |  0.5880|±  |0.0156|
| - blimp_wh_questions_subject_gap                           |      1|none  |     0|acc            |↑  |  0.8800|±  |0.0103|
| - blimp_wh_questions_subject_gap_long_distance             |      1|none  |     0|acc            |↑  |  0.9460|±  |0.0072|
| - blimp_wh_vs_that_no_gap                                  |      1|none  |     0|acc            |↑  |  0.9810|±  |0.0043|
| - blimp_wh_vs_that_no_gap_long_distance                    |      1|none  |     0|acc            |↑  |  0.9880|±  |0.0034|
| - blimp_wh_vs_that_with_gap                                |      1|none  |     0|acc            |↑  |  0.1180|±  |0.0102|
| - blimp_wh_vs_that_with_gap_long_distance                  |      1|none  |     0|acc            |↑  |  0.0340|±  |0.0057|
|wikitext                                                    |      2|none  |     0|bits_per_byte  |↓  |  1.4123|±  |   N/A|
|                                                            |       |none  |     0|byte_perplexity|↓  |  2.6617|±  |   N/A|
|                                                            |       |none  |     0|word_perplexity|↓  |187.7215|±  |   N/A|
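
The two byte-level wikitext metrics above are related: byte perplexity is 2 raised to the bits-per-byte value. A minimal sketch of that consistency check follows, using the figures from the table; the function name `bits_to_byte_perplexity` is illustrative and not part of any evaluation tool.

```python
import math

# Values copied from the wikitext rows of the table above.
bits_per_byte = 1.4123
byte_perplexity = 2.6617


def bits_to_byte_perplexity(bpb: float) -> float:
    """Convert bits per byte to byte-level perplexity (2 ** bpb)."""
    return 2.0 ** bpb


recomputed = bits_to_byte_perplexity(bits_per_byte)
# The table rounds to four decimals, so compare with a small tolerance.
consistent = math.isclose(recomputed, byte_perplexity, abs_tol=1e-3)
print(f"2 ** {bits_per_byte} = {recomputed:.4f}; matches table: {consistent}")
```

Word perplexity is normalized by word count rather than byte count, so it cannot be recovered from these two numbers alone.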

|Groups|Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------|------:|------|------|------|---|-----:|---|-----:|
|blimp |      2|none  |      |acc   |↑  |0.6349|±  |0.0016|
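
To compare the `blimp` group score with its subtasks, the rows of the task table can be parsed and averaged. The sketch below assumes the group value is an unweighted mean of the subtask `acc` values, which this file does not state; the helper name `subtask_accuracies` and the three-row sample are illustrative only.

```python
from statistics import mean


def subtask_accuracies(table_text: str, prefix: str = "- blimp_") -> dict[str, float]:
    """Collect `acc` values from table rows whose task cell starts with `prefix`."""
    scores: dict[str, float] = {}
    for line in table_text.splitlines():
        cells = line.strip().strip("|").split("|")
        if len(cells) < 7:
            continue  # not a data row of the nine-column task table
        task = cells[0].strip()
        metric = cells[4].strip()
        if task.startswith(prefix) and metric == "acc":
            # Drop the leading "- " list marker to keep the bare task name.
            scores[task[2:]] = float(cells[6])
    return scores


# Three rows copied from the table above, as a small sample.
sample = """
| - blimp_adjunct_island          |      1|none  |     0|acc  |↑  |  0.7990|±  |0.0127|
| - blimp_anaphor_gender_agreement|      1|none  |     0|acc  |↑  |  0.3290|±  |0.0149|
| - blimp_anaphor_number_agreement|      1|none  |     0|acc  |↑  |  0.6330|±  |0.0152|
"""

scores = subtask_accuracies(sample)
sample_mean = mean(scores.values())
print(f"{len(scores)} subtasks, unweighted mean acc = {sample_mean:.4f}")
```

Passing the full task table instead of `sample` gives a figure to set against the reported group value of 0.6349; a mismatch would suggest the group is aggregated differently from the assumption made here.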