|                           Tasks                            |Version|Filter|n-shot|    Metric     |   |  Value  |   |Stderr|
|------------------------------------------------------------|------:|------|-----:|---------------|---|--------:|---|------|
|arc_easy                                                    |      1|none  |     0|acc            |↑  |   0.2727|±  |0.0091|
|                                                            |       |none  |     0|acc_norm       |↑  |   0.2816|±  |0.0092|
|blimp                                                       |      2|none  |      |acc            |↑  |   0.5526|±  |0.0017|
| - blimp_adjunct_island                                     |      1|none  |     0|acc            |↑  |   0.7330|±  |0.0140|
| - blimp_anaphor_gender_agreement                           |      1|none  |     0|acc            |↑  |   0.3820|±  |0.0154|
| - blimp_anaphor_number_agreement                           |      1|none  |     0|acc            |↑  |   0.5030|±  |0.0158|
| - blimp_animate_subject_passive                            |      1|none  |     0|acc            |↑  |   0.5520|±  |0.0157|
| - blimp_animate_subject_trans                              |      1|none  |     0|acc            |↑  |   0.7250|±  |0.0141|
| - blimp_causative                                          |      1|none  |     0|acc            |↑  |   0.5010|±  |0.0158|
| - blimp_complex_NP_island                                  |      1|none  |     0|acc            |↑  |   0.5640|±  |0.0157|
| - blimp_coordinate_structure_constraint_complex_left_branch|      1|none  |     0|acc            |↑  |   0.0840|±  |0.0088|
| - blimp_coordinate_structure_constraint_object_extraction  |      1|none  |     0|acc            |↑  |   0.4930|±  |0.0158|
| - blimp_determiner_noun_agreement_1                        |      1|none  |     0|acc            |↑  |   0.7000|±  |0.0145|
| - blimp_determiner_noun_agreement_2                        |      1|none  |     0|acc            |↑  |   0.7070|±  |0.0144|
| - blimp_determiner_noun_agreement_irregular_1              |      1|none  |     0|acc            |↑  |   0.5500|±  |0.0157|
| - blimp_determiner_noun_agreement_irregular_2              |      1|none  |     0|acc            |↑  |   0.7110|±  |0.0143|
| - blimp_determiner_noun_agreement_with_adj_2               |      1|none  |     0|acc            |↑  |   0.6170|±  |0.0154|
| - blimp_determiner_noun_agreement_with_adj_irregular_1     |      1|none  |     0|acc            |↑  |   0.5010|±  |0.0158|
| - blimp_determiner_noun_agreement_with_adj_irregular_2     |      1|none  |     0|acc            |↑  |   0.6180|±  |0.0154|
| - blimp_determiner_noun_agreement_with_adjective_1         |      1|none  |     0|acc            |↑  |   0.6380|±  |0.0152|
| - blimp_distractor_agreement_relational_noun               |      1|none  |     0|acc            |↑  |   0.3050|±  |0.0146|
| - blimp_distractor_agreement_relative_clause               |      1|none  |     0|acc            |↑  |   0.2710|±  |0.0141|
| - blimp_drop_argument                                      |      1|none  |     0|acc            |↑  |   0.6970|±  |0.0145|
| - blimp_ellipsis_n_bar_1                                   |      1|none  |     0|acc            |↑  |   0.2640|±  |0.0139|
| - blimp_ellipsis_n_bar_2                                   |      1|none  |     0|acc            |↑  |   0.4140|±  |0.0156|
| - blimp_existential_there_object_raising                   |      1|none  |     0|acc            |↑  |   0.7440|±  |0.0138|
| - blimp_existential_there_quantifiers_1                    |      1|none  |     0|acc            |↑  |   0.9030|±  |0.0094|
| - blimp_existential_there_quantifiers_2                    |      1|none  |     0|acc            |↑  |   0.1200|±  |0.0103|
| - blimp_existential_there_subject_raising                  |      1|none  |     0|acc            |↑  |   0.6530|±  |0.0151|
| - blimp_expletive_it_object_raising                        |      1|none  |     0|acc            |↑  |   0.6850|±  |0.0147|
| - blimp_inchoative                                         |      1|none  |     0|acc            |↑  |   0.4090|±  |0.0156|
| - blimp_intransitive                                       |      1|none  |     0|acc            |↑  |   0.5600|±  |0.0157|
| - blimp_irregular_past_participle_adjectives               |      1|none  |     0|acc            |↑  |   0.7220|±  |0.0142|
| - blimp_irregular_past_participle_verbs                    |      1|none  |     0|acc            |↑  |   0.6330|±  |0.0152|
| - blimp_irregular_plural_subject_verb_agreement_1          |      1|none  |     0|acc            |↑  |   0.6140|±  |0.0154|
| - blimp_irregular_plural_subject_verb_agreement_2          |      1|none  |     0|acc            |↑  |   0.7250|±  |0.0141|
| - blimp_left_branch_island_echo_question                   |      1|none  |     0|acc            |↑  |   0.6450|±  |0.0151|
| - blimp_left_branch_island_simple_question                 |      1|none  |     0|acc            |↑  |   0.1690|±  |0.0119|
| - blimp_matrix_question_npi_licensor_present               |      1|none  |     0|acc            |↑  |   0.0020|±  |0.0014|
| - blimp_npi_present_1                                      |      1|none  |     0|acc            |↑  |   0.3860|±  |0.0154|
| - blimp_npi_present_2                                      |      1|none  |     0|acc            |↑  |   0.3810|±  |0.0154|
| - blimp_only_npi_licensor_present                          |      1|none  |     0|acc            |↑  |   0.6120|±  |0.0154|
| - blimp_only_npi_scope                                     |      1|none  |     0|acc            |↑  |   0.4280|±  |0.0157|
| - blimp_passive_1                                          |      1|none  |     0|acc            |↑  |   0.6450|±  |0.0151|
| - blimp_passive_2                                          |      1|none  |     0|acc            |↑  |   0.6410|±  |0.0152|
| - blimp_principle_A_c_command                              |      1|none  |     0|acc            |↑  |   0.6910|±  |0.0146|
| - blimp_principle_A_case_1                                 |      1|none  |     0|acc            |↑  |   1.0000|±  |     0|
| - blimp_principle_A_case_2                                 |      1|none  |     0|acc            |↑  |   0.5190|±  |0.0158|
| - blimp_principle_A_domain_1                               |      1|none  |     0|acc            |↑  |   0.9810|±  |0.0043|
| - blimp_principle_A_domain_2                               |      1|none  |     0|acc            |↑  |   0.5570|±  |0.0157|
| - blimp_principle_A_domain_3                               |      1|none  |     0|acc            |↑  |   0.4680|±  |0.0158|
| - blimp_principle_A_reconstruction                         |      1|none  |     0|acc            |↑  |   0.2410|±  |0.0135|
| - blimp_regular_plural_subject_verb_agreement_1            |      1|none  |     0|acc            |↑  |   0.7200|±  |0.0142|
| - blimp_regular_plural_subject_verb_agreement_2            |      1|none  |     0|acc            |↑  |   0.6030|±  |0.0155|
| - blimp_sentential_negation_npi_licensor_present           |      1|none  |     0|acc            |↑  |   1.0000|±  |     0|
| - blimp_sentential_negation_npi_scope                      |      1|none  |     0|acc            |↑  |   0.4990|±  |0.0158|
| - blimp_sentential_subject_island                          |      1|none  |     0|acc            |↑  |   0.3440|±  |0.0150|
| - blimp_superlative_quantifiers_1                          |      1|none  |     0|acc            |↑  |   0.5400|±  |0.0158|
| - blimp_superlative_quantifiers_2                          |      1|none  |     0|acc            |↑  |   0.1780|±  |0.0121|
| - blimp_tough_vs_raising_1                                 |      1|none  |     0|acc            |↑  |   0.4330|±  |0.0157|
| - blimp_tough_vs_raising_2                                 |      1|none  |     0|acc            |↑  |   0.5950|±  |0.0155|
| - blimp_transitive                                         |      1|none  |     0|acc            |↑  |   0.6260|±  |0.0153|
| - blimp_wh_island                                          |      1|none  |     0|acc            |↑  |   0.4180|±  |0.0156|
| - blimp_wh_questions_object_gap                            |      1|none  |     0|acc            |↑  |   0.5430|±  |0.0158|
| - blimp_wh_questions_subject_gap                           |      1|none  |     0|acc            |↑  |   0.9160|±  |0.0088|
| - blimp_wh_questions_subject_gap_long_distance             |      1|none  |     0|acc            |↑  |   0.9410|±  |0.0075|
| - blimp_wh_vs_that_no_gap                                  |      1|none  |     0|acc            |↑  |   0.9800|±  |0.0044|
| - blimp_wh_vs_that_no_gap_long_distance                    |      1|none  |     0|acc            |↑  |   0.9820|±  |0.0042|
| - blimp_wh_vs_that_with_gap                                |      1|none  |     0|acc            |↑  |   0.0280|±  |0.0052|
| - blimp_wh_vs_that_with_gap_long_distance                  |      1|none  |     0|acc            |↑  |   0.0150|±  |0.0038|
|wikitext                                                    |      2|none  |     0|bits_per_byte  |↓  |   2.1661|±  |   N/A|
|                                                            |       |none  |     0|byte_perplexity|↓  |   4.4881|±  |   N/A|
|                                                            |       |none  |     0|word_perplexity|↓  |3068.2023|±  |   N/A|

|Groups|Version|Filter|n-shot|Metric|   |Value |   |Stderr|
|------|------:|------|------|------|---|-----:|---|-----:|
|blimp |      2|none  |      |acc   |↑  |0.5526|±  |0.0017|
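
The reported numbers can be sanity-checked with a short script. The sketch below assumes that `byte_perplexity` equals 2 raised to `bits_per_byte`, and that each `acc` standard error is approximately the binomial standard error `sqrt(p * (1 - p) / n)`. The sample size of 1000 items per BLiMP subtask is an assumption, not something stated in the table, and the helper names are hypothetical.

```python
import math

# Values copied from the wikitext rows of the table above.
BITS_PER_BYTE = 2.1661
BYTE_PERPLEXITY = 4.4881


def byte_perplexity_from_bits(bits_per_byte: float) -> float:
    """Convert bits per byte to byte-level perplexity (assumed relation: 2 ** bpb)."""
    return 2.0 ** bits_per_byte


def binomial_stderr(acc: float, n: int) -> float:
    """Approximate standard error of an accuracy estimated from n items."""
    return math.sqrt(acc * (1.0 - acc) / n)


# Assumption: 1000 items per BLiMP subtask (not stated in the table).
ASSUMED_BLIMP_ITEMS = 1000

ppl = byte_perplexity_from_bits(BITS_PER_BYTE)
stderr = binomial_stderr(0.7330, ASSUMED_BLIMP_ITEMS)  # blimp_adjunct_island

print(f"byte perplexity from bits_per_byte: {ppl:.4f}")
print(f"approximate stderr for blimp_adjunct_island: {stderr:.4f}")
```

Under these assumptions, both values should match the corresponding table entries to about four decimal places. The aggregated `blimp` standard error is pooled across subtasks and is not expected to follow the simple per-subtask formula.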