GeoV / GeoV-9b

@@ -43,6 +43,27 @@ The released weights were trained on ~70 billion tokens.
 We plan to continue training up to 300 billion tokens and to update the weights every 20 billion tokens.
 This training run is monolingual and uses the c4en and English Wikipedia datasets.
+## Test results
+These results were produced with [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) on the checkpoint trained on 80 billion tokens.
+|     Task     |Version| Metric |Value |   |Stderr|
+|--------------|------:|--------|-----:|---|-----:|
+|anli_r1       |      0|acc     |0.3150|±  |0.0147|
+|anli_r2       |      0|acc     |0.3380|±  |0.0150|
+|anli_r3       |      0|acc     |0.3367|±  |0.0136|
+|hellaswag     |      0|acc     |0.4761|±  |0.0050|
+|              |       |acc_norm|0.6308|±  |0.0048|
+|lambada_openai|      0|ppl     |8.9700|±  |0.2606|
+|              |       |acc     |0.5628|±  |0.0069|
+|mathqa        |      0|acc     |0.2318|±  |0.0077|
+|              |       |acc_norm|0.2372|±  |0.0078|
+|piqa          |      0|acc     |0.7448|±  |0.0102|
+|              |       |acc_norm|0.7639|±  |0.0099|
+|winogrande    |      0|acc     |0.5935|±  |0.0138|
+|wsc           |      0|acc     |0.4038|±  |0.0483|
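
As a reading aid for the table above, the `Value ± Stderr` columns can be turned into approximate 95% intervals. The sketch below is not part of the evaluation harness. It assumes a normal approximation (value ± 1.96 × stderr), and the `results` dictionary and `confidence_interval` helper are hypothetical names, with a few rows copied by hand from the table.

```python
# Sketch: approximate 95% intervals from the "Value ± Stderr" columns.
# Assumption: normal approximation, value ± 1.96 * stderr. The harness
# itself only reports the standard error.

Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution

# (task, metric) -> (value, stderr), copied from the table above
results = {
    ("anli_r1", "acc"): (0.3150, 0.0147),
    ("hellaswag", "acc_norm"): (0.6308, 0.0048),
    ("piqa", "acc"): (0.7448, 0.0102),
    ("winogrande", "acc"): (0.5935, 0.0138),
    ("wsc", "acc"): (0.4038, 0.0483),
}


def confidence_interval(value, stderr, z=Z_95):
    """Return (low, high) for value ± z * stderr."""
    margin = z * stderr
    return value - margin, value + margin


for (task, metric), (value, stderr) in results.items():
    low, high = confidence_interval(value, stderr)
    print(f"{task:<12} {metric:<9} {value:.4f}  [{low:.4f}, {high:.4f}]")
```

Under this approximation the `wsc` row spans roughly 0.31 to 0.50, far wider than the `hellaswag` row (roughly 0.62 to 0.64), which reflects its larger standard error.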
 ## Installation
 ```shell