Commit 425baeb — Update README.md
Parent(s): ed6854b

README.md CHANGED

````diff
@@ -1,13 +1,15 @@
 ---
-language:
-
+language:
+- ja
+- de
+- ru
 tags:
 - kenlm
 - perplexity
 - n-gram
 - kneser-ney
 - bigscience
-license:
+license: mit
 datasets:
 - wikipedia
 ---
@@ -42,4 +44,4 @@ model.get_perplexity("I am very perplexed")
 model.get_perplexity("im hella trippin")
 # 46793.5 (high perplexity, since the sentence is colloquial and contains grammar mistakes)
 ```
-In the example above we see that, since Wikipedia is a collection of encyclopedic articles, a KenLM model trained on it will naturally give lower perplexity scores to sentences with formal language and no grammar mistakes than colloquial sentences with grammar mistakes.
+In the example above we see that, since Wikipedia is a collection of encyclopedic articles, a KenLM model trained on it will naturally give lower perplexity scores to sentences with formal language and no grammar mistakes than colloquial sentences with grammar mistakes.
````
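The `get_perplexity` call shown in the diff presumably wraps KenLM's sentence scoring. As a rough sketch of the underlying arithmetic (the function name here is hypothetical, not this repo's actual implementation): KenLM's `Model.score` returns a total log10 probability for a sentence, and per-token perplexity follows directly from it.

```python
import math


def perplexity_from_log10_score(total_log10_prob: float, num_tokens: int) -> float:
    """Convert a sentence's total log10 probability (the kind of value
    KenLM's Model.score returns) into a per-token perplexity:
    PP = 10 ** (-log10(P) / N)."""
    return 10.0 ** (-total_log10_prob / num_tokens)


# e.g. a 5-token sentence with total log10 probability -10
# has perplexity 10 ** (10 / 5) = 100
print(perplexity_from_log10_score(-10.0, 5))
```

Under this formula, the more probable the model finds a sentence (total log10 probability closer to 0), the lower its perplexity, which is why the formal Wikipedia-style sentence scores lower than the colloquial one.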