Harley-ml committed on
Commit 1a2f7a2 · verified · 1 Parent(s): c802d89

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -497,7 +497,7 @@ Tenete-8M was trained on an **RTX 2060 6GB** for one epoch with a batch size of
 
  ### Dataset
 
- The dataset encompasses **577M tokens**, and includes **3 sources**:
+ The dataset encompasses **577M tokens**, and includes **4 sources**:
 
  1. **Textbooks**: Web data is too noisy, so we decided to use Tiny-Textbooks, a synthetic dataset generated by
  2. **Medium Articles**: While web data, especially medium articles, is noisy, we still need human-written examples