muthuk1
Add corpus.jsonl dataset via git-lfs (478 Wikipedia science docs, 2.5M tokens)
44da0e3
This file is stored with Xet. It is too big to display, but you can still download it.

Xet Pointer Details

Xet hash: fa7fc3aa067352ab61a9f4fcd79c63826d7c73b0cfecbbac537917fbaefea4df
Size of remote file: 11.7 MB
SHA256: 244d71eeafcc7368aa40aa15dde631b378dc51a4e6b20447cd6f307d92577235
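
The SHA256 listed above can be used to check a downloaded copy of the file. The following is a minimal sketch, not part of the repository: the local path `corpus.jsonl` is an assumed download location, and the helper names `sha256_of_file` and `matches_expected` are hypothetical.

```python
import hashlib
from pathlib import Path

# SHA256 copied from the pointer details on this page.
EXPECTED_SHA256 = "244d71eeafcc7368aa40aa15dde631b378dc51a4e6b20447cd6f307d92577235"


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of a file, reading it in 1 MiB blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns empty bytes (EOF).
        for block in iter(lambda: handle.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Compare a file's digest with the expected hex string."""
    return sha256_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumption: the file was downloaded to the current directory.
    local_path = Path("corpus.jsonl")
    if local_path.exists():
        print("OK" if matches_expected(local_path) else "MISMATCH")
    else:
        print(f"{local_path} not found; download it first")
```

Reading in blocks keeps memory use flat regardless of file size, which matters little at 11.7 MB but costs nothing.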

Xet stores large files inside Git efficiently, splitting them into unique chunks to speed up uploads and downloads.