• Updated • 487k • 68
• 2
Note: This dataset is mostly crawled from the Sinhala newspaper Lankadeepa.
Pamzyy/translated_dataset
• Updated • 153 • 2
Note: This dataset was translated by me.
ihalage/sinhala-finetune-qa-eli5
• Updated • 10k • 3
• 2
Note: This is not a very good dataset; it seems to have been translated.
CohereLabs/aya_collection_language_split
• Updated • 514M • 10.1k
• 116
NLPC-UOM/Sinhala-News-Category-classification
• Updated • 3.33k • 130
• 1
NLPC-UOM/Sinhala-News-Source-classification
• Updated • 24.1k • 20
Hamza-Ziyard/CNN-Daily-Mail-Sinhala
• Updated • 10k • 65
• 3
Note: News dataset; not very good.
9wimu9/sinhala_dataset_59m
• Updated • 59.5M • 131
• 2
Note: This is a raw-text, human-curated dataset.
9wimu9/sinhala_sentences_raw
• Updated • 1.12k • 14
• 1
Note: This dataset is mostly translated, but not bad.
9wimu9/sinhala_dataset_sanitized
• Updated • 1.11k • 11
9wimu9/ada_derana_sinhala
• Updated • 170k • 2
• 1
Suchinthana/Sinhala-QA-Translate
• Updated • 1.02k • 30
• 2
Suchinthana/databricks-dolly-15k-sinhala
• Updated • 15k • 29
• 2
Thimira/sinhala-llm-dataset-llama-prompt-format
• Updated • 262k • 7
• 1