Sahabat-AI Indonesian News Analytics
A model fine-tuned for analyzing Indonesian-language news comment text, with three main tasks:
Tasks
- Sentiment Analysis - sentiment classification: positif (positive), negatif (negative), netral (neutral)
- Hate Speech Detection - classifies a comment as hate speech or not: hate, not_hate
- Stance Detection - stance classification: favor, against, neutral
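The three label sets above can be captured in a small lookup table. The sketch below is only an illustration, not the model's actual interface: the task keys (`sentiment`, `hate_speech`, `stance`) and the helper `normalize_label` are hypothetical names, and it assumes the model returns its label as free text that may vary in case or surrounding whitespace.

```python
from typing import Optional

# Label sets as listed in this model card. The dictionary keys are
# hypothetical task identifiers chosen for this sketch.
TASK_LABELS = {
    "sentiment": ("positif", "negatif", "netral"),
    "hate_speech": ("hate", "not_hate"),
    "stance": ("favor", "against", "neutral"),
}


def normalize_label(task: str, raw_output: str) -> Optional[str]:
    """Map raw generated text to one of the task's labels, or None."""
    cleaned = raw_output.strip().lower()
    # Exact comparison only: "hate" is a substring of "not_hate",
    # so substring matching would misclassify.
    for label in TASK_LABELS[task]:
        if cleaned == label:
            return label
    return None
```

Returning `None` for anything outside the label set lets the caller decide how to handle unexpected generations instead of silently coercing them to a class.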
Model Details
- Base model: GoToCompany/gemma2-9b-cpt-sahabatai-v1-instruct
- Fine-tuning method: LoRA (r=16, alpha=16, dropout=0.1)
- Quantization: 4-bit (NF4)
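As a rough sketch of what these settings mean, the values above can be written as keyword arguments in the shape accepted by `peft.LoraConfig` and `transformers.BitsAndBytesConfig`. The `task_type` entry and the two helper functions are assumptions added for illustration. The card does not show the actual training script, and the matrix size in the usage note is arbitrary rather than taken from the base model.

```python
# Keyword arguments mirroring peft.LoraConfig and
# transformers.BitsAndBytesConfig. Only the values marked
# "from the card" are stated in this document.
LORA_KWARGS = {
    "r": 16,                   # from the card
    "lora_alpha": 16,          # from the card
    "lora_dropout": 0.1,       # from the card
    "task_type": "CAUSAL_LM",  # assumption: generative fine-tuning
}

QUANT_KWARGS = {
    "load_in_4bit": True,          # from the card (4-bit)
    "bnb_4bit_quant_type": "nf4",  # from the card (NF4)
}


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Standard LoRA scaling applied to the low-rank update (alpha / r)."""
    return lora_alpha / r


def lora_added_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # A is r x d_in and B is d_out x r.
    return r * (d_in + d_out)
```

With r = 16 and alpha = 16 the scaling factor is 1.0, so the low-rank update is added to the frozen weights unscaled. For an illustrative 1024 x 1024 matrix, `lora_added_params(1024, 1024, 16)` gives 32768 trainable parameters, compared with about one million in the full matrix.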
Datasets Used
- NusaX - Sentiment analysis
- IndoLEM - Sentiment analysis
- Ialfina - Hate speech detection
- Okky Ibrohim - Hate speech detection
- Indonesian COVID-19 Vaccination Stance - Stance detection
Training Configuration
- Epochs: 5 per dataset
- Batch size: 2
- Gradient accumulation steps: 4
- Learning rate: 2e-4
- Optimizer: AdamW 8-bit
- Weight decay: 0.01
- LR scheduler: Linear
- Warmup steps: 5
- Seed: 3407
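Two consequences of this configuration can be worked out directly: the effective batch size per optimizer update, and the shape of a linear schedule with warmup. The sketch below assumes single-device training and a schedule that warms up linearly and then decays linearly to zero. `total_steps` is a hypothetical input, because the number of optimizer steps depends on dataset sizes that the card does not state.

```python
BATCH_SIZE = 2           # from the card
GRAD_ACCUM_STEPS = 4     # from the card
LEARNING_RATE = 2e-4     # from the card
WARMUP_STEPS = 5         # from the card

# Examples contributing to each optimizer update (assuming one device).
EFFECTIVE_BATCH_SIZE = BATCH_SIZE * GRAD_ACCUM_STEPS


def linear_lr(step: int, total_steps: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / max(1, total_steps - WARMUP_STEPS)
```

Under these assumptions each update averages gradients over 8 examples, and the learning rate reaches its 2e-4 peak at step 5 before decaying. With only 5 warmup steps, almost the entire run is spent on the decay phase.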
Intended Use
This model is designed for:
- Sentiment analysis of Indonesian-language news and public opinion
- Content moderation (hate speech detection)
- Stance analysis in public discussion
- Indonesian NLP research
Citation
@misc{sahabat-ai-news-analytics-2024,
author = {Ikhlamal},
title = {Sahabat-AI Indonesian News Analytics: Multi-task Text Classification for Indonesian},
year = {2024},
publisher = {HuggingFace},
howpublished = {\url{https://huggingface.co/ikhlamal/sahabat-ai-indonesian-news-analytics}}
}
Acknowledgments
- Base model: Sahabat-AI team (GoToCompany)
- Dataset contributors: the authors of NusaX, IndoLEM, Ialfina, Okky Ibrohim, and Indonesian COVID-19 Vaccination Stance
License
Apache 2.0
Contact
For questions or feedback, open an issue in the model repository or contact the model owner.