---
base_model:
- Qwen/Qwen3-8B
library_name: peft
pipeline_tag: text-generation
language:
- id
tags:
- base_model:Qwen/Qwen3-8B
- lora
- sft
- transformers
- trl
- lm-eval
- bakat
- indonesian
license: apache-2.0
datasets:
- internal-curated
---
# Bakat-8B-Base
## Model Details
### Model Description
**Bakat-8B-Base** is an Indonesian-language base model designed for **Continued Pre-Training (CPT)** in the domain of digital-space policy and oversight. It is built on the **Qwen3-8B** architecture and uses **LoRA (Low-Rank Adaptation)** with **4-bit quantization** for memory and compute efficiency.
* **Developed by**: Tim 1 AITF
* **Model type**: Causal Language Model (LoRA Adapter)
* **Base architecture**: Qwen3-8B
* **Primary language**: Indonesian (id)
* **License**: Apache-2.0
---
## Training Data Composition
| Category         | Elements                                                                                                          | Tokens (M) | Share    |
| ---------------- | ----------------------------------------------------------------------------------------------------------------- | ---------- | -------- |
| **DTP**          | PON TIK occupations, job trends, competencies & human resources, DTP policy & regulation, digital talent technology | 94         | 43.9%    |
| **PRD**          | Online gambling, hoaxes, child protection, educational content, PRD policy & regulation, community violence        | 92         | 42.9%    |
| **Wikipedia ID** | General knowledge in Indonesian                                                                                   | 28.2       | 13.2%    |
| **Total**        | –                                                                                                                 | **214.2**  | **100%** |
---
## Intended Use
### Direct Use (Recommended)
This model is **intended for Continued Pre-Training**, in particular for:
* Domain adaptation to public policy and digital regulation
* Enriching Indonesia-specific knowledge
* Pre-adaptation before Instruction Tuning or SFT
### Out-of-Scope Use
* **Long-context conversations** (not yet optimized)
* **High-stakes decision making** (legal, medical, financial)
* **Chat-oriented instruction following** without further fine-tuning
---
## Bias, Risks, and Limitations
* The dataset is dominated by the digital-space policy and oversight domain, so topical bias may appear in unrelated domains.
* The model has not undergone preference alignment (RLHF/DPO).
* Wikipedia content is used as a counterweight, but it does not guarantee full neutrality.

Users are advised to run additional evaluation before production use.
---
## Recommendations
* Use the **Qwen3 chat template** for the best generation results.
* Apply **Instruction Fine-Tuning** or **Preference Tuning** before deploying to end users.
* Verify model outputs for critical information.
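
As a rough illustration of the first recommendation, the sketch below formats a single-turn prompt with the tokenizer's built-in chat template. `apply_chat_template` is the standard Transformers tokenizer method; the helper names `build_prompt` and `load_tokenizer` and the sample question are illustrative assumptions, and it is worth checking that this checkpoint's tokenizer actually ships the Qwen3 template before relying on it.

```python
def build_prompt(tokenizer, user_text):
    """Render a single-turn conversation with the tokenizer's chat template.

    `tokenizer` is any Transformers tokenizer that ships a chat template
    (assumed here to be the Qwen3 template inherited from the base model).
    """
    messages = [{"role": "user", "content": user_text}]
    # tokenize=False returns the formatted prompt string instead of token IDs;
    # add_generation_prompt=True appends the assistant-turn header.
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )


def load_tokenizer(model_id):
    """Load the tokenizer lazily so this module imports without Transformers."""
    from transformers import AutoTokenizer

    return AutoTokenizer.from_pretrained(model_id)


# Usage (downloads the tokenizer; the Hub ID is the placeholder used in this card):
# tokenizer = load_tokenizer("aitfindonesia/Bakat-8B-Base")
# print(build_prompt(tokenizer, "Apa itu literasi digital?"))
```

Since the card lists chat-oriented instruction following as out of scope without further fine-tuning, the template mainly matters once an instruction-tuning stage has been added.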
---
## How to Get Started
Load the model using **Hugging Face Transformers**:
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# 1. Configuration
model_id = "aitfindonesia/Bakat-8B-Base"  # Replace with your actual Hub ID

# 2. Load model and tokenizer
# Use bfloat16 for A100/A10G, float16 for T4
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# 3. Inference example (plain text completion)
input_text = "Strategi utama untuk mengurangi gap talenta digital di Indonesia adalah"
# Move inputs to wherever device_map placed the model instead of hard-coding "cuda"
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=100,
        do_sample=True,
        temperature=0.7,
    )
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
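
Because the card's metadata declares `library_name: peft` and describes the model as a LoRA adapter, an alternative is to attach the adapter explicitly to the base model. The following is a minimal sketch under that assumption, reusing the 4-bit NF4 setting listed under Hyperparameters. It needs `peft`, `bitsandbytes`, and a CUDA GPU; the adapter ID is the same placeholder as above, and `load_lora_model` is an illustrative helper name.

```python
BASE_ID = "Qwen/Qwen3-8B"                   # base model named in the card metadata
ADAPTER_ID = "aitfindonesia/Bakat-8B-Base"  # placeholder; replace with the actual Hub ID


def load_lora_model(base_id=BASE_ID, adapter_id=ADAPTER_ID):
    """Load the 4-bit base model and attach the LoRA adapter (assumed repo layout)."""
    # Imports are local so the module can be imported without a GPU stack.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=quant_config,
        device_map="auto",
    )
    # Wrap the frozen base model with the adapter weights from the Hub.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model


# Usage (downloads the full base-model weights):
# tokenizer, model = load_lora_model()
```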
---
## Training Details
### Training Data
* **Total size**: ~214M tokens
* **Domains**: Digital Talent Policy (DTP), Pengawasan Ruang Digital (PRD; digital-space oversight), Wikipedia Indonesia
* **Split**: Train (90%) / Validation (10%)
### Training Procedure
The model was trained with **Continued Pre-Training (CPT)** using LoRA on Hugging Face Transformers.
#### Hyperparameters
* **Precision**: bf16 (mixed precision)
* **Quantization**: 4-bit (nf4)
* **LoRA Rank (r)**: 8
* **LoRA Alpha**: 16
* **Target modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
* **Batch size**: 4 / device
* **Gradient accumulation**: 16 (effective batch size = 32)
* **Learning rate**: 2e-4 (linear schedule)
* **Warmup ratio**: 0.03
* **Epochs**: 1
* **Optimizer**: adamw_8bit
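
To give a sense of scale for these settings, the sketch below computes the LoRA scaling factor (alpha / r) and counts the trainable adapter parameters. Each adapted linear layer with `in` input and `out` output features adds `r * (in + out)` weights. The layer dimensions are assumptions taken from the public Qwen3-8B configuration (hidden size 4096, intermediate size 12288, 36 layers, 32 query heads and 8 key/value heads of dimension 128); they are not stated in this card.

```python
# LoRA hyperparameters from the list above.
LORA_RANK = 8
LORA_ALPHA = 16

# Assumed Qwen3-8B dimensions (from the public base-model config, not this card).
HIDDEN = 4096
INTERMEDIATE = 12288
NUM_LAYERS = 36
HEAD_DIM = 128
NUM_Q_HEADS = 32
NUM_KV_HEADS = 8

# (input features, output features) of each targeted projection in one layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, NUM_Q_HEADS * HEAD_DIM),
    "k_proj": (HIDDEN, NUM_KV_HEADS * HEAD_DIM),
    "v_proj": (HIDDEN, NUM_KV_HEADS * HEAD_DIM),
    "o_proj": (NUM_Q_HEADS * HEAD_DIM, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params_per_layer(rank, shapes):
    """Sum r * (in + out) over the targeted projections (A is r x in, B is out x r)."""
    return sum(rank * (fan_in + fan_out) for fan_in, fan_out in shapes.values())


def lora_trainable_params(rank=LORA_RANK, shapes=TARGET_SHAPES, num_layers=NUM_LAYERS):
    """Total adapter parameters across all transformer layers."""
    return num_layers * lora_params_per_layer(rank, shapes)


scaling = LORA_ALPHA / LORA_RANK  # factor applied to the low-rank update
total = lora_trainable_params()
print(f"scaling = {scaling}, trainable LoRA parameters = {total:,}")
```

Under these assumptions the adapter has roughly 21.8M trainable parameters, well under 1% of the 8B base model.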
---
## Evaluation
### Results
* **Final Training Loss**: ~1.2685
* **Final Validation Loss**: ~1.264
* **Training Perplexity**: ~3.56
* **Validation Perplexity**: ~3.55
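
The perplexity figures are tied to the losses by perplexity = exp(loss), assuming the reported loss is the mean per-token cross-entropy in nats. A small sketch of that conversion (the function name is illustrative):

```python
import math


def perplexity(mean_cross_entropy):
    """Convert a mean per-token cross-entropy (in nats) to perplexity."""
    return math.exp(mean_cross_entropy)


train_loss = 1.2685  # final training loss reported above
print(f"training perplexity ~ {perplexity(train_loss):.2f}")  # prints 3.56
```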
### Benchmark (General)
* **MMLU**: ~74.20
* **IndoMMLU**: ~65.66
* **XCOPA-ID**: ~75.80
---
## Environmental Impact
Carbon emission estimates follow the methodology of Lacoste et al. (2019).
* **Hardware**: NVIDIA A100 80GB
* **Training time**: ~36 hours
* **Compute region**: Indonesia
* **Infrastructure**: University / Private Server
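
That methodology estimates emissions as the energy consumed multiplied by the carbon intensity of the local grid. A minimal sketch of the calculation follows; the power draw and carbon-intensity values in the usage line are purely hypothetical placeholders, not measurements from this training run, and `estimate_co2_kg` is an illustrative name.

```python
def estimate_co2_kg(hours, avg_power_kw, grid_kg_co2_per_kwh):
    """Estimate emissions as energy used (kWh) times grid carbon intensity."""
    if min(hours, avg_power_kw, grid_kg_co2_per_kwh) < 0:
        raise ValueError("inputs must be non-negative")
    energy_kwh = hours * avg_power_kw
    return energy_kwh * grid_kg_co2_per_kwh


# ~36 hours comes from this card; 0.4 kW and 0.5 kg CO2eq/kWh are
# hypothetical placeholders that must be replaced with real values.
print(estimate_co2_kg(hours=36, avg_power_kw=0.4, grid_kg_co2_per_kwh=0.5))
```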
---
## Framework Versions
* Transformers: 4.x
* PyTorch: 2.x
* Datasets: 2.x
* Tokenizers: 0.x