| base_model: | |
| - Qwen/Qwen3-8B | |
| library_name: peft | |
| pipeline_tag: text-generation | |
| language: | |
| - id | |
| tags: | |
| - base_model:Qwen/Qwen3-8B | |
| - lora | |
| - sft | |
| - transformers | |
| - trl | |
| - lm-eval | |
| - bakat | |
| - indonesian | |
| license: apache-2.0 | |
| datasets: | |
| - internal-curated | |
| # Bakat-8B-Base | |
| ## Model Details | |
| ### Model Description | |
| **Bakat-8B-Base** is an Indonesian-language base model designed for **Continued Pre-Training (CPT)** in the domain of digital-space policy and oversight. It is built on the **Qwen3-8B** architecture, using **LoRA (Low-Rank Adaptation)** and **4-bit quantization** for memory and compute efficiency. | |
| * **Developed by**: Tim 1 AITF | |
| * **Model type**: Causal Language Model (LoRA Adapter) | |
| * **Base architecture**: Qwen3-8B | |
| * **Primary language**: Indonesian (id) | |
| * **License**: Apache-2.0 | |
| --- | |
| ## Training Data Composition | |
| | Category | Components | Tokens (M) | Share | | |
| | ---------------- | ----------------------------------------------------------------------------------------------------- | ---------------- | ---------- | | |
| | **DTP** | PON TIK occupations, job trends, competencies and human resources, DTP policy and regulation, digital talent technology | 94 | 43.9% | | |
| | **PRD** | Online gambling, hoaxes, child protection, educational content, PRD policy and regulation, community violence | 92 | 42.9% | | |
| | **Wikipedia ID** | General knowledge in Indonesian | 28.2 | 13.2% | | |
| | **Total** | – | **214.2** | **100%** | | |
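The shares in the table can be re-derived from the token counts. The snippet below is only an arithmetic check using the figures from the table; the variable names are illustrative. The computed shares agree with the table to within rounding.

```python
# Token counts in millions, copied from the table above.
token_counts_m = {"DTP": 94.0, "PRD": 92.0, "Wikipedia ID": 28.2}

total_m = sum(token_counts_m.values())

# Share of each category as a percentage of the total.
shares = {name: 100.0 * count / total_m for name, count in token_counts_m.items()}

for name, share in shares.items():
    print(f"{name}: {share:.2f}%")
print(f"Total: {total_m:.1f}M tokens")
```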
| --- | |
| ## Intended Use | |
| ### Direct Use (Recommended) | |
| This model is **intended for Continued Pre-Training**, in particular for: | |
| * Domain adaptation for public policy and digital regulation | |
| * Enrichment with Indonesia-specific knowledge | |
| * Pre-adaptation before Instruction Tuning or SFT | |
| ### Out-of-Scope Use | |
| * **Long-context conversations** (not yet optimized) | |
| * **High-stakes decision making** (legal, medical, financial) | |
| * **Chat-oriented instruction following** without further fine-tuning | |
| --- | |
| ## Bias, Risks, and Limitations | |
| * The dataset is dominated by the digital-space policy and oversight domain, so topical bias may appear in unrelated domains. | |
| * The model has not undergone preference alignment (RLHF/DPO). | |
| * Wikipedia content is used as a counterbalance, but it does not guarantee full neutrality. | |
| Users are advised to run additional evaluation before production use. | |
| --- | |
| ## Recommendations | |
| * Use the **Qwen3 chat template** for the best generation results. | |
| * Apply **Instruction Fine-Tuning** or **Preference Tuning** before deploying to end users. | |
| * Verify model outputs for critical information. | |
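As a minimal sketch of the chat-template recommendation, the helper below renders one user turn with the tokenizer's built-in template. The function name `build_chat_prompt` is hypothetical; `apply_chat_template` is the standard Transformers tokenizer method, and the sketch assumes the tokenizer ships with the Qwen3 template.

```python
def build_chat_prompt(tokenizer, user_message):
    """Render a single user turn into a prompt string via the tokenizer's chat template."""
    messages = [{"role": "user", "content": user_message}]
    # tokenize=False returns the formatted string instead of token ids;
    # add_generation_prompt=True appends the marker that opens the assistant turn.
    return tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
    )
```

With a tokenizer loaded through `AutoTokenizer.from_pretrained(model_id)`, the returned string can be tokenized and passed to `model.generate` like any other prompt.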
| --- | |
| ## How to Get Started | |
| Load the model using **HuggingFace Transformers**: | |
| ```python | |
| import torch | |
| from transformers import AutoTokenizer, AutoModelForCausalLM | |
| # 1. Configuration | |
| model_id = "aitfindonesia/Bakat-8B-Base" # Replace with your actual Hub ID | |
| # 2. Load Model | |
| # Use bfloat16 for A100/A10G, float16 for T4 | |
| tokenizer = AutoTokenizer.from_pretrained(model_id) | |
| model = AutoModelForCausalLM.from_pretrained( | |
|     model_id, | |
|     torch_dtype=torch.bfloat16, | |
|     device_map="auto", | |
| ) | |
| # 3. Inference Example (Completion) | |
| input_text = "Strategi utama untuk mengurangi gap talenta digital di Indonesia adalah" | |
| # Move inputs to wherever device_map placed the model instead of hard-coding "cuda" | |
| inputs = tokenizer(input_text, return_tensors="pt").to(model.device) | |
| with torch.no_grad(): | |
|     outputs = model.generate( | |
|         **inputs, | |
|         max_new_tokens=100, | |
|         do_sample=True, | |
|         temperature=0.7, | |
|     ) | |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) | |
| ``` | |
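Because the card metadata lists `library_name: peft` and describes the model as a LoRA adapter, the adapter can also be attached to the base model explicitly. The following is a hedged sketch, not the authors' documented procedure: the helper name `load_adapter_model` is hypothetical, and it assumes the repository contains PEFT adapter weights trained on top of `Qwen/Qwen3-8B`.

```python
def load_adapter_model(adapter_id, base_id="Qwen/Qwen3-8B"):
    """Load the base model, then attach the LoRA adapter on top of it."""
    # Imports live inside the function so that defining it does not require
    # the heavy dependencies to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )
    # Wrap the base model with the adapter weights stored under adapter_id.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model
```

Calling `model.merge_and_unload()` on the result folds the LoRA weights into the base model, which can simplify deployment when the adapter no longer needs to be swapped out.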
| --- | |
| ## Training Details | |
| ### Training Data | |
| * **Total size**: ~214M tokens | |
| * **Domains**: Digital Talent Policy (DTP), Digital Space Oversight (PRD), Indonesian Wikipedia | |
| * **Split**: Train (90%) / Validation (10%) | |
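The card does not say how the 90/10 split was produced. One common approach, shown here purely as an assumption, is a seeded shuffle at the document level followed by a hold-out; the function name and the seed are illustrative.

```python
import random


def split_train_val(documents, val_fraction=0.1, seed=0):
    """Shuffle documents deterministically and hold out a validation share."""
    shuffled = list(documents)
    random.Random(seed).shuffle(shuffled)
    n_val = round(len(shuffled) * val_fraction)
    # The first n_val shuffled documents become validation, the rest training.
    return shuffled[n_val:], shuffled[:n_val]


train_docs, val_docs = split_train_val([f"doc-{i}" for i in range(100)])
print(len(train_docs), len(val_docs))
```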
| ### Training Procedure | |
| The model was trained with **Continued Pre-Training (CPT)** using LoRA on HuggingFace Transformers. | |
| #### Hyperparameters | |
| * **Precision**: bf16 (mixed precision) | |
| * **Quantization**: 4-bit (nf4) | |
| * **LoRA Rank (r)**: 8 | |
| * **LoRA Alpha**: 16 | |
| * **Target modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj | |
| * **Batch size**: 4 / device | |
| * **Gradient accumulation**: 16 (reported effective batch size = 32; note that 4 per device × 16 steps gives 64 per device) | |
| * **Learning rate**: 2e-4 (linear schedule) | |
| * **Warmup ratio**: 0.03 | |
| * **Epochs**: 1 | |
| * **Optimizer**: adamw_8bit | |
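The LoRA and quantization settings above map naturally onto the PEFT `LoraConfig` and Transformers `BitsAndBytesConfig` classes. The sketch below is one way to express them, not the authors' actual training script: `task_type` and the bf16 compute dtype are assumptions, and values the card does not state (such as LoRA dropout) are left at library defaults.

```python
# Keyword arguments mirroring the hyperparameters listed above.
LORA_KWARGS = {
    "r": 8,
    "lora_alpha": 16,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}
QUANT_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
}


def build_configs():
    """Instantiate the config objects (requires torch, peft and transformers)."""
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    lora_config = LoraConfig(**LORA_KWARGS)
    quant_config = BitsAndBytesConfig(
        bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: matches the bf16 precision
        **QUANT_KWARGS,
    )
    return lora_config, quant_config


# LoRA scales the low-rank update by lora_alpha / r.
lora_scaling = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]
print(lora_scaling)
```

With rank 8 and alpha 16, the low-rank update is scaled by a factor of 2 before being added to the frozen weights.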
| --- | |
| ## Evaluation | |
| ### Results | |
| * **Final Training Loss**: ~1.2685 | |
| * **Final Validation Loss**: ~1.264 | |
| * **Training Perplexity**: ~3.56 | |
| * **Validation Perplexity**: ~3.55 | |
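Perplexity is the exponential of the mean per-token cross-entropy loss. Assuming the reported losses are in nats (the usual convention for PyTorch cross-entropy), the short check below reproduces the perplexities above to within rounding.

```python
import math


def perplexity(cross_entropy_loss):
    """Perplexity is exp(mean per-token cross-entropy), with the loss in nats."""
    return math.exp(cross_entropy_loss)


train_ppl = perplexity(1.2685)  # final training loss from the list above
val_ppl = perplexity(1.264)     # final validation loss from the list above
print(f"train perplexity ~ {train_ppl:.2f}")
print(f"validation perplexity ~ {val_ppl:.2f}")
```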
| ### Benchmark (General) | |
| * **MMLU**: ~74.20 | |
| * **IndoMMLU**: ~65.66 | |
| * **XCOPA-ID**: ~75.80 | |
| --- | |
| ## Environmental Impact | |
| Carbon emissions are estimated following the methodology of Lacoste et al. (2019). | |
| * **Hardware**: NVIDIA A100 80GB | |
| * **Training time**: ~36 hours | |
| * **Compute region**: Indonesia | |
| * **Infrastructure**: University / Private Server | |
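The card gives no emission figure. As a rough sketch of this style of estimate, emissions can be approximated as energy consumed multiplied by the carbon intensity of the local grid. Only the 36-hour training time comes from the card; the power draw, the grid intensity, and the GPU count in the example call are placeholders, not measured values.

```python
def estimate_emissions_kg(gpu_power_kw, hours, carbon_intensity_kg_per_kwh, num_gpus=1):
    """Estimate CO2-equivalent emissions as energy (kWh) times grid carbon intensity."""
    energy_kwh = gpu_power_kw * hours * num_gpus
    return energy_kwh * carbon_intensity_kg_per_kwh


# Placeholder inputs for illustration only: 0.4 kW per GPU and 0.7 kg CO2eq/kWh
# are assumptions, not values reported in the card.
example_kg = estimate_emissions_kg(
    gpu_power_kw=0.4,
    hours=36,
    carbon_intensity_kg_per_kwh=0.7,
)
print(f"{example_kg:.2f} kg CO2eq (illustrative inputs)")
```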
| --- | |
| ## Framework Versions | |
| * Transformers: 4.x | |
| * PyTorch: 2.x | |
| * Datasets: 2.x | |
| * Tokenizers: 0.x |