---
language:
- id
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- indonesian
- aksarallm
- archived
- research
---

# Kiel-Mini-59M-DPO

> ⚠️ **Status: early experiment.**
> This 85M-parameter decoder-only transformer was trained from scratch
> as part of the early AksaraLLM line. It uses the **GPT-2 BPE** tokenizer
> (50,257-token vocabulary), which is not optimal for Indonesian, and the
> training corpus was limited. By standard perplexity it is **not** a usable
> Indonesian language model today.

## Architecture

| Property | Value |
|----------|-------|
| Parameters | 85.0M |
| Layers | 8 |
| Heads | 8 |
| Hidden size | 512 |
| FFN size | 2048 |
| Vocabulary | 50257 (GPT-2 BPE) |
| Context length | 128 |
| RMSNorm + RoPE + SwiGLU | yes |

## Measured baseline (Devin audit, CPU eval)

- **Perplexity** (50 Indonesian sentences, GPT-2 tokenizer): 56525 (very high; the model has not converged)
- **English-stopword ratio in Indonesian-prompted output**: 0.6%
- **Indonesian-stopword ratio in Indonesian-prompted output**: 0.0%

For comparison, the working Indonesian models in this org reach perplexity ≈ 8–15 on the same 50-sentence eval set.

Sample output for the prompt "Indonesia adalah negara":

```
Indonesia adalah negara coal covetedutterstock Citizensindependencealky mac motive
```
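
## Usage and evaluation sketches

The snippets below are illustrative, not the audited pipeline. The RMSNorm + RoPE + SwiGLU row in the Architecture table matches the stock Llama-style block in `transformers`, so a hypothetical `LlamaConfig` filled with the table's values would look like this (an assumption: the actual checkpoint may use a custom architecture class with different field names):

```python
from transformers import LlamaConfig

# Sketch only: the Architecture table rendered as a stock Llama-style config.
# The real checkpoint may define a custom architecture class instead.
config = LlamaConfig(
    vocab_size=50257,              # GPT-2 BPE vocabulary
    hidden_size=512,
    intermediate_size=2048,        # SwiGLU FFN width
    num_hidden_layers=8,
    num_attention_heads=8,
    max_position_embeddings=128,   # context length
)
```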
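
Loading and sampling follow the standard `transformers` causal-LM path. The repo id below is a placeholder, since this card does not state the hub path; substitute the real one before running:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "ORG/Kiel-Mini-59M-DPO"  # placeholder: substitute the real hub path
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo)
model.eval()

# Same prompt as the sample above.
inputs = tokenizer("Indonesia adalah negara", return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=16, do_sample=True, top_p=0.9)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```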
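
The exact audit script is not published here. A minimal sketch of pooled per-token perplexity over a sentence list, consistent with the CPU eval described above (`sentences` stands in for the 50-sentence Indonesian set):

```python
import math
import torch

@torch.no_grad()
def corpus_perplexity(model, tokenizer, sentences, max_len=128):
    """Token-level perplexity pooled over all sentences (CPU eval)."""
    total_nll, total_tokens = 0.0, 0
    for text in sentences:
        ids = tokenizer(text, return_tensors="pt").input_ids[:, :max_len]
        if ids.size(1) < 2:
            continue  # need at least one predicted token
        # transformers shifts labels internally; `loss` is mean NLL per token.
        loss = model(input_ids=ids, labels=ids).loss
        n = ids.size(1) - 1
        total_nll += loss.item() * n
        total_tokens += n
    return math.exp(total_nll / total_tokens)

# sentences = [...]  # the 50 Indonesian eval sentences (not included in this card)
# print(corpus_perplexity(model, tokenizer, sentences))
```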
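
The stopword ratios can be approximated the same way; the word lists below are short illustrative samples, not the lists the audit used:

```python
# Illustrative stopword samples; the audit's actual lists are not published.
EN_STOPWORDS = {"the", "and", "of", "to", "in", "is", "that", "it", "for"}
ID_STOPWORDS = {"yang", "dan", "di", "ke", "dari", "untuk", "pada", "dengan"}

def stopword_ratio(text: str, stopwords: set) -> float:
    """Fraction of whitespace-separated tokens found in `stopwords`."""
    tokens = text.lower().split()
    return sum(t in stopwords for t in tokens) / max(len(tokens), 1)

sample = "Indonesia adalah negara coal covetedutterstock Citizens"
print(f"EN ratio: {stopword_ratio(sample, EN_STOPWORDS):.1%}")
print(f"ID ratio: {stopword_ratio(sample, ID_STOPWORDS):.1%}")
```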