---
language:
- id
license: apache-2.0
library_name: transformers
pipeline_tag: text-generation
tags:
- indonesian
- aksarallm
- archived
- research
---
# Kiel-59M-Matured

> ⚠️ **Status: early experiment.** This 85M-parameter decoder-only transformer was trained from scratch as part of the early AksaraLLM line. It uses the GPT-2 BPE tokenizer (50,257-token vocabulary), which is not well suited to Indonesian, and the training corpus was limited. By standard perplexity it is not a usable Indonesian language model today.
## Architecture
| Property | Value |
|---|---|
| Parameters | 85.0M |
| Layers | 8 |
| Heads | 8 |
| Hidden size | 512 |
| FFN size | 2048 |
| Vocabulary | 50257 (GPT-2 BPE) |
| Context length | 256 |
| Norm / positions / activation | RMSNorm / RoPE / SwiGLU |
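
For orientation, the table maps onto a Llama-style stack (RMSNorm, RoPE, SwiGLU). A minimal sketch with transformers' `LlamaConfig` reproduces the ~85M parameter count, assuming an untied LM head; whether the checkpoint actually ships this config class is not confirmed here.

```python
# Sketch only: mapping the table onto LlamaConfig, which matches the
# stated RMSNorm + RoPE + SwiGLU stack. The config class is an assumption.
from transformers import LlamaConfig, LlamaForCausalLM

cfg = LlamaConfig(
    vocab_size=50257,             # GPT-2 BPE
    hidden_size=512,
    intermediate_size=2048,       # FFN size (SwiGLU)
    num_hidden_layers=8,
    num_attention_heads=8,
    max_position_embeddings=256,  # context length
    tie_word_embeddings=False,    # an untied LM head is needed to reach ~85M
)
model = LlamaForCausalLM(cfg)
print(f"{sum(p.numel() for p in model.parameters()) / 1e6:.1f}M parameters")
# ~85.0M = 2 x (50257 x 512) for embeddings + head, plus 8 layers x ~4.2M each
```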
## Measured baseline (Devin audit, CPU eval)
- Perplexity (50 ID sentences, GPT-2 tokenizer): 23154 (very high — model not converged)
- English-stopword ratio in ID-prompted output: 0.0%
- Indonesian-stopword ratio in ID-prompted output: 0.0%
For comparison, the working Indonesian models in this org reach perplexity ≈ 8–15 on the same 50-sentence eval set.
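
For reference, a minimal sketch of this kind of per-token perplexity eval. The repo id and the placeholder sentences below are assumptions; the actual 50-sentence eval set is not published in this card.

```python
# Minimal sketch of a weighted per-token perplexity eval.
# Assumptions: the repo id, and `sentences` as a stand-in for the
# 50-sentence Indonesian eval set.
import math

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "AksaraLLM/Kiel-59M-Matured"  # assumed repo id
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo).eval()

sentences = ["Ibu kota Indonesia adalah Jakarta."]  # placeholder eval set

total_nll, total_tokens = 0.0, 0
with torch.no_grad():
    for s in sentences:
        enc = tok(s, return_tensors="pt")
        out = model(**enc, labels=enc["input_ids"])
        n = enc["input_ids"].size(1) - 1   # number of predicted tokens
        total_nll += out.loss.item() * n   # out.loss is mean NLL per token
        total_tokens += n

print(f"perplexity: {math.exp(total_nll / total_tokens):.1f}")
```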
Sample for the prompt "Indonesia adalah negara":

```text
Indonesia adalah negaraalum questionich4!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
```
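
A sketch to reproduce a sample like this with greedy decoding (same assumed repo id as above):

```python
# Sketch only; degenerate "!" runs like the sample above are expected.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "AksaraLLM/Kiel-59M-Matured"  # assumed repo id
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo).eval()

inputs = tok("Indonesia adalah negara", return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=60,
        do_sample=False,                # greedy decoding
        pad_token_id=tok.eos_token_id,  # GPT-2 BPE has no pad token
    )
print(tok.decode(out[0], skip_special_tokens=True))
```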
## Why the previous "Skor 10/11 Grade S" is misleading
That figure is from a custom 11-question in-house scorecard, not from a standard LM evaluation. Perplexity on plain Indonesian text reveals that this checkpoint cannot model the distribution.
## Limitations
- Wrong tokenizer for the language: GPT-2 BPE is optimised for English (illustrated in the sketch after this list).
- Severely under-trained given the model size and the limited corpus.
- No chat template in tokenizer config; treat as a base LM only.
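
To make the tokenizer mismatch concrete, a small comparison of GPT-2 BPE token fertility on Indonesian versus English text. The example sentences here are illustrative only, not taken from the eval set.

```python
# Illustration: GPT-2 BPE splits ordinary Indonesian words into many
# sub-word pieces, while comparable English stays near 1 token/word.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

for text in [
    "Pemerintah mengumumkan kebijakan baru.",  # Indonesian
    "The government announced a new policy.",  # English
]:
    n_tokens = len(tok(text)["input_ids"])
    n_words = len(text.split())
    print(f"{n_tokens / n_words:.2f} tokens/word  <- {text}")
# Indonesian typically lands well above 2 tokens/word under GPT-2 BPE,
# which wastes both context length and model capacity.
```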
## What to use instead
- `AksaraLLM/Kiel-Pro-0.5B-v3`: 494M Qwen2-based, PPL ≈ 15.
- `AksaraLLM/AksaraLLM-Qwen-1.5B-v5-public`: 1.78B Qwen2-based, PPL ≈ 8.4.
## License
Apache 2.0