---
license: apache-2.0
language: id
library_name: transformers
tags: [aksarallm, indonesian, from-scratch, smoke-test]
---
# Ezekiel999/AksaraLLM-20B-Instruct (smoke-test checkpoint)
**This is NOT the production 20B model.** It is a randomly-initialized
`tiny` preset (2 layers, 64-dim, vocab 256) pushed from a Devin
scaffolding session to validate the `aksaraLLMModel.save_pretrained` →
Hugging Face Hub → `aksaraLLMModel.from_pretrained` round-trip.
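
For context, the round-trip being smoke-tested is roughly the sketch
below. The `aksarallm` package name, the `AksaraLLMConfig` class, and its
field names are hypothetical stand-ins; only the `aksaraLLMModel` class
name and the tiny shapes (2 layers, 64-dim, vocab 256) come from this
card.

```python
# Hypothetical sketch of the serialization round-trip this checkpoint
# validates. Package and config names below are assumed, not confirmed.
import torch
from aksarallm import aksaraLLMModel, AksaraLLMConfig  # assumed layout

# The "tiny" smoke-test preset described in this card.
config = AksaraLLMConfig(num_hidden_layers=2, hidden_size=64, vocab_size=256)
model = aksaraLLMModel(config)  # randomly initialized, no training

model.save_pretrained("aksarallm-tiny")  # local serialization
reloaded = aksaraLLMModel.from_pretrained("aksarallm-tiny")

# The smoke test passes if shapes and weights survive the round-trip.
for (name, p), (_, q) in zip(model.named_parameters(),
                             reloaded.named_parameters()):
    assert p.shape == q.shape, name
    assert torch.equal(p, q), name
```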
The real 20B model (42 layers, 6144-dim, vocab 131,072) must be trained
from random initialization on a TPU v5p pod using
`aksara-train/scripts/train_20b_pretrain.py`.
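
Until then, anything downloaded from this repo is the tiny preset. A
quick config check can catch the mix-up before you waste a download or a
job slot; this is a minimal sketch, assuming the config exposes the
usual `transformers` `hidden_size` field and that the custom
architecture needs `trust_remote_code=True`:

```python
# Hedged guard: make sure you did not pull the 2-layer smoke-test
# checkpoint while expecting the 20B model. Field names follow standard
# transformers conventions and are assumptions for this custom class.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained(
    "Ezekiel999/AksaraLLM-20B-Instruct", trust_remote_code=True
)
if cfg.hidden_size == 64:  # tiny preset: 2 layers, 64-dim, vocab 256
    raise RuntimeError("This is the smoke-test checkpoint, not the 20B model")
```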