---
license: apache-2.0
language:
- id
- ms
- jv
- su
- en
tags:
- aksarallm
- indonesian
- llama
- from-scratch
- pretraining
library_name: transformers
pipeline_tag: text-generation
---

# AksaraLLM 20B (dense)

> **Status: architecture + tokenizer published. Weights are NOT YET trained.**
> This repository currently holds the architecture config and tokenizer. The
> from-scratch pretraining run is blocked on TRC v5p-128 approval; see
> [Roadmap](#roadmap) below.

AksaraLLM 20B is a **from-scratch, Indonesian-first** decoder-only transformer
designed to serve Indonesian (`id`), Malay (`ms`), Javanese (`jv`), and
Sundanese (`su`), with English (`en`) and source code as secondary.

## Architecture

| Field | Value |
|---|---|
| Family | LLaMA-3-style decoder-only transformer |
| Parameters | **20,359,673,856** (20.36 B, with tied embeddings) |
| Hidden size | 6,144 |
| FFN inner size | 20,480 (SwiGLU) |
| Layers | 42 |
| Attention heads | 48 query / 8 KV (GQA, 6:1) |
| Head dim | 128 |
| Vocab size | 131,072 (byte-level BPE) |
| Positional encoding | RoPE, θ = 1,000,000 |
| Context (pretrain) | 8,192 |
| Context (YaRN extension) | 32,768 |
| Context (inference target) | 131,072 |
| Norm | RMSNorm |
| Embeddings | tied |
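
There is no `config.json` to load yet, so the table above is the spec. As a
cross-check, the sketch below restates it as a standard `transformers.LlamaConfig`
and re-derives the headline parameter count analytically (instantiating the model
would allocate ~20 B random weights, so we don't). Fields not listed in the table,
such as `rms_norm_eps`, are left at library defaults; the config published with
the weights will be authoritative.

```python
from transformers import LlamaConfig

# The Architecture table, expressed as a standard transformers Llama config.
config = LlamaConfig(
    vocab_size=131_072,
    hidden_size=6_144,
    intermediate_size=20_480,
    num_hidden_layers=42,
    num_attention_heads=48,          # head dim = 6144 / 48 = 128
    num_key_value_heads=8,           # GQA, 48:8 = 6:1
    max_position_embeddings=8_192,   # pretrain context; 32,768 after YaRN
    rope_theta=1_000_000.0,
    tie_word_embeddings=True,
)

# Re-derive the parameter count (no biases, tied embeddings).
d, f, n, v = 6_144, 20_480, 42, 131_072
kv = 8 * 128                # total width of the K/V projections
per_layer = (
    2 * d * d + 2 * d * kv  # attention: q/o plus k/v projections
    + 3 * d * f             # SwiGLU MLP: gate, up, down
    + 2 * d                 # two RMSNorm weight vectors
)
total = v * d + n * per_layer + d   # + final RMSNorm
print(f"{total:,}")  # -> 20,359,673,856, matching the table
```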

## Tokenizer

The tokenizer is already published at
[`Ezekiel999/aksara-tokenizer-20b`](https://huggingface.co/Ezekiel999/aksara-tokenizer-20b)
and mirrored here.

**Fertility** (average tokens per word, measured on held-out samples):

| Language | Source | Tokens/word | Target |
|---|---|---|---|
| English | FineWeb | 1.280 | ≤ 1.40 |
| Indonesian | Wikipedia | 1.357 | ≤ 1.60 |
| Indonesian | CulturaX web | 1.215 | ≤ 1.60 |
| Malay | Wikipedia | 1.368 | ≤ 1.60 |
| Javanese | Wikipedia | 1.657 | ≤ 1.80 |
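
The held-out evaluation script is not part of this repo, but the metric is
simple to reproduce. A minimal sketch, assuming whitespace word segmentation
and an illustrative sample sentence (the table above uses held-out corpus
samples, not this text):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Ezekiel999/aksara-tokenizer-20b")

def fertility(texts: list[str]) -> float:
    """Average number of subword tokens per whitespace-delimited word."""
    n_tokens = sum(len(tok(t, add_special_tokens=False).input_ids) for t in texts)
    n_words = sum(len(t.split()) for t in texts)
    return n_tokens / n_words

print(f"{fertility(['Model bahasa untuk Indonesia, Malaysia, Jawa, dan Sunda.']):.3f}")
```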

## Roadmap

| Phase | Status | Compute | Target date |
|---|---|---|---|
| 1. Architecture + tokenizer | ✅ **Done** | CPU | 2026-04 |
| 2. Corpus build (400–600 B tokens) | 🔄 In progress | v6e-8 | 2026-05 |
| 3. Pretrain phase 1 (8k context, 400 B tokens) | ⏸ Blocked on TRC v5p-128 | v5p-128, 4–5 weeks | 2026-06 |
| 4. YaRN context extension (32k) | ⏳ Pending | v5p-128, ~4 days | 2026-07 |
| 5. SFT | ⏳ Pending | v5p-64 or v6e-8 | 2026-07 |
| 6. DPO / ORPO | ⏳ Pending | v5p-64 or v6e-8 | 2026-07 |
| 7. Eval + release (GGUF) | ⏳ Pending | CPU | 2026-08 |
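
Phase 4 stretches the RoPE context window from 8k to 32k with YaRN. A rough
sketch of the corresponding config delta, assuming the release uses
`transformers`' built-in YaRN RoPE scaling (an assumption, not a published
decision) and reusing `config` from the Architecture sketch above:

```python
# Hypothetical phase-4 change: extend the 8k-pretrained RoPE to 32k via YaRN.
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,  # 32,768 / 8,192
    "original_max_position_embeddings": 8_192,
}
config.max_position_embeddings = 32_768
```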

## Usage (tokenizer only)

```python
from transformers import AutoTokenizer

# Only the tokenizer is usable today; the 20B weights are not yet trained.
tok = AutoTokenizer.from_pretrained("Ezekiel999/aksara-tokenizer-20b")
print(tok("Halo AksaraLLM", add_special_tokens=False).input_ids)
```

Weights will be published here once pretraining completes.

## Citation

```bibtex
@misc{aksarallm2026,
  title  = {AksaraLLM 20B: A From-Scratch Indonesian-First Language Model},
  author = {AksaraLLM Team},
  year   = {2026},
  url    = {https://huggingface.co/AksaraLLM/AksaraLLM-20B}
}
```

## License

Apache-2.0. Pretraining data attribution will be documented with the final weights.