TRC v5p-128 Application – AksaraLLM 20B
Apply at: https://sites.research.google/trc/about/ (click "Apply now"), or reply to your existing TRC onboarding email thread with the upgrade request.
Recommended ask: v5p-128 preemptible, 6 weeks, europe-west4-a (same zone as your current aksara-20b-v6e-8, which keeps data locality with gs://aksarallm20b-eu/).
Email body (copy-paste, edit the [bracketed] bits)
Subject: TRC upgrade request – v5p-128 for AksaraLLM 20B Indonesian pretrain (upgrading from v6e-8)
Hi TRC team,
I'm currently using aksara-20b-v6e-8 (europe-west4-a) under TRC and would like to request an upgrade to v5p-128 (preemptible, 6 weeks, europe-west4-a) for the pretrain phase of AksaraLLM 20B, a from-scratch Indonesian-first LLM. v6e-8 is sufficient for smoke tests and SFT, but it implies a 6–9 month wall-clock for a 20B pretrain on ~400–600B tokens, whereas v5p-128 lands that in 4–5 weeks at healthy MFU.

Project: AksaraLLM 20B – a LLaMA-3-style decoder-only transformer (GQA 48q/8kv, RoPE θ=1M, SwiGLU, RMSNorm, tied embeddings) targeting Indonesian, Malay, Javanese, and Sundanese, with English and code as secondary. Dense 20.36B params, 8,192 train context extending to 131,072 at inference via YaRN.
Readiness evidence (already built on v6e-8):
- Tokenizer live at https://huggingface.co/Ezekiel999/aksara-tokenizer-20b – 131,072 BPE vocab, fertility id=1.357, en=1.280, ms=1.368, jv=1.657 (all below targets)
- Pretrain runner (EasyDeL / JAX / Flax NNX, SPMD mesh, Orbax checkpointing, W&B) validated end-to-end on v6e-8: 20-step smoke test with loss decreasing from 11.83 to 11.61 at ~39k tok/s on a 200M proxy model, corpus streamed from gs://aksarallm20b-eu/smoke_parquet/
- Corpus build pipeline (FineWeb + FineWeb-2-id + CulturaX + Indo4B + Dolma + The-Stack-v2, with fastText LID, Gopher quality filters, MinHash-LSH dedup, 13-gram decontamination against IndoMMLU/xCOPA/XNLI-id/TyDiQA-id/MMLU/HellaSwag/ARC/GSM8K) is in code; we will use v6e-8 to produce the 400–600B-token Parquet corpus under gs://aksarallm20b-eu/pretrain/ while we wait for v5p
- GCP project aksarallm-tpu, co-located EU bucket gs://aksarallm20b-eu/ (12.16 GB sample corpus already uploaded)
- Repository: https://github.com/cahyohackids/AksaraLLM (branch devin/1776993538-20b-pipeline-fixes)

Compute plan for v5p-128:
- Phase 1 pretrain: 200k steps × 2 Mi tokens/step = 419B tokens at 8k context, ~4.5 weeks wall-clock at ~45% MFU
- Phase 2 YaRN context extension: 10k steps at 32k context, ~4 days
- Eval + smoke SFT validation: 2 days
Recovery plan for preemption: Orbax async sharded checkpoints every 500 steps (~1h) to gs://aksarallm20b-eu/ckpt/, with automatic resume. Expected preemption cost under 10% of wall-clock.

Open-source deliverables: Apache-2.0 base weights, SFT+DPO variants, and a technical report on the AksaraLLM Hugging Face org. This would be the first sizable from-scratch Indonesian 20B, explicitly covering the JV/SU/MS tails that are underrepresented in current multilingual models.

Grateful for the v6e-8 access so far – all of the readiness work above was done on it. Happy to share W&B run logs for the smoke test if useful.
Thanks,
[Your name]
[Affiliation / lab / company]
GitHub: https://github.com/cahyohackids
Hugging Face: https://huggingface.co/AksaraLLM
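If a reviewer asks how the fertility numbers in the email were measured, they are cheap to reproduce. A minimal sketch, assuming the tokenizer loads via transformers.AutoTokenizer; the whitespace-word denominator and the per-language sample files are assumptions, so exact values may differ from the measurement behind the quoted figures:

```python
# Minimal fertility check: mean subword tokens per whitespace word.
# The tokenizer repo is the real one from the email; the sample files
# below are hypothetical placeholders for held-out per-language text.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("Ezekiel999/aksara-tokenizer-20b")

def fertility(texts: list[str]) -> float:
    n_tokens = sum(len(tok.tokenize(t)) for t in texts)
    n_words = sum(len(t.split()) for t in texts)
    return n_tokens / max(n_words, 1)

for lang in ["id", "en", "ms", "jv"]:
    with open(f"samples/{lang}.txt", encoding="utf-8") as f:  # hypothetical paths
        lines = [l for l in f.read().splitlines() if l.strip()]
    print(f"{lang}: fertility={fertility(lines):.3f}")
```

Lower is better; values near 1.3 for id/en mean the 131k vocab is spending tokens efficiently on the target languages.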
Readiness packet (attach or link in the email)
| Artifact | Link / Location |
|---|---|
| Tokenizer | https://huggingface.co/Ezekiel999/aksara-tokenizer-20b |
| Architecture config | configs/aksara_20b_dense.json on branch |
| Pretrain runner | scripts/train_20b_pretrain.py on branch |
| Corpus builder | scripts/build_pretrain_corpus_v2.py on branch |
| Preflight gates | scripts/preflight_20b.py on branch |
| Execution plan | docs/aksara_20b_execution_plan.md on branch |
| Smoke-test log excerpt | step=0 loss=11.83 tok/s=33k, step=10 loss=11.61 tok/s=40k, clean exit |
| Current TPU | aksara-20b-v6e-8, europe-west4-a, READY |
| Bucket (co-located) | gs://aksarallm20b-eu/ (12.16 GB sample corpus + tokenizer + smoke parquet) |
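Before sending, it is worth sanity-checking the compute plan's arithmetic with the standard ~6·N·D FLOPs estimate for dense decoder pretraining. A minimal sketch; the sustained-throughput constant is a placeholder assumption (aggregate peak × MFU for the slice), not a quoted v5p-128 spec:

```python
# Sanity-check the compute plan: token count and rough wall-clock.
# Uses the standard ~6*N*D FLOPs estimate for dense decoder pretraining.
N_PARAMS = 20.36e9           # dense 20B model from the email
STEPS = 200_000
TOKENS_PER_STEP = 2 * 2**20  # 2 Mi tokens/step, as in the plan

tokens = STEPS * TOKENS_PER_STEP
train_flops = 6 * N_PARAMS * tokens

# ASSUMPTION: placeholder aggregate sustained FLOP/s (peak * ~45% MFU);
# substitute the real v5p-128 figure before quoting a wall-clock.
SUSTAINED_FLOPS = 2.0e16

days = train_flops / SUSTAINED_FLOPS / 86_400
print(f"tokens: {tokens / 1e9:.0f}B")        # ~419B, matches the plan
print(f"train FLOPs: {train_flops:.2e}")     # ~5.1e22
print(f"wall-clock: {days:.1f} days at the assumed throughput")
```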
Tips for approval
- Emphasize Indonesian-first + underrepresented SEA languages. TRC is more likely to approve open-science projects serving underrepresented languages than yet-another-English-LLM.
- Show the work is already ready to run β you have the tokenizer, the runner, and a validated smoke test. The ask is scale-out, not research.
- Preemptible is easier to get approved than on-demand. The runner already has resume logic (see the Orbax sketch after this list), so preemption is low-risk.
- 6 weeks is the honest ask. Asking for 12 weeks will get declined or trimmed; 4 weeks is too tight to leave margin for preemptions and the YaRN phase.
- Co-locate with europe-west4-a. You already have aksara-20b-v6e-8 there, along with gs://aksarallm20b-eu/. Don't ask for us-east or us-central – the TRC team prefers not to spread one project across zones.
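The resume logic referenced in the preemptible tip follows the usual Orbax CheckpointManager pattern. A minimal sketch, assuming a recent orbax-checkpoint release (the args-style save/restore API) and a pytree train_state; init_train_state and train_step are hypothetical stand-ins for the runner's own functions:

```python
# Preempt-resume sketch with Orbax async sharded checkpoints to GCS.
# The API shown (ocp.args.StandardSave/StandardRestore) tracks recent
# orbax-checkpoint releases and may differ in older versions.
import orbax.checkpoint as ocp

CKPT_DIR = "gs://aksarallm20b-eu/ckpt/"  # bucket path from the email

mngr = ocp.CheckpointManager(
    CKPT_DIR,
    options=ocp.CheckpointManagerOptions(
        save_interval_steps=500,  # matches the "every 500 steps" plan
        max_to_keep=3,            # bound GCS usage for sharded 20B states
    ),
)

train_state = init_train_state()  # hypothetical: build the fresh pytree

# On (re)start after a preemption, resume from the newest checkpoint.
start_step = 0
latest = mngr.latest_step()
if latest is not None:
    train_state = mngr.restore(latest, args=ocp.args.StandardRestore(train_state))
    start_step = latest + 1

for step in range(start_step, 200_000):
    train_state = train_step(train_state)  # hypothetical update fn
    # save() is a no-op off-interval; saves run async in the background.
    mngr.save(step, args=ocp.args.StandardSave(train_state))
mngr.wait_until_finished()  # flush pending async saves before exit
```

With a 500-step interval at ~1h per interval, a preemption costs at most one interval of lost work plus restart time, which is where the "under 10% of wall-clock" estimate in the email comes from.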