AksaraLLM-20B / docs /TRC_v5p_application.md

Ezekiel999

Add TRC v5p-128 application draft

bfd0211 verified 30 days ago


	# TRC v5p-128 Application — AksaraLLM 20B

	Apply at: https://sites.research.google/trc/about/ (click "Apply now"), or reply to your existing TRC onboarding email thread with the upgrade request.

	Recommended ask: v5p-128 preemptible, 6 weeks, `europe-west4-a` (same zone as your current `aksara-20b-v6e-8`, keeps data-locality with `gs://aksarallm20b-eu/`).

	---

	## Email body (copy-paste, edit the `[bracketed]` bits)

	> Subject: TRC upgrade request — v5p-128 for AksaraLLM 20B Indonesian pretrain (from v6e-8 current)
	>
	> Hi TRC team,
	>
	> I'm currently using `aksara-20b-v6e-8` (europe-west4-a) under TRC and would like to request an upgrade to v5p-128 (preemptible, 6 weeks, europe-west4-a) for the pretraining phase of AksaraLLM 20B, a from-scratch Indonesian-first LLM. The v6e-8 is sufficient for smoke tests and SFT, but a 20B pretrain on ~400–600B tokens would take an estimated 6–9 months of wall-clock time on it, whereas we estimate 4–5 weeks on v5p-128 at ~45% MFU.
	>
	> Project: AksaraLLM 20B — a LLaMA-3-style decoder-only transformer (GQA 48q/8kv, RoPE θ=1M, SwiGLU, RMSNorm, tied embeddings) targeting Indonesian, Malay, Javanese, and Sundanese with English and code as secondary. Dense 20.36B params, 8,192 train context extending to 131,072 at inference via YaRN.
	>
	> Readiness evidence (already built on v6e-8):
	> - Tokenizer live at https://huggingface.co/Ezekiel999/aksara-tokenizer-20b — 131,072 BPE vocab, fertility id=1.357, en=1.280, ms=1.368, jv=1.657 (all below targets)
	> - Pretrain runner (EasyDeL / JAX / Flax NNX, SPMD mesh, Orbax checkpointing, W&B) validated end-to-end on v6e-8: 20-step smoke test with loss decreasing 11.83→11.61 at ~39k tok/s on a 200M proxy model, corpus streamed from `gs://aksarallm20b-eu/smoke_parquet/`
	> - Corpus build pipeline (FineWeb + FineWeb-2-id + CulturaX + Indo4B + Dolma + The-Stack-v2, with fastText LID, Gopher quality filters, MinHash-LSH dedup, 13-gram decontamination against IndoMMLU/xCOPA/XNLI-id/TyDiQA-id/MMLU/HellaSwag/ARC/GSM8K) is in code; we will use v6e-8 to produce the 400–600B-token Parquet corpus under `gs://aksarallm20b-eu/pretrain/` while we wait for v5p.
	> - GCP project `aksarallm-tpu`, co-located EU bucket `gs://aksarallm20b-eu/` (12.16 GB sample corpus already uploaded)
	> - Repository: https://github.com/cahyohackids/AksaraLLM (branch `devin/1776993538-20b-pipeline-fixes`)
	>
	> Compute plan for v5p-128:
	> - Phase 1 pretrain: 200k steps × 2 Mi tokens/step = 419B tokens at 8k context, ~4.5 weeks wall-clock at ~45% MFU
	> - Phase 2 YaRN context extension: 10k steps at 32k context, ~4 days
	> - Eval + smoke SFT validation: 2 days
	>
	> Recovery plan for preemption: Orbax async sharded checkpoints every 500 steps (~1.9 h at the Phase 1 rate of 200k steps in ~4.5 weeks) to `gs://aksarallm20b-eu/ckpt/`, with automatic resume from the latest checkpoint. Expected preemption cost is under 10% of wall-clock.
	>
	> Open-source deliverables: Apache-2.0 base weights, SFT and DPO variants, and a technical report, all published under the `AksaraLLM` org on Hugging Face. To our knowledge this would be the first from-scratch Indonesian-first model at the 20B scale, and it explicitly covers the Javanese, Sundanese, and Malay tails that are underrepresented in current multilingual models.
	>
	> Grateful for the v6e-8 access so far — the readiness work above was all done on it. Happy to share W&B run logs for the smoke test if useful.
	>
	> Thanks,
	> [Your name]
	> [Affiliation / lab / company]
	> GitHub: https://github.com/cahyohackids
	> Hugging Face: https://huggingface.co/AksaraLLM
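
Before sending, the Phase 1 numbers in the compute plan can be sanity-checked with a few lines of arithmetic. The sketch below is not part of the application; it reproduces the token budget from the email and exposes the wall-clock estimate as a function. It assumes the common `6 * N * D` approximation for dense-transformer training FLOPs. The per-chip peak throughput and the chip count of the slice are deliberately left as arguments, because neither is stated in this draft.

```python
# Back-of-envelope check of the Phase 1 budget quoted in the email.
STEPS = 200_000
TOKENS_PER_STEP = 2 * 2**20   # "2 Mi tokens/step"
PARAMS = 20.36e9              # dense parameter count from the email


def total_tokens(steps: int, tokens_per_step: int) -> int:
    """Tokens consumed over the whole run."""
    return steps * tokens_per_step


def training_flops(params: float, tokens: float) -> float:
    """Assumption: the usual 6*N*D estimate for dense training FLOPs."""
    return 6.0 * params * tokens


def wall_clock_weeks(flops: float, peak_flops_per_chip: float,
                     n_chips: int, mfu: float) -> float:
    """Weeks needed at a sustained rate of peak * chips * MFU."""
    sustained = peak_flops_per_chip * n_chips * mfu
    return flops / sustained / (7 * 24 * 3600)


tokens = total_tokens(STEPS, TOKENS_PER_STEP)
print(f"{tokens / 1e9:.1f}B tokens")  # 419.4B tokens
```

To reproduce the ~4.5 week figure, call `wall_clock_weeks(training_flops(PARAMS, tokens), peak_flops_per_chip=..., n_chips=..., mfu=0.45)` with values taken from the Cloud TPU documentation. The estimate is linear in both the per-chip peak and the number of chips, so it is worth confirming how many chips a v5p-128 slice actually contains before quoting a duration.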

	---

	## Readiness packet (attach or link in the email)

	| Artifact | Link / Location |
	|---|---|
	| Tokenizer | https://huggingface.co/Ezekiel999/aksara-tokenizer-20b |
	| Architecture config | `configs/aksara_20b_dense.json` on branch |
	| Pretrain runner | `scripts/train_20b_pretrain.py` on branch |
	| Corpus builder | `scripts/build_pretrain_corpus_v2.py` on branch |
	| Preflight gates | `scripts/preflight_20b.py` on branch |
	| Execution plan | `docs/aksara_20b_execution_plan.md` on branch |
	| Smoke-test log excerpt | `step=0 loss=11.83 tok/s=33k`, `step=10 loss=11.61 tok/s=40k`, clean exit |
	| Current TPU | `aksara-20b-v6e-8`, europe-west4-a, READY |
	| Bucket (co-located) | `gs://aksarallm20b-eu/` (12.16 GB sample corpus + tokenizer + smoke parquet) |
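
If a reviewer asks how the tokenizer fertility figures in the email were obtained, a short reproducible snippet is useful to have on hand. The sketch below assumes fertility means tokens per whitespace-separated word, which is a common definition but is not spelled out in this draft. `toy_tokenize` is a hypothetical stand-in so the example runs without downloading anything; for real numbers you would pass the published tokenizer's tokenize function instead.

```python
from typing import Callable, Iterable, List


def fertility(texts: Iterable[str],
              tokenize: Callable[[str], List[str]]) -> float:
    """Average number of tokens per whitespace-separated word."""
    n_tokens = 0
    n_words = 0
    for text in texts:
        n_words += len(text.split())
        n_tokens += len(tokenize(text))
    if n_words == 0:
        raise ValueError("sample contains no words")
    return n_tokens / n_words


def toy_tokenize(text: str) -> List[str]:
    """Hypothetical tokenizer: cut every word into chunks of <= 4 chars."""
    return [w[i:i + 4] for w in text.split() for i in range(0, len(w), 4)]


sample = ["saya makan nasi goreng", "selamat pagi"]
print(fertility(sample, toy_tokenize))  # 1.5
```

Fertility depends heavily on the evaluation text, so any number quoted to TRC should name the held-out sample it was measured on.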

	---

	## Tips for approval

	1. Emphasize Indonesian-first + underrepresented SEA languages. TRC is more likely to approve open-science projects serving underrepresented languages than yet-another-English-LLM.
	2. Show the work is already ready to run — you have the tokenizer, the runner, and a validated smoke test. The ask is scale-out, not research.
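	3. Preemptible is easier to get approved than on-demand, and the runner already has checkpoint-and-resume logic, so preemption is an acceptable trade-off.
	4. 6 weeks is the honest ask. The compute plan sums to about 5.4 weeks (~4.5 weeks pretrain + ~4 days YaRN + 2 days eval), so 4 weeks is too tight and 6 weeks leaves margin for preemption. Asking for 12 weeks is likely to be declined or trimmed.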
	5. Co-locate with europe-west4-a. You already have `aksara-20b-v6e-8` there and `gs://aksarallm20b-eu/`. Don't ask for us-east or us-central — the TRC team prefers not to spread one project across zones.
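
Tips 3 and 4 both rest on the email's claim that preemption costs under 10% of wall-clock. The sketch below is one simple way to check that claim, not a measured result. It assumes each preemption loses, on average, half a checkpoint interval of work plus a fixed restart time. The restart time and preemption rate are hypothetical inputs that you would replace with values observed on your own slice.

```python
def preemption_overhead(ckpt_interval_h: float, restart_h: float,
                        preemptions_per_day: float) -> float:
    """Fraction of wall-clock lost to preemptions under the simple model."""
    lost_per_event_h = ckpt_interval_h / 2 + restart_h
    return preemptions_per_day * lost_per_event_h / 24.0


def max_preemptions_per_day(ckpt_interval_h: float, restart_h: float,
                            budget: float = 0.10) -> float:
    """Preemption rate at which the overhead reaches `budget`."""
    return budget * 24.0 / (ckpt_interval_h / 2 + restart_h)


# Illustrative inputs only: 1 h between checkpoints, 15 min to restart.
print(preemption_overhead(1.0, 0.25, preemptions_per_day=1.0))  # 0.03125
print(max_preemptions_per_day(1.0, 0.25))
```

Plugging in the actual checkpoint interval from the recovery plan shows how many preemptions per day the 10% budget tolerates, and whether checkpointing more often is worth the extra I/O.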
