# TRC v5p-128 Application: AksaraLLM 20B

**Apply at:** https://sites.research.google/trc/about/ (click "Apply now"), or reply to your existing TRC onboarding email thread with the upgrade request.

**Recommended ask:** v5p-128 preemptible, 6 weeks, `europe-west4-a` (the same zone as your current `aksara-20b-v6e-8`, which keeps data locality with `gs://aksarallm20b-eu/`).

---

## Email body (copy-paste, edit the `[bracketed]` bits)
> **Subject:** TRC upgrade request: v5p-128 for AksaraLLM 20B Indonesian pretrain (currently on v6e-8)
>
> Hi TRC team,
>
> I'm currently using `aksara-20b-v6e-8` (europe-west4-a) under TRC and would like to request an upgrade to **v5p-128 (preemptible, 6 weeks, europe-west4-a)** for the pretrain phase of **AksaraLLM 20B**, a from-scratch Indonesian-first LLM. v6e-8 is sufficient for smoke tests and SFT, but a 20B pretrain on ~400–600B tokens would take 6–9 months of wall-clock time on it, whereas v5p-128 lands that in 4–5 weeks at healthy MFU.
>
> **Project:** AksaraLLM 20B is a LLaMA-3-style decoder-only transformer (GQA 48q/8kv, RoPE θ=1M, SwiGLU, RMSNorm, tied embeddings) targeting Indonesian, Malay, Javanese, and Sundanese, with English and code as secondary. Dense 20.36B params, 8,192 train context extending to 131,072 at inference via YaRN.
>
> **Readiness evidence (already built on v6e-8):**
> - Tokenizer live at https://huggingface.co/Ezekiel999/aksara-tokenizer-20b, with a 131,072-entry BPE vocab and fertility id=1.357, en=1.280, ms=1.368, jv=1.657 (all below targets)
> - Pretrain runner (EasyDeL / JAX / Flax NNX, SPMD mesh, Orbax checkpointing, W&B) validated end-to-end on v6e-8: 20-step smoke test with loss decreasing 11.83→11.61 at ~39k tok/s on a 200M proxy model, corpus streamed from `gs://aksarallm20b-eu/smoke_parquet/`
> - Corpus build pipeline (FineWeb + FineWeb-2-id + CulturaX + Indo4B + Dolma + The-Stack-v2, with fastText LID, Gopher quality filters, MinHash-LSH dedup, 13-gram decontamination against IndoMMLU/xCOPA/XNLI-id/TyDiQA-id/MMLU/HellaSwag/ARC/GSM8K) is in code; we will use v6e-8 to produce the 400–600B-token Parquet corpus under `gs://aksarallm20b-eu/pretrain/` while we wait for v5p.
> - GCP project `aksarallm-tpu`, co-located EU bucket `gs://aksarallm20b-eu/` (12.16 GB sample corpus already uploaded)
> - Repository: https://github.com/cahyohackids/AksaraLLM (branch `devin/1776993538-20b-pipeline-fixes`)
>
> **Compute plan for v5p-128:**
> - Phase 1 pretrain: 200k steps × 2 Mi tokens/step = 419B tokens at 8k context, ~4.5 weeks wall-clock at ~45% MFU
> - Phase 2 YaRN context extension: 10k steps at 32k context, ~4 days
> - Eval + smoke SFT validation: 2 days
>
> **Recovery plan for preemption:** Orbax async sharded checkpoints every 500 steps (~1 h) to `gs://aksarallm20b-eu/ckpt/`, with automatic resume. Expected preemption cost is under 10% of wall-clock.
>
> **Open-source deliverables:** Apache-2.0 base weights, SFT+DPO variants, and a technical report on the Hugging Face `AksaraLLM/` org. First sizable Indonesian from-scratch 20B, explicitly covering the JV/SU/MS tails that are underrepresented in current multilingual models.
>
> Grateful for the v6e-8 access so far; the readiness work above was all done on it. Happy to share W&B run logs for the smoke test if useful.
>
> Thanks,
> [Your name]
> [Affiliation / lab / company]
> GitHub: https://github.com/cahyohackids
> Hugging Face: https://huggingface.co/AksaraLLM
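
Before sending, it may help to sanity-check the Phase 1 line of the compute plan. The sketch below recomputes the token count from the step count and batch size, then estimates wall-clock time with the common 6 × params × tokens rule of thumb for dense-transformer training FLOPs. The rule of thumb, the per-chip peak throughput, and both candidate chip counts are assumptions that do not come from the email. Confirm them against the Cloud TPU documentation for the slice you are actually granted, and treat the result as a rough estimate, not a measurement.

```python
# Sanity check for the Phase 1 line of the compute plan.
# ASSUMPTIONS (not stated in the email): the 6 * params * tokens rule of thumb
# for dense-transformer training FLOPs, and the chip count and per-chip peak
# FLOP/s below, which are illustrative inputs to confirm against Cloud TPU docs.

SECONDS_PER_WEEK = 7 * 24 * 3600


def phase1_tokens(steps: int, tokens_per_step: int) -> int:
    """Total tokens consumed by a fixed-batch pretraining run."""
    return steps * tokens_per_step


def train_weeks(params: float, tokens: float, n_chips: int,
                peak_flops_per_chip: float, mfu: float) -> float:
    """Estimated wall-clock weeks, ignoring preemption and restarts."""
    total_flops = 6.0 * params * tokens
    sustained_flops_per_s = n_chips * peak_flops_per_chip * mfu
    return total_flops / sustained_flops_per_s / SECONDS_PER_WEEK


# 200k steps at 2 Mi tokens per step, as in the email.
tokens = phase1_tokens(steps=200_000, tokens_per_step=2 * 1024 * 1024)
print(f"Phase 1 tokens: {tokens / 1e9:.1f}B")  # Phase 1 tokens: 419.4B

ASSUMED_PEAK_FLOPS = 459e12  # hypothetical per-chip bf16 peak; verify before use
for assumed_chips in (64, 128):  # chip count of the slice is an assumption
    weeks = train_weeks(20.36e9, tokens, assumed_chips, ASSUMED_PEAK_FLOPS, mfu=0.45)
    print(f"{assumed_chips} chips: {weeks:.1f} weeks")
```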
---
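
The email names 13-gram decontamination against the evaluation benchmarks but does not say how it works. As an illustration only, a minimal version could look like the sketch below. It assumes whitespace tokenization, lowercasing, and dropping a document on any single shared 13-gram. The function names are hypothetical, and the project's actual `scripts/build_pretrain_corpus_v2.py` may normalize text or score overlap differently.

```python
# Minimal n-gram decontamination sketch.
# ASSUMPTIONS: whitespace tokenization, lowercase normalization, and a
# "drop on any shared n-gram" policy; all helper names are hypothetical.
from typing import Iterable, Set, Tuple

NGRAM = 13  # window size quoted in the email

Gram = Tuple[str, ...]


def ngrams(text: str, n: int = NGRAM) -> Set[Gram]:
    """Set of word n-grams in text; empty if the text has fewer than n words."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_benchmark_index(benchmark_texts: Iterable[str], n: int = NGRAM) -> Set[Gram]:
    """Union of all n-grams found in the benchmark texts."""
    index: Set[Gram] = set()
    for text in benchmark_texts:
        index |= ngrams(text, n)
    return index


def is_contaminated(doc: str, index: Set[Gram], n: int = NGRAM) -> bool:
    """True if the document shares at least one n-gram with the benchmarks."""
    return not ngrams(doc, n).isdisjoint(index)


# Toy demonstration with n=3 so the example stays short.
index = build_benchmark_index(["the quick brown fox jumps"], n=3)
print(is_contaminated("see the quick brown dog", index, n=3))
print(is_contaminated("a slow brown fox", index, n=3))
```

In a real pipeline the index would be built once from all benchmark test sets and then applied to each corpus shard as a filter.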
## Readiness packet (attach or link in the email)

| Artifact | Link / Location |
|---|---|
| Tokenizer | https://huggingface.co/Ezekiel999/aksara-tokenizer-20b |
| Architecture config | `configs/aksara_20b_dense.json` on branch |
| Pretrain runner | `scripts/train_20b_pretrain.py` on branch |
| Corpus builder | `scripts/build_pretrain_corpus_v2.py` on branch |
| Preflight gates | `scripts/preflight_20b.py` on branch |
| Execution plan | `docs/aksara_20b_execution_plan.md` on branch |
| Smoke-test log excerpt | `step=0 loss=11.83 tok/s=33k`, `step=10 loss=11.61 tok/s=40k`, clean exit |
| Current TPU | `aksara-20b-v6e-8`, europe-west4-a, READY |
| Bucket (co-located) | `gs://aksarallm20b-eu/` (12.16 GB sample corpus + tokenizer + smoke parquet) |
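
If a reviewer asks how the tokenizer fertility figures were obtained, a small reproducible script is a useful attachment. The sketch below assumes fertility means tokens per whitespace-delimited word, because the document does not state the exact definition. `toy_tokenize` is a hypothetical stand-in so the example runs without downloads. In practice you would pass the real tokenizer's tokenize function, for example the `tokenize` method of a Hugging Face tokenizer.

```python
# Tokenizer fertility sketch: average number of tokens per word.
# ASSUMPTIONS: "word" means a whitespace-delimited chunk, and toy_tokenize
# is a hypothetical stand-in for the real tokenizer.
from typing import Callable, Iterable, List


def fertility(texts: Iterable[str], tokenize: Callable[[str], List[str]]) -> float:
    """Total tokens divided by total whitespace-delimited words."""
    n_tokens = 0
    n_words = 0
    for text in texts:
        n_words += len(text.split())
        n_tokens += len(tokenize(text))
    if n_words == 0:
        raise ValueError("no words in input")
    return n_tokens / n_words


def toy_tokenize(text: str, chunk: int = 4) -> List[str]:
    """Split each word into pieces of at most `chunk` characters."""
    pieces: List[str] = []
    for word in text.split():
        pieces.extend(word[i:i + chunk] for i in range(0, len(word), chunk))
    return pieces


sample = ["saya makan nasi goreng", "selamat pagi"]
print(fertility(sample, toy_tokenize))  # 9 tokens over 6 words
```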
---

## Tips for approval

1. **Emphasize Indonesian-first + underrepresented SEA languages.** TRC is more likely to approve open-science projects serving underrepresented languages than yet-another-English-LLM.
2. **Show the work is already ready to run:** you have the tokenizer, the runner, and a validated smoke test. The ask is scale-out, not research.
3. **Preemptible is easier to get approved than on-demand.** The runner already has resume logic, so this is OK.
4. **6 weeks is the honest ask.** Asking for 12 weeks will get declined or trimmed; 4 weeks is too tight to include margin for preemption and the YaRN phase.
5. **Co-locate with europe-west4-a.** You already have `aksara-20b-v6e-8` there and `gs://aksarallm20b-eu/`. Don't ask for us-east or us-central: the TRC team prefers not to spread one project across zones.
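
The margin reasoning in tips 3 and 4 depends on how much wall-clock time preemption actually costs. A rough way to estimate it is sketched below. It assumes preemptions arrive uniformly at random within a checkpoint interval, so each one loses half an interval of work on average plus a fixed restart time. The restart time and the preemption rate are hypothetical inputs; replace them with values observed on your own slice.

```python
# Rough preemption-overhead estimate for a checkpointed training run.
# ASSUMPTIONS: preemptions land uniformly within a checkpoint interval
# (so half an interval is lost on average), and the restart time and
# preemption rate below are hypothetical placeholders.

def preemption_overhead(ckpt_interval_h: float, restart_h: float,
                        preemptions_per_day: float) -> float:
    """Expected fraction of wall-clock time lost to preemptions."""
    lost_per_preemption_h = ckpt_interval_h / 2.0 + restart_h
    return lost_per_preemption_h * preemptions_per_day / 24.0


# ~1 h checkpoint interval from the recovery plan; other inputs are guesses.
overhead = preemption_overhead(ckpt_interval_h=1.0, restart_h=0.25,
                               preemptions_per_day=2)
print(f"Estimated preemption overhead: {overhead * 100:.2f}%")
```

If the estimate comes out above the 10% budget stated in the email, shortening the checkpoint interval is the first lever to try.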