# Gravity-16B-A3B-Preview
Gravity-16B-A3B-Preview is a post-trained language model built on Gravity-16B-A3B-Base by Trillion Labs. Starting from the base model, it underwent context-length extension (32K → 128K), supervised fine-tuning (SFT), and reinforcement learning (GRPO) focused on science and code.
This is a preview release offering a strong balance of capability, efficiency, and long-context support for its size. We are actively working on agentic capabilities for the full release.
## Model Summary
| Property | Value |
|---|---|
| Base Model | Gravity-16B-A3B-Base |
| Total Parameters | 16.24B |
| Active Parameters | 3.16B |
| Architecture | GravityMoE |
| Context Length | 131,072 tokens (128K) |
| Precision | bf16 |
| License | Apache 2.0 |
For full architectural details (MLA, MoE routing, tokenizer, etc.), see the base model card.
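As a rough rule of thumb, bf16 stores two bytes per parameter, so the figures in the table above translate directly into a memory and sparsity estimate. The following back-of-envelope sketch (illustrative only, ignoring activations and KV cache) shows why the MoE design matters: all 16.24B parameters must fit in memory, but only ~19% participate in each forward pass.

```python
# Back-of-envelope estimate from the model summary figures (illustrative only).
TOTAL_PARAMS = 16.24e9   # total parameters
ACTIVE_PARAMS = 3.16e9   # parameters active per token (MoE routing)
BYTES_PER_PARAM = 2      # bf16 = 2 bytes per parameter

weights_gib = TOTAL_PARAMS * BYTES_PER_PARAM / 2**30
active_frac = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"Weights alone: ~{weights_gib:.1f} GiB in bf16")
print(f"Active per token: {active_frac:.1%} of all parameters")
```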
## Post-Training Pipeline
Starting from Gravity-16B-A3B-Base (pretrained on ~5.5T tokens):
- Context Length Extension: extended from 32K to 128K tokens.
- Supervised Fine-Tuning (SFT): instruction tuning for general chat and task-following capabilities.
- Reinforcement Learning (GRPO): single-step Group Relative Policy Optimization focused on the science and code domains.
Agentic RL and multi-turn RL stages are in progress and will be included in future releases.
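For intuition, GRPO scores a group of completions sampled for the same prompt and normalizes each reward against the group's mean and standard deviation, avoiding a learned value model. The reward functions and hyperparameters used for this model are not published, so the following is only a minimal sketch of the group-relative advantage computation itself:

```python
import statistics

def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages: z-score each reward within its sample group."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions sampled for one prompt, scored by a verifier
# (e.g. unit tests for code, answer matching for math).
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = grpo_advantages(rewards)
print(advantages)  # passing completions get positive advantage, failing ones negative
```

Completions that beat the group average are reinforced and the rest are penalized, which is what makes verifiable domains like science and code a natural fit for this stage.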
## Evaluation Results
| Category | Benchmark | Metric | Score |
|---|---|---|---|
| Math | AIME 2024 | acc | 43.3 |
| | GSM8K | acc | 91.8 |
| | MATH500 | acc | 88.6 |
| Code | HumanEval | pass@1 | 89.0 |
| | MBPP | pass@1 | 96.0 |
| | LiveCodeBench V6 | pass@1 | 41.0 |
| Knowledge | MMLU | acc | 80.1 |
| | MMLU-Pro | acc | 71.5 |
| | BBH | acc | 79.24 |
| Science | GPQA Diamond | acc | 55.1 |
| | ARC Challenge | acc | 92.32 |
| | ChemBench | acc | 68.6 |
| | Molang Bench (Editing) | SMILES validity / Tanimoto similarity / accuracy | 70.83 / 86.43 / 43.23 |
| | Molang Bench (Generation) | SMILES validity / Tanimoto similarity / accuracy | 35.96 / 43.24 / 1.69 |
| Instruction Following | IFEval | instruct-level loose | 84.53 |
| | IFBench | instruct-level loose | 46.51 |
| Agentic | Tau^2 (Telecom) | pass@1 | 71.93 |
| | SciCode | sub-problem level | 18.8 |
| | Terminal Bench | pass@1 | 21.25 |
| Long Context | AA-LCR | pass@1 | 21.0 |
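Several rows above report pass@1. When pass@1 (or pass@k generally) is estimated from multiple samples per problem, the standard unbiased estimator from the HumanEval paper is pass@k = 1 − C(n−c, k)/C(n, k), where n samples were drawn and c of them passed. A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples drawn, c of them correct."""
    if n - c < k:
        # Fewer failures than k: every size-k subset contains a correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=10, c=5, k=1))  # 0.5 -- half the samples pass, so pass@1 = 0.5
```

The per-problem estimates are then averaged over the benchmark to give the scores in the table.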
## Comparison with Moonlight-16B-A3B-Instruct
| Category | Benchmark | Metric | Gravity-16B-A3B-Preview | Moonlight-16B-A3B-Instruct |
|---|---|---|---|---|
| Math | GSM8K | acc | 91.8 | 77.4 |
| Code | HumanEval | pass@1 | 89.0 | 48.1 |
| | MBPP | pass@1 | 96.0 | 63.8 |
| Knowledge | MMLU | acc | 80.1 | 70.0 |
| | MMLU-Pro | acc | 71.5 | 42.4 |
| | BBH | acc | 79.24 | 65.2 |
Note: We include Moonlight-16B-A3B-Instruct for comparison because it is similar in size to our model; its scores are taken from its technical report.
With 3.16B active parameters, 128K context, and broad coverage across math, code, and knowledge benchmarks, the model offers a strong balance of capability and efficiency for its size.
Agentic benchmarks (multi-step tool use, code execution) are not yet a focus of this release. We are actively training on agentic tasks and will include those results in the next release.
## Quickstart

### Installation

```shell
pip install "transformers>=5.0" torch
```
### Using Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "trillionlabs/Gravity-16B-A3B-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    dtype=torch.bfloat16,  # `torch_dtype` was renamed to `dtype` in recent transformers releases
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Solve the equation: x^3 - 6x^2 + 11x - 6 = 0"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=1024, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
## Deployment

Note: We are working on upstreaming native GravityMoE support to SGLang. Until the PR is merged, please use the installation steps below.

### SGLang

Install SGLang from the sglang-gravity fork:

```shell
pip install "sglang[all] @ git+https://github.com/trillion-labs/sglang-gravity.git#subdirectory=python"
```
Launch the server:
```shell
python3 -m sglang.launch_server \
  --model-path trillionlabs/Gravity-16B-A3B-Preview \
  --host 0.0.0.0 \
  --port 30000 \
  --tp 8 \
  --trust-remote-code \
  --moe-runner-backend triton \
  --tool-call-parser glm45 \
  --reasoning-parser glm45 \
  --dtype bfloat16
```
Send a request:
```shell
curl http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "trillionlabs/Gravity-16B-A3B-Preview",
    "messages": [{"role": "user", "content": "What is the capital of South Korea?"}],
    "max_tokens": 128,
    "temperature": 0.7
  }'
```
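The same OpenAI-compatible request can be built and sent from Python with only the standard library. This sketch assumes a server listening on localhost:30000 as launched above; adjust `API_URL` for your deployment.

```python
import json
import urllib.request

API_URL = "http://localhost:30000/v1/chat/completions"  # adjust for your deployment

def build_request(prompt: str) -> urllib.request.Request:
    """Construct an OpenAI-style chat completion request for the SGLang server."""
    payload = {
        "model": "trillionlabs/Gravity-16B-A3B-Preview",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def ask(prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Example (requires a running server):
# print(ask("What is the capital of South Korea?"))
```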
## Limitations
- This is a preview release. Agentic and multi-turn capabilities are under active development.
- The model may generate factually incorrect, biased, or harmful content.
- Performance may degrade on languages not well-represented in the training data.
## Acknowledgements
This model was developed as part of a collaborative research initiative led by Lunit and Trillion Labs, with a focus on advancing foundation models for science and healthcare.
- Lunit: project lead and medical AI research
- Trillion Labs: model architecture, pretraining, and infrastructure
- Aigen Science: biomedical AI and drug discovery research
- SK Biopharmaceuticals: AI-driven drug development and digital healthcare advisory
- Kakao Healthcare: medical data standardization and platform support
We also thank the following participating institutions for their contributions: KAIST (Yoonjae Choi, Taekyun Kim, Jong Chul Ye, Hyunwoo Kim, Seunghoon Hong), Seoul National University (Yousung Jung), Rebellions, Standigm, NHIS Ilsan Hospital, Yongin Severance Hospital, Gangdong Kyung Hee University Hospital, Kyung Hee University Medical Center, Korea University, Konyang University Hospital, Ewha Womans University Seoul Hospital, Keimyung University Dongsan Medical Center, Pusan National University Yangsan Hospital, and D-Circle.
This work was supported by the AI Specialized Foundation Model Project, funded by the Ministry of Science and ICT (MSIT) and managed by the National IT Industry Promotion Agency (NIPA).
## License
This model is released under the Apache 2.0 License.
## Citation

```bibtex
@misc{gravity-preview-2026,
  title={Gravity-16B-A3B-Preview},
  author={Trillion Labs},
  year={2026},
  url={https://huggingface.co/trillionlabs/Gravity-16B-A3B-Preview}
}
```
## Contact

- Trillion Labs: trillionlabs.co
- Lunit: lunit.io