# Gravity-16B-A3B-Preview
Gravity-16B-A3B-Preview is a post-trained language model built on Gravity-16B-A3B-Base by Trillion Labs. Starting from the base model, it underwent context-length extension (32K → 128K), supervised fine-tuning (SFT), and reinforcement learning (GRPO) focused on science and code.
This is a preview release offering a strong balance of capability, efficiency, and long-context support for its size. We are actively working on agentic capabilities for the full release.
## Model Summary
| Property | Value |
|---|---|
| Base Model | Gravity-16B-A3B-Base |
| Total Parameters | 16.24B |
| Active Parameters | 3.16B |
| Architecture | GravityMoE |
| Context Length | 131,072 tokens (128K) |
| Precision | bf16 |
| License | Apache 2.0 |
For full architectural details (MLA, MoE routing, tokenizer, etc.), see the base model card.
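As a rough rule of thumb, bf16 stores two bytes per parameter, so the figures in the table above translate directly into a memory and sparsity estimate. The following back-of-envelope sketch (illustrative only, ignoring activations and KV cache) shows why the MoE design matters: all 16.24B parameters must fit in memory, but only ~19% participate in each forward pass.

```python
# Back-of-envelope estimate from the model summary figures (illustrative only).
TOTAL_PARAMS = 16.24e9   # total parameters
ACTIVE_PARAMS = 3.16e9   # parameters active per token (MoE routing)
BYTES_PER_PARAM = 2      # bf16 = 2 bytes per parameter

weights_gib = TOTAL_PARAMS * BYTES_PER_PARAM / 2**30
active_frac = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"Weights alone: ~{weights_gib:.1f} GiB in bf16")
print(f"Active per token: {active_frac:.1%} of all parameters")
```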
## Post-Training Pipeline
Starting from Gravity-16B-A3B-Base (pretrained on ~5.5T tokens):
- Context Length Extension: extended from 32K to 128K tokens.
- Supervised Fine-Tuning (SFT): instruction tuning for general chat and task-following capabilities.
- Reinforcement Learning (GRPO): single-step Group Relative Policy Optimization focused on the science and code domains.
Agentic RL and multi-turn RL stages are in progress and will be included in future releases.
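For intuition, GRPO scores a group of completions sampled for the same prompt and normalizes each reward against the group's mean and standard deviation, avoiding a learned value model. The reward functions and hyperparameters used for this model are not published, so the following is only a minimal sketch of the group-relative advantage computation itself:

```python
import statistics

def grpo_advantages(rewards, eps=1e-6):
    """Group-relative advantages: z-score each reward within its sample group."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)  # population std over the group
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions sampled for one prompt, scored by a verifier
# (e.g. unit tests for code, answer matching for math).
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = grpo_advantages(rewards)
print(advantages)  # passing completions get positive advantage, failing ones negative
```

Completions that beat the group average are reinforced and the rest are penalized, which is what makes verifiable domains like science and code a natural fit for this stage.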
## Evaluation Results
| Category | Benchmark | Metric | Score |
|---|---|---|---|
| Math | AIME 2024 | acc | 43.3 |
| | GSM8K | acc | 91.8 |
| | MATH500 | acc | 88.6 |
| Code | HumanEval | pass@1 | 89.0 |
| | MBPP | pass@1 | 96.0 |
| | LiveCodeBench V6 | pass@1 | 41.0 |
| Knowledge | MMLU | acc | 80.1 |
| | MMLU-Pro | acc | 71.5 |
| | BBH | acc | 79.24 |
| Science | GPQA Diamond | acc | 55.1 |
| | ARC Challenge | acc | 92.32 |
| | ChemBench | acc | 68.6 |
| | Molang Bench (Editing) | SMILES validity / Tanimoto similarity / accuracy | 70.83 / 86.43 / 43.23 |
| | Molang Bench (Generation) | SMILES validity / Tanimoto similarity / accuracy | 35.96 / 43.24 / 1.69 |
| Instruction Following | IFEval | instruct-level loose | 84.53 |
| | IFBench | instruct-level loose | 46.51 |
| Agentic | Tau^2 (Telecom) | pass@1 | 71.93 |
| | SciCode | sub-problem level | 18.8 |
| | Terminal Bench | pass@1 | 21.25 |
| Long Context | AA-LCR | pass@1 | 21.0 |
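Several rows above report pass@1. When pass@1 (or pass@k generally) is estimated from multiple samples per problem, the standard unbiased estimator from the HumanEval paper is pass@k = 1 − C(n−c, k)/C(n, k), where n samples were drawn and c of them passed. A minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: n samples drawn, c of them correct."""
    if n - c < k:
        # Fewer failures than k: every size-k subset contains a correct sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=10, c=5, k=1))  # 0.5 -- half the samples pass, so pass@1 = 0.5
```

The per-problem estimates are then averaged over the benchmark to give the scores in the table.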
## Comparison with Moonlight-16B-A3B-Instruct
| Category | Benchmark | Metric | Gravity-16B-A3B-Preview | Moonlight-16B-A3B-Instruct |
|---|---|---|---|---|
| Math | GSM8K | acc | 91.8 | 77.4 |
| Code | HumanEval | pass@1 | 89.0 | 48.1 |
| | MBPP | pass@1 | 96.0 | 63.8 |
| Knowledge | MMLU | acc | 80.1 | 70.0 |
| | MMLU-Pro | acc | 71.5 | 42.4 |
| | BBH | acc | 79.24 | 65.2 |
Note: We include Moonlight-16B-A3B-Instruct for comparison because it is similar in size to our model; its scores are taken from its technical report.
With 3.16B active parameters, 128K context, and broad coverage across math, code, and knowledge benchmarks, the model offers a strong balance of capability and efficiency for its size.
Agentic benchmarks (multi-step tool use, code execution) are not yet a focus of this release. We are actively training on agentic tasks and will include those results in the next release.
## Quickstart

### Installation

```shell
pip install "transformers>=5.0" torch
```
### Using Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "trillionlabs/Gravity-16B-A3B-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    dtype=torch.bfloat16,  # `torch_dtype` was renamed to `dtype` in recent transformers releases
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Solve the equation: x^3 - 6x^2 + 11x - 6 = 0"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=1024, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
## Deployment

Note: We are working on upstreaming native GravityMoE support to SGLang. Until the PR is merged, please use the installation steps below.

### SGLang

Install SGLang from the sglang-gravity fork:

```shell
pip install "sglang[all] @ git+https://github.com/trillion-labs/sglang-gravity.git#subdirectory=python"
```
Launch the server:
```shell
python3 -m sglang.launch_server \
  --model-path trillionlabs/Gravity-16B-A3B-Preview \
  --host 0.0.0.0 \
  --port 30000 \
  --tp 8 \
  --trust-remote-code \
  --moe-runner-backend triton \
  --tool-call-parser glm45 \
  --reasoning-parser glm45 \
  --dtype bfloat16
```
Send a request:
```shell
curl http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "trillionlabs/Gravity-16B-A3B-Preview",
    "messages": [{"role": "user", "content": "What is the capital of South Korea?"}],
    "max_tokens": 128,
    "temperature": 0.7
  }'
```
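The same OpenAI-compatible request can be built and sent from Python with only the standard library. This sketch assumes a server listening on localhost:30000 as launched above; adjust `API_URL` for your deployment.

```python
import json
import urllib.request

API_URL = "http://localhost:30000/v1/chat/completions"  # adjust for your deployment

def build_request(prompt: str) -> urllib.request.Request:
    """Construct an OpenAI-style chat completion request for the SGLang server."""
    payload = {
        "model": "trillionlabs/Gravity-16B-A3B-Preview",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
        "temperature": 0.7,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

def ask(prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Example (requires a running server):
# print(ask("What is the capital of South Korea?"))
```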
## Limitations
- This is a preview release. Agentic and multi-turn capabilities are under active development.
- The model may generate factually incorrect, biased, or harmful content.
- Performance may degrade on languages not well-represented in the training data.
## Acknowledgements
This model was developed as part of a collaborative research initiative led by Lunit and Trillion Labs, with a focus on advancing foundation models for science and healthcare.
- Lunit: project lead and medical AI research
- Trillion Labs: model architecture, pretraining, and infrastructure
- Aigen Science: biomedical AI and drug discovery research
- SK Biopharmaceuticals: AI-driven drug development and digital healthcare advisory
- Kakao Healthcare: medical data standardization and platform support
We also thank the following participating institutions for their contributions: KAIST (Yoonjae Choi, Taekyun Kim, Jong Chul Ye, Hyunwoo Kim, Seunghoon Hong), Seoul National University (Yousung Jung), Rebellions, Standigm, NHIS Ilsan Hospital, Yongin Severance Hospital, Gangdong Kyung Hee University Hospital, Kyung Hee University Medical Center, Korea University, Konyang University Hospital, Ewha Womans University Seoul Hospital, Keimyung University Dongsan Medical Center, Pusan National University Yangsan Hospital, and D-Circle.
This work was supported by the AI Specialized Foundation Model Project, funded by the Ministry of Science and ICT (MSIT) and managed by the National IT Industry Promotion Agency (NIPA).
## License
This model is released under the Apache 2.0 License.
## Citation

```bibtex
@misc{gravity-preview-2026,
  title={Gravity-16B-A3B-Preview},
  author={Trillion Labs},
  year={2026},
  url={https://huggingface.co/trillionlabs/Gravity-16B-A3B-Preview}
}
```
## Contact

- Trillion Labs: trillionlabs.co
- Lunit: lunit.io