Gravity-16B-A3B-Preview

Gravity-16B-A3B-Preview is a post-trained language model built on Gravity-16B-A3B-Base by Trillion Labs. Starting from the base model, it underwent context length extension (32K → 128K), supervised fine-tuning (SFT), and reinforcement learning (GRPO) focused on science and code.

This is a preview release offering a strong balance of capability, efficiency, and long-context support for its size. We are actively working on agentic capabilities for the full release.

Model Summary

| Property | Value |
|---|---|
| Base Model | Gravity-16B-A3B-Base |
| Total Parameters | 16.24B |
| Active Parameters | 3.16B |
| Architecture | GravityMoE |
| Context Length | 131,072 tokens (128K) |
| Precision | bf16 |
| License | Apache 2.0 |

For full architectural details (MLA, MoE routing, tokenizer, etc.), see the base model card.
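For a rough sense of deployment cost, the summary table implies the following back-of-envelope figures (illustrative estimates derived from the parameter counts above, not official measurements):

```python
# Back-of-envelope estimates from the summary table above.
# Illustrative only; actual serving memory also includes KV cache,
# activations, and framework overhead.

TOTAL_PARAMS = 16.24e9   # total parameters
ACTIVE_PARAMS = 3.16e9   # parameters active per token (MoE routing)
BYTES_PER_PARAM = 2      # bf16 = 2 bytes per parameter

weight_gb = TOTAL_PARAMS * BYTES_PER_PARAM / 1e9
active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS

print(f"approx. bf16 weight size: {weight_gb:.1f} GB")      # ~32.5 GB
print(f"active parameter fraction: {active_fraction:.1%}")  # ~19.5%
```

So each forward pass touches roughly a fifth of the weights, which is where the model's efficiency for its size comes from.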

Post-Training Pipeline

Starting from Gravity-16B-A3B-Base (pretrained on ~5.5T tokens):

  1. Context Length Extension: extended from 32K to 128K tokens.
  2. Supervised Fine-Tuning (SFT): instruction tuning for general chat and task-following capabilities.
  3. Reinforcement Learning (GRPO): single-step Group Relative Policy Optimization focused on science and code domains.
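The core idea of GRPO in step 3 is that each sampled completion's reward is standardized against the group of completions drawn for the same prompt, replacing a learned value baseline. A minimal sketch of the advantage computation (hypothetical rewards; not the actual training code):

```python
import statistics

def grpo_advantages(rewards):
    """Group Relative Policy Optimization advantage: each completion's
    reward is mean-centered and scaled by the standard deviation of the
    group of completions sampled for the same prompt."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards) or 1.0  # guard against zero-variance groups
    return [(r - mean) / std for r in rewards]

# Hypothetical verifier rewards for 4 completions of one prompt:
advs = grpo_advantages([1.0, 0.0, 0.0, 1.0])
# correct completions get positive advantage, incorrect ones negative
```

Completions that beat their group average are reinforced; the policy gradient then weights each token's log-probability by this advantage.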

Agentic RL and multi-turn RL stages are in progress and will be included in future releases.

Evaluation Results

| Category | Benchmark | Metric | Score |
|---|---|---|---|
| Math | AIME 2024 | acc | 43.3 |
| | GSM8K | acc | 91.8 |
| | MATH500 | acc | 88.6 |
| Code | HumanEval | pass@1 | 89.0 |
| | MBPP | pass@1 | 96.0 |
| | LiveCodeBench V6 | pass@1 | 41.0 |
| Knowledge | MMLU | acc | 80.1 |
| | MMLU-Pro | acc | 71.5 |
| | BBH | acc | 79.24 |
| Science | GPQA Diamond | acc | 55.1 |
| | ARC Challenge | acc | 92.32 |
| | ChemBench | acc | 68.6 |
| | Molang Bench (Editing) | SMILES validity / Tanimoto similarity / accuracy | 70.83 / 86.43 / 43.23 |
| | Molang Bench (Generation) | SMILES validity / Tanimoto similarity / accuracy | 35.96 / 43.24 / 1.69 |
| Instruction Following | IFEval | instruct-level loose | 84.53 |
| | IFBench | instruct-level loose | 46.51 |
| Agentic | Tau^2 (Telecom) | pass@1 | 71.93 |
| | SciCode | sub-problem level | 18.8 |
| | Terminal Bench | pass@1 | 21.25 |
| Long Context | AA-LCR | pass@1 | 21.0 |
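The Molang Bench rows above report SMILES validity alongside Tanimoto similarity. Tanimoto similarity over molecular fingerprints is simply intersection over union of the set bits; a minimal sketch on plain bit-index sets (independent of any cheminformatics library):

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets:
    |intersection| / |union|. Returns 0.0 for two empty fingerprints."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical fingerprints, given as indices of set bits:
sim = tanimoto({1, 2, 3, 4}, {2, 3, 4, 5})  # 3 shared bits / 5 total = 0.6
```

In practice the fingerprints would come from a cheminformatics toolkit applied to the generated and reference SMILES strings; the score only counts molecules whose SMILES parse as valid.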

Comparison with Moonlight-16B-A3B-Instruct

| Category | Benchmark | Metric | Gravity-16B-A3B-Preview | Moonlight-16B-A3B-Instruct |
|---|---|---|---|---|
| Math | GSM8K | acc | 91.8 | 77.4 |
| Code | HumanEval | pass@1 | 89.0 | 48.1 |
| | MBPP | pass@1 | 96.0 | 63.8 |
| Knowledge | MMLU | acc | 80.1 | 70.0 |
| | MMLU-Pro | acc | 71.5 | 42.4 |
| | BBH | acc | 79.24 | 65.2 |

Note: We include Moonlight-16B-A3B-Instruct for comparison because it is similar in size to our model; its scores are taken from its technical report.

With 3.16B active parameters, 128K context, and broad coverage across math, code, and knowledge benchmarks, the model offers a strong balance of capability and efficiency for its size.

Agentic benchmarks (multi-step tool use, code execution) are not yet a focus of this release. We are actively training on agentic tasks and will include those results in the next release.

Quickstart

Installation

pip install "transformers>=5.0" torch

Using Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "trillionlabs/Gravity-16B-A3B-Preview"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    trust_remote_code=True,
    dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Solve the equation: x^3 - 6x^2 + 11x - 6 = 0"},
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=1024, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))

Deployment

Note: We are working on upstreaming native GravityMoE support to SGLang. Until the PR is merged, please use the installation steps below.

SGLang

Install SGLang from the sglang-gravity fork:

pip install "sglang[all] @ git+https://github.com/trillion-labs/sglang-gravity.git#subdirectory=python"

Launch the server:

python3 -m sglang.launch_server \
    --model-path trillionlabs/Gravity-16B-A3B-Preview \
    --host 0.0.0.0 \
    --port 30000 \
    --tp 8 \
    --trust-remote-code \
    --moe-runner-backend triton \
    --tool-call-parser glm45 \
    --reasoning-parser glm45 \
    --dtype bfloat16

Send a request:

curl http://localhost:30000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "trillionlabs/Gravity-16B-A3B-Preview",
    "messages": [{"role": "user", "content": "What is the capital of South Korea?"}],
    "max_tokens": 128,
    "temperature": 0.7
  }'
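The server exposes an OpenAI-compatible endpoint, so the same request can be issued from Python with only the standard library (a sketch assuming the server is running locally on port 30000, as launched above):

```python
import json
import urllib.request

def build_payload(content, max_tokens=128, temperature=0.7):
    """Mirror the curl request above as an OpenAI-compatible chat payload."""
    return {
        "model": "trillionlabs/Gravity-16B-A3B-Preview",
        "messages": [{"role": "user", "content": content}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def chat_request(content, base_url="http://localhost:30000"):
    """POST to /v1/chat/completions and return the assistant message text."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(content)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# With the server running:
#   print(chat_request("What is the capital of South Korea?"))
```

Any OpenAI-compatible client library pointed at the same base URL should work equivalently.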

Limitations

  • This is a preview release. Agentic and multi-turn capabilities are under active development.
  • The model may generate factually incorrect, biased, or harmful content.
  • Performance may degrade on languages not well-represented in the training data.

Acknowledgements

This model was developed as part of a collaborative research initiative led by Lunit and Trillion Labs, with a focus on advancing foundation models for science and healthcare.

  • Lunit – Project lead and medical AI research
  • Trillion Labs – Model architecture, pretraining, and infrastructure
  • Aigen Science – Biomedical AI and drug discovery research
  • SK Biopharmaceuticals – AI-driven drug development and digital healthcare advisory
  • Kakao Healthcare – Medical data standardization and platform support

We also thank the following participating institutions for their contributions: KAIST (Yoonjae Choi, Taekyun Kim, Jong Chul Ye, Hyunwoo Kim, Seunghoon Hong), Seoul National University (Yousung Jung), Rebellions, Standigm, NHIS Ilsan Hospital, Yongin Severance Hospital, Gangdong Kyung Hee University Hospital, Kyung Hee University Medical Center, Korea University, Konyang University Hospital, Ewha Womans University Seoul Hospital, Keimyung University Dongsan Medical Center, Pusan National University Yangsan Hospital, and D-Circle.

This work was supported by the AI Specialized Foundation Model Project, funded by the Ministry of Science and ICT (MSIT) and managed by the National IT Industry Promotion Agency (NIPA).

License

This model is released under the Apache 2.0 License.

Citation

@misc{gravity-preview-2026,
    title={Gravity-16B-A3B-Preview},
    author={Trillion Labs},
    year={2026},
    url={https://huggingface.co/trillionlabs/Gravity-16B-A3B-Preview}
}
