TÜLU 3: Pushing Frontiers in Open Language Model Post-Training
Paper: arXiv:2411.15124
This quantized model was created using AutoAWQ version 0.2.8 with the following `quant_config`:

```python
quant_config = {
    "zero_point": True,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM",
}
```
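For reference, a quantization run with this config can be sketched as follows. This is a minimal illustration, not the exact script used to produce this checkpoint: `quantize_with_awq` is a hypothetical wrapper name, the calls reflect the AutoAWQ 0.2.x API (`from_pretrained`, `quantize`, `save_quantized`), and actually running it requires a GPU and the full-precision weights.

```python
from typing import Any

# quant_config mirroring the settings listed in this card
quant_config: dict[str, Any] = {
    "zero_point": True,
    "q_group_size": 128,
    "w_bit": 4,
    "version": "GEMM",
}

def quantize_with_awq(model_path: str, quant_path: str) -> None:
    """Quantize a model with AutoAWQ and save the result.

    Hypothetical helper for illustration; requires a GPU and the
    full-precision weights, so heavy imports are deferred.
    """
    from awq import AutoAWQForCausalLM
    from transformers import AutoTokenizer

    model = AutoAWQForCausalLM.from_pretrained(model_path)
    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model.quantize(tokenizer, quant_config=quant_config)
    model.save_quantized(quant_path)
    tokenizer.save_pretrained(quant_path)
```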
Tülu 3 is a leading instruction-following model family, offering fully open-source data, code, and recipes designed to serve as a comprehensive guide for modern post-training techniques. Tülu 3 is designed for state-of-the-art performance on a diverse range of tasks beyond chat, such as MATH, GSM8K, and IFEval.
| Stage | Llama 3.1 8B | Llama 3.1 70B |
|---|---|---|
| Base Model | meta-llama/Llama-3.1-8B | meta-llama/Llama-3.1-70B |
| SFT | allenai/Llama-3.1-Tulu-3-8B-SFT | allenai/Llama-3.1-Tulu-3-70B-SFT |
| DPO | allenai/Llama-3.1-Tulu-3-8B-DPO | allenai/Llama-3.1-Tulu-3-70B-DPO |
| Final Models (RLVR) | allenai/Llama-3.1-Tulu-3-8B | allenai/Llama-3.1-Tulu-3-70B |
| Reward Model (RM) | allenai/Llama-3.1-Tulu-3-8B-RM | (Same as 8B) |
| Stage | Llama 3.1 405B |
|---|---|
| Base Model | meta-llama/Llama-3.1-405B |
| SFT | allenai/Llama-3.1-Tulu-3-405B-SFT |
| DPO | allenai/Llama-3.1-Tulu-3-405B-DPO |
| Final Model (RLVR) | allenai/Llama-3.1-Tulu-3-405B |
| Reward Model (RM) | (Same as 8B) |
To load the model with HuggingFace Transformers, use the following snippet:

```python
from transformers import AutoModelForCausalLM

tulu_model = AutoModelForCausalLM.from_pretrained("allenai/Llama-3.1-Tulu-3-70B")
```
Since this is a Llama-based model, it can be easily served with vLLM:

```shell
vllm serve allenai/Llama-3.1-Tulu-3-70B
```

Note that given the long chat template of Llama, you may want to use `--max_model_len=8192`.
The chat template for our models is formatted as:

```
<|user|>\nHow are you doing?\n<|assistant|>\nI'm just a computer progr
```
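The template above can be illustrated with a small formatting helper. `format_tulu_chat` is a hypothetical name used only for this sketch; in practice, the authoritative template ships with the tokenizer and should be applied via `tokenizer.apply_chat_template`.

```python
def format_tulu_chat(messages, add_generation_prompt=True):
    """Render a list of {"role": ..., "content": ...} dicts into the
    chat format shown above. Illustrative only; prefer the tokenizer's
    apply_chat_template for the exact template."""
    prompt = ""
    for message in messages:
        prompt += f"<|{message['role']}|>\n{message['content']}\n"
    if add_generation_prompt:
        # Trailing assistant marker cues the model to generate its reply.
        prompt += "<|assistant|>\n"
    return prompt

print(format_tulu_chat([{"role": "user", "content": "How are you doing?"}]))
```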
Base model: meta-llama/Llama-3.1-70B