LIMO-Qwen3-8B-Math-Merged

This is Qwen3-8B fine-tuned on the LIMO dataset for mathematical reasoning, with the LoRA adapter merged into the base weights.

Model Details

Base Model: Qwen3-8B
Training Method: LoRA fine-tuning with Unsloth (adapter since merged into the base weights)
Dataset: GAIR/LIMO (817 high-quality samples)
Model Type: Full merged model (not adapter)

Key Features

✅ Full merged model - no need to load base model + adapter
✅ Ready for inference - works with standard transformers
✅ Compatible with benchmarking tools - works with lighteval
✅ Mathematical reasoning optimized - trained on LIMO dataset

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    "Cbgcbg/limo-qwen3-8b-math-merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(
    "Cbgcbg/limo-qwen3-8b-math-merged",
    trust_remote_code=True
)

# Format input
messages = [
    {"role": "system", "content": "Please reason step by step, and put your final answer within \\boxed{}."},
    {"role": "user", "content": "What is the sum of the first 10 positive integers?"}
]

text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

# Generate
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,  # increase if the reasoning is cut off before the final answer
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][len(inputs.input_ids[0]):], skip_special_tokens=True)
print(response)
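
Because the system prompt asks for the final answer inside \boxed{}, it is often useful to pull that answer out of the response. The helper below is a minimal sketch and not part of the model or of transformers: extract_boxed is a hypothetical name, and it assumes the last \boxed{...} in the text is the final answer.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    """Return the contents of the last \\boxed{...} in text, or None."""
    marker = "\\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None
    content_start = start + len(marker)
    depth = 1  # we are inside the opening brace of \boxed{
    for i in range(content_start, len(text)):
        if text[i] == "{":
            depth += 1
        elif text[i] == "}":
            depth -= 1
            if depth == 0:
                return text[content_start:i]
    return None  # unbalanced braces, e.g. the generation was cut off


sample = "The sum is 10 * 11 / 2 = 55, so the answer is \\boxed{55}."
print(extract_boxed(sample))  # 55
```

Counting braces rather than using a simple regular expression keeps nested LaTeX such as \boxed{\frac{1}{2}} intact.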

Training Configuration

Batch Size: 8
Learning Rate: 2e-5
Epochs: 3
LoRA Rank: 16
LoRA Alpha: 32
LoRA Dropout: 0.1

Performance

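Merging folds the LoRA update into the base weights, so this model should produce the same outputs as the base model plus adapter, up to numerical precision. It loads as an ordinary transformers checkpoint, which makes it easier to use with tools that do not handle adapters.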
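
To illustrate why merging should not change the model's outputs, here is a toy sketch in pure Python. It assumes the standard LoRA scaling of alpha / rank (2.0 for the rank 16 and alpha 32 listed under Training Configuration); the matrix sizes and random values are made up for illustration and have nothing to do with the real weights.

```python
import random

LORA_RANK = 16   # from the training configuration
LORA_ALPHA = 32  # from the training configuration
SCALE = LORA_ALPHA / LORA_RANK  # assumed standard LoRA scaling

D_IN, D_OUT = 24, 20  # toy layer sizes, chosen only for illustration


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def matvec(m, v):
    return [sum(x * y for x, y in zip(row, v)) for row in m]


rng = random.Random(0)
W = rand_matrix(D_OUT, D_IN, rng)       # frozen base weight
A = rand_matrix(LORA_RANK, D_IN, rng)   # LoRA down-projection
B = rand_matrix(D_OUT, LORA_RANK, rng)  # LoRA up-projection
x = [rng.uniform(-1.0, 1.0) for _ in range(D_IN)]

# Adapter path: base output plus the scaled low-rank update.
adapter_out = [
    base + SCALE * delta
    for base, delta in zip(matvec(W, x), matvec(B, matvec(A, x)))
]

# Merged path: fold the update into the weight once, then a single product.
BA = matmul(B, A)
W_merged = [
    [w + SCALE * d for w, d in zip(w_row, d_row)]
    for w_row, d_row in zip(W, BA)
]
merged_out = matvec(W_merged, x)

max_diff = max(abs(a - m) for a, m in zip(adapter_out, merged_out))
print(f"scale = {SCALE}, max |adapter - merged| = {max_diff:.2e}")
```

The two paths differ only by floating-point rounding. LoRA dropout is applied during training only, so it plays no role in the merged weights.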

Comparison with LoRA Version

Feature	LoRA Adapter	Merged Model
Model Size	~100MB	~16GB
Loading	Requires base model	Direct loading
Compatibility	Limited	Full transformers support
Benchmarking	Needs special handling	Works with lighteval
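
The size gap in the table follows from simple arithmetic, sketched below. The figures are rough estimates under stated assumptions: a nominal 8 billion parameters stored in bfloat16 for the merged model, and a hypothetical 4096 x 4096 layer for the adapter example. The model card does not say which layers the adapter targeted, so the sketch does not attempt to reproduce the adapter's total size.

```python
def weights_size_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate storage in GB (10^9 bytes) for n_params weights."""
    return n_params * bytes_per_param / 1e9


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters LoRA adds to one d_out x d_in linear layer (matrices A and B)."""
    return rank * (d_in + d_out)


# Merged model: nominal 8e9 parameters at 2 bytes each (bfloat16).
print(weights_size_gb(8e9))  # 16.0

# One adapted layer with an assumed 4096 x 4096 shape at rank 16.
print(lora_params(4096, 4096, rank=16))  # 131072
```

A full layer of that assumed shape holds 4096 * 4096 = 16,777,216 weights, so the rank-16 update is under 1% of it, which is why an adapter stays small while the merged checkpoint carries every base weight.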

Citation

@misc{ye2025limoreasoning,
    title={LIMO: Less is More for Reasoning},
    author={Yixin Ye and Zhen Huang and Yang Xiao and Ethan Chern and Shijie Xia and Pengfei Liu},
    year={2025},
    eprint={2502.03387},
    archivePrefix={arXiv},
    primaryClass={cs.CL},
    url={https://arxiv.org/abs/2502.03387},
}