# SWE-Gym Qwen2.5-Coder-32B-Instruct Full SFT (64K Context)

A full-parameter supervised fine-tune of Qwen2.5-Coder-32B-Instruct, trained on software-engineering agent trajectories distilled from Qwen3-Coder-480B-A35B-Instruct on the SWE-Gym dataset.

## Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-Coder-32B-Instruct |
| Fine-tuning Method | Full-parameter SFT |
| Parameters | ~32.8B |
| Architecture | Qwen2ForCausalLM |
| Hidden Size | 5120 |
| Num Layers | 64 |
| Num Attention Heads | 40 (8 KV heads, GQA) |
| Intermediate Size | 27648 |
| Max Context Length | 64K tokens |
| Precision | bfloat16 |
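The ~32.8B figure can be sanity-checked from the architecture numbers in the table. A minimal sketch, assuming the Qwen2.5 family's vocabulary size of 152,064 and untied input/output embeddings (both are assumptions; neither is stated in this card, and small bias terms are ignored):

```python
# Rough parameter count from the architecture table above.
# Vocab size (152064) and untied embeddings are assumed from the
# Qwen2.5 family; biases are omitted as negligible.
hidden = 5120
layers = 64
heads, kv_heads = 40, 8
inter = 27648
vocab = 152064  # assumption, not stated in the card

head_dim = hidden // heads                     # 128
attn = 2 * hidden * hidden                     # q_proj + o_proj
attn += 2 * hidden * (kv_heads * head_dim)     # k_proj + v_proj (GQA)
mlp = 3 * hidden * inter                       # gate, up, down projections
embed = 2 * vocab * hidden                     # input embeddings + lm_head

total = layers * (attn + mlp) + embed
print(f"{total / 1e9:.1f}B parameters")        # ≈ 32.8B, matching the table
```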

## Training Details

| Property | Value |
|---|---|
| Training Data | 634 resolved SWE-Gym instances |
| Teacher Model | Qwen3-Coder-480B-A35B-Instruct |
| Agent Framework | OpenHands CodeActAgent |
| Epochs | 3 |
| Total Steps | 60 |
| Batch Size | 1 per device × 8 GPUs × 4 grad accum = 32 effective |
| Learning Rate | 1e-5 (cosine schedule, 10% warmup) |
| Optimizer | AdamW (β1=0.9, β2=0.999, ε=1e-8) |
| Final Training Loss | 0.269 |
| Training Runtime | ~6.0 hours |
| Framework | LLaMA-Factory + DeepSpeed |
| Transformers | 5.2.0 |
| PyTorch | 2.6.0 |

## Training Data

The training data consists of 634 resolved instances from the SWE-Gym training set. Trajectories were generated by running Qwen3-Coder-480B-A35B-Instruct (via OpenHands CodeActAgent with maxiter=100) on SWE-Gym tasks, then keeping only resolved (successful) trajectories. Function-calling messages were converted to non-function-calling format for SFT, and trajectories exceeding 64K tokens were excluded.
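The selection step above (keep resolved, drop over-length) can be sketched as follows. The trajectory fields (`resolved`, `token_count`, `messages`) are hypothetical names for illustration, not the actual SWE-Gym schema, and a real pipeline would measure length with the model's tokenizer rather than a precomputed count:

```python
MAX_TOKENS = 64_000  # 64K context cap (approximate)

def select_for_sft(trajectories):
    """Keep only resolved trajectories that fit the context window.

    Each trajectory is assumed to be a dict with hypothetical fields:
      resolved: bool     - did the agent's patch pass evaluation?
      token_count: int   - tokenized length of the full conversation
      messages: list     - the agent/environment turns
    """
    kept = []
    for traj in trajectories:
        if not traj["resolved"]:
            continue  # drop failed attempts
        if traj["token_count"] > MAX_TOKENS:
            continue  # drop over-length trajectories
        kept.append(traj["messages"])
    return kept

sample = [
    {"resolved": True,  "token_count": 12_000, "messages": ["..."]},
    {"resolved": False, "token_count": 8_000,  "messages": ["..."]},
    {"resolved": True,  "token_count": 90_000, "messages": ["..."]},
]
print(len(select_for_sft(sample)))  # 1: only the first passes both filters
```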

## Training Curve

| Step | Epoch | Loss | Learning Rate |
|---|---|---|---|
| 5 | 0.25 | 0.489 | 6.67e-06 |
| 10 | 0.50 | 0.369 | 9.92e-06 |
| 15 | 0.75 | 0.317 | 9.47e-06 |
| 20 | 1.00 | 0.280 | 8.64e-06 |
| 25 | 1.25 | 0.255 | 7.50e-06 |
| 30 | 1.50 | 0.234 | 6.15e-06 |
| 35 | 1.75 | 0.231 | 4.71e-06 |
| 40 | 2.00 | 0.226 | 3.29e-06 |
| 45 | 2.25 | 0.208 | 2.01e-06 |
| 50 | 2.50 | 0.202 | 9.89e-07 |
| 55 | 2.75 | 0.211 | 3.02e-07 |
| 60 | 3.00 | 0.206 | 8.46e-09 |
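The learning-rate column follows the stated schedule: linear warmup over the first 10% of steps, then cosine decay to ~0. A minimal sketch of that shape (peak 1e-5, 60 total steps); the exact logged values depend on the trainer's logging convention, so small offsets against the table are expected:

```python
import math

PEAK_LR = 1e-5
TOTAL_STEPS = 60
WARMUP_STEPS = int(0.10 * TOTAL_STEPS)  # 6 steps

def lr_at(step: int) -> float:
    """Cosine schedule with linear warmup (the common HF-style shape)."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS  # linear ramp from 0
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1 + math.cos(math.pi * progress))

for s in (5, 10, 30, 60):
    print(s, f"{lr_at(s):.2e}")
```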

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "MMR115/swegym-qwen2.5-coder-32b-instruct-sft-64k",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("MMR115/swegym-qwen2.5-coder-32b-instruct-sft-64k")
```
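For inference, prompts should follow the base model's ChatML conversation format; `tokenizer.apply_chat_template(..., add_generation_prompt=True)` produces this automatically. A minimal manual sketch of the format, assuming fine-tuning kept the Qwen2.5 chat template unchanged (the `build_chatml_prompt` helper is hypothetical, for illustration only):

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a Qwen2.5-style ChatML prompt by hand.

    Mirrors what tokenizer.apply_chat_template(...,
    add_generation_prompt=True) would return, assuming the base
    model's template survived fine-tuning unchanged.
    """
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt(
    "You are a software engineering agent.",
    "Fix the failing test in utils/parse.py.",
)
print(prompt)
```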

## Citation

If you use this model, please cite SWE-Gym and OpenHands:

```bibtex
@inproceedings{pan2025swegym,
  title={Training Software Engineering Agents and Verifiers with SWE-Gym},
  author={Pan, Jiayi and Wang, Xingyao and Neubig, Graham and Jaitly, Navdeep and Ji, Heng and Suhr, Alane and Zhang, Yizhe},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2025}
}
```