SWE-Gym Qwen2.5-Coder-7B-Instruct LoRA (64K Context)

A LoRA adapter for Qwen2.5-Coder-7B-Instruct, fine-tuned on software engineering agent trajectories distilled from Qwen3-Coder-480B-A35B-Instruct on the SWE-Gym dataset.

Model Details

| Property | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-Coder-7B-Instruct |
| Fine-tuning Method | LoRA (Low-Rank Adaptation) |
| LoRA Rank (r) | 8 |
| LoRA Alpha | 16 |
| LoRA Dropout | 0.0 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Max Context Length | 64K tokens |
| Precision | bfloat16 |
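
To give a feel for how small the adapter is, the sketch below estimates the number of trainable LoRA parameters from the rank and target modules listed above. The layer dimensions are assumptions that are not stated in this card; check them against the base model's config.json. The scaling factor assumes standard LoRA scaling (alpha / r), not a rank-stabilized variant.

```python
# Assumed base-model dimensions (NOT stated in this card); verify against
# the config.json of Qwen/Qwen2.5-Coder-7B-Instruct before relying on them.
HIDDEN = 3584
INTERMEDIATE = 18944
NUM_LAYERS = 28
KV_DIM = 512  # assumed: 4 key/value heads x 128 head dimension

RANK = 8    # from the card
ALPHA = 16  # from the card

# (in_features, out_features) of each targeted projection in one layer.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """Parameters in one LoRA pair: A is (rank x in), B is (out x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGET_SHAPES.values())
total_trainable = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # multiplier applied to the low-rank update

print(f"scaling factor: {scaling}")
print(f"trainable LoRA parameters: {total_trainable:,}")
```

Under these assumed dimensions the adapter has roughly 20 million trainable parameters, a small fraction of the 7B-parameter base model.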

Training Details

| Property | Value |
|---|---|
| Training Data | 634 resolved SWE-Gym instances |
| Teacher Model | Qwen3-Coder-480B-A35B-Instruct |
| Agent Framework | OpenHands CodeActAgent |
| Epochs | 3 |
| Total Steps | 60 |
| Batch Size | 1 per device × 8 GPUs × 4 grad accum = 32 effective |
| Learning Rate | 1e-4 (cosine schedule, 10% warmup) |
| Optimizer | AdamW (β1=0.9, β2=0.999, ε=1e-8) |
| Final Training Loss | 0.379 |
| Training Runtime | ~2.5 hours |
| Framework | LLaMA-Factory + DeepSpeed |
| PEFT Version | 0.18.1 |
| Transformers | 5.2.0 |
| PyTorch | 2.6.0 |
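
The batch size, epoch count, and step count above fit together, as the quick check below illustrates. It assumes that all 634 trajectories were used for training (no held-out split) and that a final partial batch still counts as one optimizer step; neither detail is stated in this card.

```python
import math

num_examples = 634    # resolved trajectories (from the card)
per_device_batch = 1  # from the card
num_gpus = 8          # from the card
grad_accum = 4        # from the card
epochs = 3            # from the card

effective_batch = per_device_batch * num_gpus * grad_accum

# Assumption: a trailing partial batch still produces an optimizer step.
steps_per_epoch = math.ceil(num_examples / effective_batch)
total_steps = steps_per_epoch * epochs

# 10% warmup, computed with integer arithmetic to avoid float rounding.
warmup_steps = total_steps * 10 // 100

print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
```

With these inputs the check gives 20 optimizer steps per epoch and 60 in total, matching the reported step count, with 6 of those steps spent in warmup.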

Training Data

The training data consists of 634 resolved instances from the SWE-Gym training set. Trajectories were generated by running Qwen3-Coder-480B-A35B-Instruct (via OpenHands CodeActAgent with maxiter=100) on SWE-Gym tasks, and only resolved (successful) trajectories were kept. Function-calling messages were converted to non-function-calling format for SFT, and trajectories exceeding 64K tokens were excluded.
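
The two filters described above (keep resolved trajectories, drop those over the context limit) can be sketched as follows. This is an illustration only, not the pipeline actually used: the `Trajectory` record, its field names, and the sample data are hypothetical, and "64K" is read here as 65,536 tokens, which is an assumption.

```python
from dataclasses import dataclass

MAX_TOKENS = 64 * 1024  # assumption: "64K" interpreted as 65,536 tokens


@dataclass
class Trajectory:
    """Hypothetical record for one agent run; field names are illustrative."""
    instance_id: str
    resolved: bool    # True if the agent's patch resolved the task
    num_tokens: int   # token count of the full conversation


def keep_for_sft(trajectories, max_tokens=MAX_TOKENS):
    """Keep only resolved trajectories that fit within the token limit."""
    return [
        t for t in trajectories
        if t.resolved and t.num_tokens <= max_tokens
    ]


# Made-up sample data, purely for demonstration.
sample = [
    Trajectory("example-1", resolved=True, num_tokens=30_000),
    Trajectory("example-2", resolved=False, num_tokens=12_000),
    Trajectory("example-3", resolved=True, num_tokens=90_000),
]
kept = keep_for_sft(sample)
print([t.instance_id for t in kept])
```

Only the first sample record passes both filters: the second is unresolved and the third exceeds the assumed token limit.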

Training Curve

| Step | Epoch | Loss | Learning Rate |
|---|---|---|---|
| 5 | 0.25 | 0.526 | 6.67e-05 |
| 10 | 0.50 | 0.497 | 9.92e-05 |
| 20 | 1.00 | 0.444 | 8.64e-05 |
| 30 | 1.50 | 0.402 | 6.15e-05 |
| 40 | 2.00 | 0.389 | 3.29e-05 |
| 50 | 2.50 | 0.370 | 9.89e-06 |
| 60 | 3.00 | 0.379 | 8.46e-08 |
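
The logged learning rates are consistent with linear warmup followed by cosine decay to zero. The sketch below reproduces them under two assumptions inferred from the numbers rather than stated in this card: warmup lasts 6 steps (10% of 60), and the value logged at step s is the rate at scheduler index s - 1.

```python
import math

PEAK_LR = 1e-4     # from the card
TOTAL_STEPS = 60   # from the card
WARMUP_STEPS = 6   # assumption: 10% of 60 steps


def lr_at(index, peak=PEAK_LR, total=TOTAL_STEPS, warmup=WARMUP_STEPS):
    """Learning rate at a 0-based scheduler index."""
    if index < warmup:
        # Linear warmup from 0 up to the peak rate.
        return peak * index / warmup
    # Cosine decay from the peak rate down to 0 over the remaining steps.
    progress = (index - warmup) / (total - warmup)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))


# Logged step -> learning rate reported in the training curve.
logged = {5: 6.67e-05, 10: 9.92e-05, 20: 8.64e-05, 30: 6.15e-05,
          40: 3.29e-05, 50: 9.89e-06, 60: 8.46e-08}

for step, reported in logged.items():
    # Assumption: the logged value corresponds to scheduler index step - 1.
    print(f"step {step:2d}: computed {lr_at(step - 1):.2e}, reported {reported:.2e}")
```

Under these assumptions the computed values agree with the table to the three significant digits shown.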

Usage

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

adapter_id = "MMR115/swegym-qwen2.5-coder-7b-instruct-lora-64k"

# Load the base model; dtype and device placement are chosen automatically.
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct",
    torch_dtype="auto",
    device_map="auto",
)
# Attach the LoRA adapter weights on top of the base model.
model = PeftModel.from_pretrained(base_model, adapter_id)
# Load the tokenizer from the adapter repository.
tokenizer = AutoTokenizer.from_pretrained(adapter_id)

Citation

If you use this model, please cite SWE-Gym and OpenHands:

@article{pan2024swegym,
  title={Training Software Engineering Agents and Verifiers with SWE-Gym},
  author={Pan, Jiayi and Wang, Xingyao and Neubig, Graham and Jaitly, Navdeep and Ji, Heng and Suhr, Alane and Zhang, Yizhe},
  journal={ICML},
  year={2025}
}

Tokenizers 0.22.2