# Qwen2.5 7B – Evol-Instruct Code (LoRA)
Fine-tuned by Alok Kumar Dubey | GitHub | LinkedIn | HuggingFace
A LoRA adapter fine-tuned on top of Qwen2.5-7B-Instruct using the Evol-Instruct-Code-80k dataset – the same dataset used to train WizardCoder. The model handles complex coding tasks including algorithm design, debugging, time/space complexity optimization, and multi-language code generation.
## Quick Start
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="alok21in/qwen2.5-7b-evol-code",
    max_seq_length=1024,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

prompt = """Below is a coding instruction. Write a response that solves the task.

### Instruction:
Write a Python function to find all prime numbers up to n using the Sieve of Eratosthenes.

### Response:"""

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=300, pad_token_id=tokenizer.eos_token_id)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response.split("### Response:")[-1].strip())
```
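For reference, a typical Sieve of Eratosthenes solution like the one the prompt above asks for might look as follows (a hand-written sketch for comparison, not actual model output):

```python
def sieve_of_eratosthenes(n: int) -> list[int]:
    """Return all prime numbers up to and including n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            # Mark every multiple of p, starting at p*p, as composite
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [i for i, prime in enumerate(is_prime) if prime]

print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```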
## Example Output
Input:
Write a function to check if two strings are anagrams. Optimize for time complexity.
Output:
```python
def are_anagrams(s1: str, s2: str) -> bool:
    if len(s1) != len(s2):
        return False
    char_count = {}
    for char in s1:
        char_count[char] = char_count.get(char, 0) + 1
    for char in s2:
        if char not in char_count or char_count[char] == 0:
            return False
        char_count[char] -= 1
    return True

# Time complexity: O(n), Space complexity: O(k) where k = unique chars
```
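An equivalent formulation using `collections.Counter` trades the manual frequency dictionary for the standard library at the same O(n) time complexity (my own comparison for illustration, not part of the model output):

```python
from collections import Counter

def are_anagrams(s1: str, s2: str) -> bool:
    # Counter builds a character-frequency map for each string; equality
    # holds exactly when both strings contain the same multiset of chars.
    return Counter(s1) == Counter(s2)

print(are_anagrams("listen", "silent"))  # True
print(are_anagrams("hello", "world"))    # False
```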
## Training Details
| Parameter | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Dataset | nickrosh/Evol-Instruct-Code-80k-v1 |
| Training Samples | 5,000 of 80,000 |
| LoRA Rank (r) | 8 |
| LoRA Alpha | 16 |
| Target Modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantization | 4-bit (QLoRA) |
| Max Seq Length | 1024 |
| Batch Size | 1 (effective 8 with grad accumulation) |
| Learning Rate | 2e-4 |
| Steps | 60 |
| Optimizer | adamw_8bit |
| Hardware | Kaggle T4 GPU (15GB) |
| Framework | Unsloth + TRL |
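The hyperparameters above imply a small trainable footprint. As a back-of-the-envelope check (assuming the published Qwen2.5-7B shapes: hidden size 3584, intermediate size 18944, 4 key-value heads with head dim 128, 28 layers), each LoRA-adapted projection of shape d_in × d_out adds r × (d_in + d_out) parameters:

```python
# Rough LoRA trainable-parameter count for the config in the table.
# The layer shapes are assumptions based on the published Qwen2.5-7B config.
r = 8
hidden, intermediate = 3584, 18944
kv_dim = 4 * 128          # 4 key-value heads x head_dim 128
num_layers = 28

# (d_in, d_out) for each target module
modules = {
    "q_proj": (hidden, hidden),
    "k_proj": (hidden, kv_dim),
    "v_proj": (hidden, kv_dim),
    "o_proj": (hidden, hidden),
    "gate_proj": (hidden, intermediate),
    "up_proj": (hidden, intermediate),
    "down_proj": (intermediate, hidden),
}

# LoRA adds two low-rank factors, A (r x d_in) and B (d_out x r), per module
per_layer = sum(r * (d_in + d_out) for d_in, d_out in modules.values())
total = per_layer * num_layers
print(f"{total:,} trainable parameters")  # 20,185,088 -- well under 1% of 7B
```

This is why a 7B model fits on a single 15GB T4: only ~20M adapter weights are trained, while the 4-bit base weights stay frozen.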
## About the Dataset
nickrosh/Evol-Instruct-Code-80k-v1 is the open-source implementation of the dataset behind WizardCoder,
one of the top-performing code generation models. It contains 80k instruction-response pairs evolved from
Code Alpaca through 3 rounds of complexity increases – covering algorithm design, debugging, optimization,
and multi-language coding tasks.
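The evolution step at the heart of Evol-Instruct can be sketched as wrapping each instruction in a complexity-increasing meta-prompt and feeding it back to an LLM. The templates and the `llm` callable below are illustrative stand-ins, not the paper's exact prompts:

```python
import random

# Illustrative evolution templates in the spirit of Evol-Instruct;
# the real method uses carefully worded prompts plus a filtering step.
EVOLVE_TEMPLATES = [
    "Add one more constraint or requirement to this task:\n{instruction}",
    "Require an analysis of time and space complexity:\n{instruction}",
    "Rewrite this task so it includes erroneous code to debug:\n{instruction}",
]

def evolve(instruction: str, llm, rounds: int = 3) -> str:
    """Apply `rounds` of complexity-increasing rewrites via the `llm` callable."""
    for _ in range(rounds):
        template = random.choice(EVOLVE_TEMPLATES)
        instruction = llm(template.format(instruction=instruction))
    return instruction
```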
## Training Loss
| Step | Loss |
|---|---|
| 1 | 0.7249 |
| 10 | 0.4181 |
| 20 | 0.4880 |
| 30 | 0.3730 |
| 40 | 0.3372 |
| 50 | 0.3847 |
| 60 | 0.3498 |
## Limitations
- Trained on 5,000 of the 80,000 available samples – training on more data would likely improve results
- 60 steps is a quick fine-tune, not a fully converged run
- Non-commercial license (inherited from dataset)
- Best used as a research/portfolio demonstration
## License
CC-BY-NC-4.0 – Non-commercial use only (inherited from the Evol-Instruct dataset, which was generated using OpenAI model outputs).
## Acknowledgements
- Unsloth for fast fine-tuning on consumer GPUs
- nickrosh for the Evol-Instruct-Code dataset
- WizardCoder paper for the Evol-Instruct methodology
- Kaggle for free GPU compute