
# Qwen2.5-1.5B-Instruct-CodeGen-Renamed

This is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct), trained on competitive programming problems whose solutions use obfuscated variable names.

## Model Details

- **Base Model:** Qwen2.5-1.5B-Instruct
- **Training Method:** LoRA (Low-Rank Adaptation)
- **LoRA Config:**
  - Rank: 16
  - Alpha: 32
  - Dropout: 0.05
  - Target: all modules
- **Training Dataset:** Code Generation with Renamed/Obfuscated Variables (53,784 samples)
- **Training Epochs:** 1
- **Learning Rate:** 2e-4
- **Optimizer:** AdamW
- **Precision:** bfloat16
- **Training Features:** Chain-of-thought reasoning enabled
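To give a sense of what the rank-16 LoRA setup above buys, here is a small illustrative calculation (not from the training code) of trainable parameters for a single hypothetical 1536×1536 linear layer, assuming 1536 is the model's hidden size:

```python
# Illustrative only: trainable parameters for one hypothetical 1536x1536
# linear layer under full fine-tuning vs. rank-16 LoRA, where the weight
# update is delta_W = (alpha / r) * B @ A with A (r x d) and B (d x r).
d, r, alpha = 1536, 16, 32

full_params = d * d       # full fine-tuning updates the entire weight matrix
lora_params = 2 * d * r   # LoRA trains only the low-rank factors A and B
scaling = alpha / r       # update is scaled by alpha / rank

print(full_params)  # 2359296
print(lora_params)  # 49152
print(scaling)      # 2.0
```

For this layer, LoRA trains roughly 2% of the parameters that full fine-tuning would.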

## Intended Use

This model has been fine-tuned for code generation and competitive programming tasks. Unlike the original (non-renamed) version, it was trained on code with obfuscated variable names (e.g., `int_1`, `var_1`), with the aim of improving generalization.
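To illustrate the kind of transformation applied to the training data (the function and the exact renaming scheme below are hypothetical examples, not drawn from the actual dataset), a readable solution and its obfuscated counterpart might look like:

```python
# Hypothetical before/after example of the variable-renaming transformation.

# Original, human-readable solution (Kadane's maximum-subarray algorithm):
def max_subarray(nums):
    best = cur = nums[0]
    for x in nums[1:]:
        cur = max(x, cur + x)
        best = max(best, cur)
    return best

# The same solution with obfuscated identifiers, in the style the model saw:
def func_1(list_1):
    int_1 = int_2 = list_1[0]
    for var_1 in list_1[1:]:
        int_2 = max(var_1, int_2 + var_1)
        int_1 = max(int_1, int_2)
    return int_1

print(max_subarray([-2, 1, -3, 4, -1, 2, 1, -5, 4]))  # 6
print(func_1([-2, 1, -3, 4, -1, 2, 1, -5, 4]))        # 6
```

Both functions are behaviorally identical; only the identifiers differ.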

## Usage

````python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "FearandDreams/Qwen2.5-1.5B-Instruct-CodeGen-Renamed"
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Example: solving a competitive programming problem.
# Replace {question} with the full problem statement; leave the Input field empty.
problem = """You are a helpful and harmless assistant. You should think step-by-step before responding to the instruction below.

Please use python programming language only.

You must use ```python for just the final solution code block with the following format:
```python
# Your code here
```

{question}

Input: ""

Output: .........."""

messages = [
    {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
    {"role": "user", "content": problem}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=2048, temperature=0.2, top_p=0.95)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
````

The model is expected to generate:

1. A `<think>` section with step-by-step reasoning
2. A final Python code block (which may use obfuscated variable names such as `int_1`, `var_1`)
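Because the response interleaves reasoning with a fenced code block, a small post-processing step can pull out just the final solution. This is a sketch, assuming the response format described above (a `<think>` section followed by a ```python fence):

````python
import re

def extract_solution(response: str) -> str:
    """Return the last ```python fenced block in a model response, or '' if none."""
    blocks = re.findall(r"```python\n(.*?)```", response, flags=re.DOTALL)
    return blocks[-1].strip() if blocks else ""

# Minimal demonstration on a synthetic response:
sample = "<think>Reason step by step...</think>\n```python\nprint('hello')\n```"
print(extract_solution(sample))  # print('hello')
````

Taking the *last* matching block guards against code fragments the model may emit inside its reasoning section.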

## Training Details

- **Framework:** LLaMA-Factory
- **Context Length:** 16,384 tokens
- **Batch Size:** 2 per device
- **Gradient Accumulation Steps:** 8
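A LLaMA-Factory configuration matching the hyperparameters above would look roughly like the sketch below. This is a reconstruction, not the actual training config: the dataset name and output path are placeholders, and key names follow LLaMA-Factory's YAML conventions. Note the effective batch size is 2 × 8 = 16 per device.

```yaml
### model
model_name_or_path: Qwen/Qwen2.5-1.5B-Instruct

### method
stage: sft
finetuning_type: lora
lora_rank: 16
lora_alpha: 32
lora_dropout: 0.05
lora_target: all

### dataset
dataset: codegen_renamed          # placeholder name
template: qwen
cutoff_len: 16384

### training
per_device_train_batch_size: 2
gradient_accumulation_steps: 8    # effective batch size: 2 x 8 = 16 per device
learning_rate: 2.0e-4
num_train_epochs: 1.0
bf16: true
output_dir: saves/qwen2.5-1.5b-codegen-renamed   # placeholder path
```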

## License

This model inherits the license from the base Qwen2.5-1.5B-Instruct model.

## Citation

```bibtex
@misc{qwen2.5-1.5b-codegen-renamed,
  author       = {FearandDreams},
  title        = {Qwen2.5-1.5B-Instruct Fine-tuned on Competitive Programming (Obfuscated Variables)},
  year         = {2025},
  publisher    = {HuggingFace},
  howpublished = {\url{https://huggingface.co/FearandDreams/Qwen2.5-1.5B-Instruct-CodeGen-Renamed}},
}
```