DeepSeek-R1-Distill-Qwen-7B — Smart Contract Vulnerability Detection
RL fine-tuned version of DeepSeek-R1-Distill-Qwen-7B for detecting vulnerabilities in Solidity smart contracts. Fine-tuned using GRPO (Group Relative Policy Optimization) with LoRA on the CGT (Consolidated Ground Truth) dataset.
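As a rough illustration of the "group relative" part of GRPO: several completions are sampled for the same prompt, each is scored, and every reward is normalised against the others in its group. The sketch below shows only that normalisation step. The reward values are hypothetical, and implementations differ on details such as sample versus population standard deviation, so this is not the training code used for this model.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalise each reward against the other completions for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps keeps the division defined when every completion gets the same reward.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for 8 completions sampled from one prompt.
rewards = [1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0]
advantages = group_relative_advantages(rewards)

for r, a in zip(rewards, advantages):
    print(f"reward={r:.1f}  advantage={a:+.3f}")
```

Completions that score above the group mean get a positive advantage and are reinforced; those below it are discouraged. When all completions in a group score the same, every advantage is zero and the group contributes no learning signal.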
Model Description
- Base model: deepseek-ai/DeepSeek-R1-Distill-Qwen-7B
- Fine-tuning method: GRPO + LoRA
- Task: Smart contract vulnerability detection and classification
- Developer: Nishant Pandav (@npanium)
- Repository: https://github.com/npanium/smartcontracts-vulnerability-r1
- License: MIT
What it does
Given a Solidity smart contract, the model:
- Reasons through the code using chain-of-thought inside <think> tags
- Determines whether the contract is vulnerable
- Classifies the vulnerability by DASP category (1–9) and SWC ID (100–136)
Output format:
<think>
... reasoning about the contract ...
</think>
VULNERABLE: yes/no
DASP_CATEGORY: N
SWC_ID: NNN
EXPLANATION: ...
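A minimal sketch of how a response in the format above could be parsed into structured fields. The function name `parse_response` and the sample text are hypothetical, not part of the repository; the parser treats a missing or malformed `VULNERABLE` line as a parse failure and tolerates a missing opening `<think>` tag.

```python
import re

FIELD_RE = re.compile(
    r"^(VULNERABLE|DASP_CATEGORY|SWC_ID|EXPLANATION):[ \t]*(.*)$",
    re.MULTILINE,
)


def _to_int(value):
    """Return the first integer found in value, or None."""
    match = re.search(r"\d+", value or "")
    return int(match.group()) if match else None


def parse_response(text):
    """Parse the model's answer; return None if the verdict is unreadable."""
    # Keep only what follows the reasoning block.
    answer = text.split("</think>")[-1]
    fields = {key: value.strip() for key, value in FIELD_RE.findall(answer)}
    verdict = fields.get("VULNERABLE", "").lower()
    if verdict not in ("yes", "no"):
        return None
    return {
        "vulnerable": verdict == "yes",
        "dasp_category": _to_int(fields.get("DASP_CATEGORY")),
        "swc_id": _to_int(fields.get("SWC_ID")),
        "explanation": fields.get("EXPLANATION", ""),
    }


# Hypothetical response, written by hand for illustration.
sample = """<think>
The external call happens before the balance update.
</think>
VULNERABLE: yes
DASP_CATEGORY: 1
SWC_ID: 107
EXPLANATION: State is updated after an external call, allowing reentrancy."""

result = parse_response(sample)
print(result)
```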
Results
Evaluated on 1,478 held-out contracts from the CGT dataset:
| Metric | Before (base) | After (fine-tuned) | Delta (percentage points) |
|---|---|---|---|
| Detection accuracy (Tier 1) | 23.0% | 74.3% | +51.3 |
| DASP category (Tier 2) | 8.8% | 11.3% | +2.5 |
| SWC ID (Tier 3) | 3.0% | 0.0% | -3.0 |
| Overall | 11.6% | 28.5% | +16.9 |
| Parse failure rate | 40.2% | 0.0% | -40.2 |
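The card does not state how the Overall row is computed. The reported figures are consistent with an unweighted mean of the three tier accuracies, which the short check below reproduces; treat that formula as an inference from the numbers, not a documented definition.

```python
# Tier accuracies (in percent) copied from the results table.
before = {"tier1": 23.0, "tier2": 8.8, "tier3": 3.0}
after = {"tier1": 74.3, "tier2": 11.3, "tier3": 0.0}


def unweighted_mean(scores):
    """Average the tier accuracies with equal weight (assumed formula)."""
    return sum(scores.values()) / len(scores)


overall_before = round(unweighted_mean(before), 1)
overall_after = round(unweighted_mean(after), 1)

print(overall_before, overall_after)
```

Under this reading, nearly all of the Overall gain comes from Tier 1, since Tier 2 improves only slightly and Tier 3 falls to zero.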
How to Use
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "npanium/deepseek-r1-qwen7b-smartcontract-grpo"

# Load the tokenizer and the model in bf16, placing weights automatically.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Example contract: the external call precedes the balance update.
contract = """
pragma solidity ^0.4.18;
contract Vulnerable {
    mapping(address => uint) public balances;
    function withdraw(uint _amount) public {
        if (balances[msg.sender] >= _amount) {
            msg.sender.call.value(_amount)();
            balances[msg.sender] -= _amount;
        }
    }
}
"""
prompt = f"""Analyze this Solidity smart contract for security vulnerabilities.
Think step by step inside <think> tags, then provide your assessment.
``solidity
{contract}``
Use this exact format:
VULNERABLE: yes/no
DASP_CATEGORY: [1-9]
SWC_ID: [100-136]
EXPLANATION: [one sentence]"""
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages,
    return_tensors="pt",
    add_generation_prompt=True,
)
# apply_chat_template may return a BatchEncoding instead of a bare tensor;
# unwrap it if so.
if hasattr(inputs, "input_ids"):
    inputs = inputs.input_ids
inputs = inputs.to(model.device)

with torch.no_grad():
    outputs = model.generate(
        inputs,
        max_new_tokens=1024,
        temperature=0.1,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt.
response = tokenizer.decode(
    outputs[0][inputs.shape[1]:],
    skip_special_tokens=True,
)
print(response)
Training Details
Dataset
CGT (Consolidated Ground Truth) — github.com/gsalzer/cgt
Consolidates 13 prior smart contract vulnerability datasets. Labels cross-validated across source datasets.
| Split | Examples |
|---|---|
| Train | 5,910 |
| Test (locked) | 1,478 |
Training Hyperparameters
| Parameter | Value |
|---|---|
| Training regime | bf16 mixed precision |
| Learning rate | 5e-6 |
| LoRA rank | 16 |
| LoRA alpha | 32 |
| LoRA target modules | q_proj, v_proj, k_proj, o_proj |
| Generations per prompt | 8 |
| Max completion length | 1024 |
| Gradient accumulation steps | 8 |
| Epochs | 1 |
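The card does not name the training library. As one possible mapping, the sketch below expresses the table as keyword arguments named after the parameters of `peft.LoraConfig` and `trl.GRPOConfig`; the choice of those libraries and the `task_type` value are assumptions, and the actual training script may differ.

```python
# Keys follow peft.LoraConfig parameter names (library choice is an assumption).
lora_kwargs = {
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "v_proj", "k_proj", "o_proj"],
    "task_type": "CAUSAL_LM",  # assumption: not stated in the card
}

# Keys follow trl.GRPOConfig parameter names (library choice is an assumption).
grpo_kwargs = {
    "learning_rate": 5e-6,
    "bf16": True,
    "num_generations": 8,
    "max_completion_length": 1024,
    "gradient_accumulation_steps": 8,
    "num_train_epochs": 1,
}

# Standard LoRA scales the adapter update by alpha / r.
lora_scaling = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(lora_scaling)
```

With those libraries installed, the dictionaries could be passed as `LoraConfig(**lora_kwargs)` and `GRPOConfig(output_dir=..., **grpo_kwargs)`. Only the attention projections are adapted, so the MLP layers of the base model stay frozen.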
Hardware
- GPU: NVIDIA A100 SXM4 80GB
- Cloud provider: Fluence Network
- Training duration: ~48 hours
Limitations
Outcome reward only. The reward function validates whether the final label is correct, not whether the reasoning is valid. The model may produce plausible-sounding analysis that doesn't actually justify the conclusion.
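To make this limitation concrete, here is a minimal sketch of an outcome-only reward. The function, the ground-truth record, and the weights are all hypothetical (the card says only that binary detection is weighted highest); the point is that two completions with different reasoning but identical final labels score the same.

```python
import re

# Hypothetical weights; the real values are not given in the card.
WEIGHTS = {"vulnerable": 1.0, "dasp": 0.5, "swc": 0.5}


def _field(name, text):
    """Return the first token after 'NAME:' on its own line, lowercased."""
    match = re.search(rf"^{name}:[ \t]*(\S+)", text, re.MULTILINE)
    return match.group(1).lower() if match else None


def outcome_reward(completion, truth):
    """Score only the final labels; the <think> block is never inspected."""
    answer = completion.split("</think>")[-1]
    reward = 0.0
    if _field("VULNERABLE", answer) == truth["vulnerable"]:
        reward += WEIGHTS["vulnerable"]
    if _field("DASP_CATEGORY", answer) == str(truth["dasp"]):
        reward += WEIGHTS["dasp"]
    if _field("SWC_ID", answer) == str(truth["swc"]):
        reward += WEIGHTS["swc"]
    return reward


labels = "\nVULNERABLE: yes\nDASP_CATEGORY: 1\nSWC_ID: 107\nEXPLANATION: Reentrancy."
sound = "<think>The external call precedes the balance update.</think>" + labels
unsound = "<think>The pragma version looks old.</think>" + labels

truth = {"vulnerable": "yes", "dasp": 1, "swc": 107}
print(outcome_reward(sound, truth), outcome_reward(unsound, truth))
```

Both completions receive the full reward, even though only the first one gives a reason that supports the verdict.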
SWC ID regression. Post-training SWC ID accuracy dropped from 3.0% to 0.0%. The model prioritised the higher-weighted binary detection reward at the expense of fine-grained weakness classification.
Context window. Contracts exceeding ~4,000 characters were excluded from training. Performance on very large contracts is untested.
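Before running inference, it may be worth flagging contracts that fall outside the length range seen in training. A minimal sketch, assuming a plain character count against the approximate 4,000-character cutoff; the exact filtering rule used during training (for example, whether comments were stripped first) is not stated, and the helper name and sample contracts are hypothetical.

```python
MAX_CHARS = 4000  # approximate cutoff quoted in the card


def within_training_range(source, max_chars=MAX_CHARS):
    """Return True if the contract is no longer than the training cutoff."""
    return len(source) <= max_chars


# Hypothetical inputs: one short contract and one padded past the cutoff.
contracts = {
    "small.sol": "contract A {}",
    "large.sol": "contract B {" + " " * 5000 + "}",
}

in_range = [name for name, src in contracts.items() if within_training_range(src)]
print(in_range)
```

Results for contracts outside this range should be treated as untested rather than assumed to match the reported accuracy.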
Citation
@misc{pandav2026scvulnrl,
author = {Pandav, Nishant},
title = {Smart Contract Vulnerability Detection via RL Fine-Tuning},
year = {2026},
url = {https://github.com/npanium/smartcontracts-vulnerability-r1}
}