Luna-2 Style β€” Prompt Injection Detector (LoRA Adapter)

Luna-2 Style LoRA adapter for Qwen2.5-0.5B-Instruct, fine-tuned for binary prompt-injection detection (yes / no).

This repository contains only the adapter weights (β‰ˆ a few MB). You need PEFT to use it. If you want a standalone checkpoint with no dependencies, use the merged model at aditya02acharya/luna2-qwen2.5-0.5b-prompt-injection-merged.

Quickstart (PEFT)

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-0.5B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "aditya02acharya/luna2-qwen2.5-0.5b-prompt-injection-lora")
tokenizer = AutoTokenizer.from_pretrained("aditya02acharya/luna2-qwen2.5-0.5b-prompt-injection-lora")

messages = [
    {"role": "system", "content": "You are a prompt injection detector. Reply only with yes or no."},
    {"role": "user",   "content": "<text to classify>"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False,
                                     add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
out    = model.generate(**inputs, max_new_tokens=1, temperature=0, do_sample=False)
label  = tokenizer.decode(out[0, -1]).strip()  # "yes" or "no"

vLLM Deployment

vLLM supports LoRA adapters natively. Use the merged repo for simplest deployment, or load the adapter dynamically:

python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-0.5B-Instruct \
    --enable-lora \
    --lora-modules luna2=aditya02acharya/luna2-qwen2.5-0.5b-prompt-injection-lora \
    --max-lora-rank 16 \
    --max-model-len 4096 \
    --dtype float16

Training Details

Parameter Value
Base model Qwen/Qwen2.5-0.5B-Instruct
LoRA r / alpha 16 / 32
LoRA dropout 0.05
Target modules q/k/v/o_proj, gate/up/down_proj
Epochs 2
Effective batch 32 Γ— 2
Learning rate 0.0005
Max seq length 2048
Train samples 608,507
Resumed from checkpoint-9508
Train loss 0.2695
Trained on 2026-03-30

Evaluation

Test Set

Metric Value
Accuracy 0.9575
Precision 0.9776
Recall 0.9246
F1 0.9503
AUC-ROC 0.9934
Brier Score 0.0298
Optimal Threshold 0.45
Optimal F1 0.9509
Eval Samples 20,000

Validation Set

Metric Value
Accuracy 0.9576
Precision 0.9783
Recall 0.9235
F1 0.9501
AUC-ROC 0.9930
Brier Score 0.0301
Optimal Threshold 0.45
Optimal F1 0.9517
Eval Samples 50,000

License

Apache 2.0 β€” same as the base Qwen2.5 model.

Downloads last month
2
Inference Examples
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for aditya02acharya/luna2-qwen2.5-0.5b-prompt-injection-lora

Adapter
(503)
this model

Evaluation results