Sentence Selection ORPO v3
LoRA adapter for the debate card-cutting (sentence selection) task.
Model Description
This is an ORPO-trained LoRA adapter that selects the sentences in a piece of evidence that support a debate claim. It was trained with ORPO (Odds Ratio Preference Optimization) on 507 preference pairs in DPO format (one chosen and one rejected response per prompt).
Base Model: Qwen/Qwen3-30B-A3B (with SFT fine-tuning)
Training Details
- Method: ORPO (Odds Ratio Preference Optimization)
- Training Data: 507 DPO pairs (456 train, 51 validation)
- Learning Rate: 5e-7
- LoRA Rank: 32
- LoRA Alpha: 64
- Target Modules: q_proj, k_proj, v_proj, o_proj
- Epochs: ~0.35 (checkpoint at step 40)
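For readers unfamiliar with ORPO, the following is a minimal sketch of the per-example objective as commonly described: the usual negative log-likelihood on the chosen response plus a weighted odds-ratio penalty. It is not the training code used for this adapter. The function names and the weight `lam` are illustrative assumptions; this card does not state the weight that was used.

```python
import math


def log_odds(avg_logp: float) -> float:
    """Return log(p / (1 - p)) for p = exp(avg_logp).

    avg_logp is a length-normalized sequence log-probability and must be
    strictly negative so that p lies in (0, 1).
    """
    return avg_logp - math.log1p(-math.exp(avg_logp))


def orpo_loss(nll_chosen: float,
              avg_logp_chosen: float,
              avg_logp_rejected: float,
              lam: float = 0.1) -> float:
    """Sketch of the ORPO loss for a single preference pair.

    lam is a hypothetical weight on the odds-ratio term; the value used
    for this adapter is not given in the card.
    """
    log_odds_ratio = log_odds(avg_logp_chosen) - log_odds(avg_logp_rejected)
    # -log(sigmoid(z)) written as log(1 + exp(-z))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll_chosen + lam * odds_ratio_term


# The penalty shrinks when the chosen response is more likely than the rejected one.
print(orpo_loss(1.0, avg_logp_chosen=-0.5, avg_logp_rejected=-2.0))
print(orpo_loss(1.0, avg_logp_chosen=-2.0, avg_logp_rejected=-0.5))
```

When the two responses are equally likely, the odds-ratio term equals log 2, so the penalty only falls below that once the model prefers the chosen response.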
Training Metrics
| Checkpoint | Epoch | Eval Accuracy | Eval Loss | Eval Margin |
|---|---|---|---|---|
| Step 10 | 0.09 | 64.7% | 2.481 | +0.023 |
| Step 20 | 0.18 | 64.7% | 2.481 | +0.023 |
| Step 30 | 0.26 | 64.7% | 2.482 | +0.023 |
| Step 40 | 0.35 | 64.7% | 2.482 | +0.023 |
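The Epoch and Eval Accuracy columns can be cross-checked against the dataset sizes given under Training Details. The sketch below assumes an effective batch size of 4 pairs per optimizer step. That value is not stated in this card; it is simply the one that reproduces the Epoch column from 456 training pairs.

```python
TRAIN_PAIRS = 456  # from Training Details
VAL_PAIRS = 51     # from Training Details
ASSUMED_EFFECTIVE_BATCH = 4  # assumption: inferred from the table, not stated in the card


def epoch_at(step: int) -> float:
    """Fraction of the training set seen after `step` optimizer steps."""
    return step * ASSUMED_EFFECTIVE_BATCH / TRAIN_PAIRS


for step in (10, 20, 30, 40):
    print(f"step {step}: epoch {epoch_at(step):.2f}")

# 64.7% of 51 validation pairs corresponds to a whole number of pairs.
correct_pairs = round(0.647 * VAL_PAIRS)
print(f"{correct_pairs}/{VAL_PAIRS} = {100 * correct_pairs / VAL_PAIRS:.1f}%")
```

Under that assumption, step 40 covers 160 of the 456 training pairs, and 64.7% accuracy corresponds to 33 of the 51 validation pairs.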
Usage
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
# Load base model
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-30B-A3B")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")
# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "debaterhub/sentence-selection-orpo-v3")
Task Format
Input format:
Select sentences supporting:
Claim: [claim text]
TEXT ([citation]):
[1] First sentence.
[2] Second sentence.
...
Expected output:
Selected IDs: [1, 3, 5]
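A minimal sketch of how the format above might be produced and parsed in Python. The helper names `build_prompt` and `parse_selected_ids` are hypothetical and not part of this repository, and the exact line breaks in the prompt are an assumption based on the template shown here.

```python
import re


def build_prompt(claim: str, citation: str, sentences: list[str]) -> str:
    """Assemble a prompt in the input format, numbering sentences from 1."""
    lines = [
        "Select sentences supporting:",
        f"Claim: {claim}",
        f"TEXT ({citation}):",
    ]
    lines += [f"[{i}] {s}" for i, s in enumerate(sentences, start=1)]
    return "\n".join(lines)


def parse_selected_ids(output: str, num_sentences: int) -> list[int]:
    """Extract IDs from a 'Selected IDs: [...]' line.

    Returns an empty list if the line is missing. Out-of-range and
    duplicate IDs are dropped; first-seen order is preserved.
    """
    match = re.search(r"Selected IDs:\s*\[([^\]]*)\]", output)
    if match is None:
        return []
    ids: list[int] = []
    for token in re.findall(r"\d+", match.group(1)):
        value = int(token)
        if 1 <= value <= num_sentences and value not in ids:
            ids.append(value)
    return ids


sentences = ["First sentence.", "Second sentence.", "Third sentence."]
print(build_prompt("Example claim", "Example citation", sentences))
print(parse_selected_ids("Selected IDs: [1, 3, 5]", num_sentences=len(sentences)))
```

Filtering out-of-range IDs is a defensive choice for downstream use; whether the model ever emits such IDs is not documented in this card.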
License
Apache 2.0
Framework Versions
- PEFT 0.15.2
- Transformers 4.57.3
- PyTorch 2.9.0