---
license: cc-by-4.0
base_model: Qwen/Qwen2.5-1.5B-Instruct
tags:
- sales-agent
- alignment
- tenacious-bench
- dpo
model_card_authors:
- Meseret Bolled
---
# Tenacious-Qwen-DPO-Stable 🚀
This is a **LoRA adapter** for Qwen2.5-1.5B-Instruct, fine-tuned to close the "Honesty Gap" in B2B sales agents. It is trained so that the agent calibrates its stated confidence and avoids hallucinating engineering bench capacity.
## Model Details
- **Developed by:** Meseret Bolled
- **Model type:** LoRA Adapter (PEFT)
- **Language(s):** English
- **License:** CC-BY-4.0
- **Finetuned from model:** Qwen/Qwen2.5-1.5B-Instruct
## Training Details
- **Training Data:** Tenacious-Bench v0.1 (119 preference-aligned tasks)
- **Training Algorithm:** Supervised Fine-Tuning (SFT) / Direct Preference Optimization (DPO)
- **Hyperparameters:**
  - Learning Rate: 2e-5
  - LoRA Rank (r): 16
  - LoRA Alpha: 32
  - Max Steps: 150
  - Optimizer: AdamW
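As a rough illustration of what the LoRA settings above imply, the sketch below derives the update scaling factor and the number of parameters an adapter adds to one weight matrix. It assumes standard LoRA scaling (`alpha / r`, not the rank-stabilized variant). The class name `LoraHyperparams`, the helper `added_params`, and the 1536 x 1536 layer shape are hypothetical choices for the example, not values taken from this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoraHyperparams:
    """Hyperparameters copied from the list above (hypothetical container)."""
    learning_rate: float = 2e-5
    rank: int = 16
    alpha: int = 32
    max_steps: int = 150

    @property
    def scaling(self) -> float:
        # Assumption: standard LoRA multiplies the low-rank update by alpha / r.
        return self.alpha / self.rank


def added_params(d_in: int, d_out: int, rank: int) -> int:
    # LoRA adds two matrices per adapted weight:
    # A with shape (rank, d_in) and B with shape (d_out, rank).
    return rank * (d_in + d_out)


hp = LoraHyperparams()
print("scaling factor:", hp.scaling)

# Illustrative layer shape only; the card does not list the adapted modules.
print("extra params for one 1536 x 1536 matrix:", added_params(1536, 1536, hp.rank))
```

With `r = 16` and `alpha = 32`, the low-rank update is scaled by 2 under this assumption, so raising the rank without also raising alpha would shrink the effective update.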
## Evaluation Results
The model was evaluated on the **Tenacious-Bench Held-Out (52 tasks)**.
| Metric | Base Model (Qwen2.5-1.5B-Instruct) | Tenacious-Stable (Trained) | Delta |
|---|---|---|---|
| **Weighted Score** | 0.24 | **0.82** | **+0.58** |
| **Pass Rate** | 23.1% | **82.7%** | **+59.6pp** |
| **BCH (Bench Capacity Honesty) Violations** | 53.8% | **5.8%** | **-48.0pp** |
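The Delta column can be reproduced from the two reported columns, as in the short sketch below. The helper names `delta` and `implied_count` are hypothetical. The task counts it derives are an inference from the percentages and the 52-task held-out size, not figures reported in this card.

```python
HELD_OUT_TASKS = 52

# (base, trained) values exactly as reported in the table above.
reported = {
    "weighted_score": (0.24, 0.82),
    "pass_rate_pct": (23.1, 82.7),
    "bch_violation_pct": (53.8, 5.8),
}


def delta(base: float, trained: float, ndigits: int) -> float:
    # Trained minus base, rounded to the precision used in the table.
    return round(trained - base, ndigits)


def implied_count(pct: float, total: int = HELD_OUT_TASKS) -> int:
    # Inferred, not reported: the whole number of tasks closest to pct% of total.
    return round(pct / 100 * total)


deltas = {
    "weighted_score": delta(*reported["weighted_score"], 2),
    "pass_rate_pct": delta(*reported["pass_rate_pct"], 1),
    "bch_violation_pct": delta(*reported["bch_violation_pct"], 1),
}
print(deltas)

# Sanity check: each percentage should round-trip through a whole task count.
for name in ("pass_rate_pct", "bch_violation_pct"):
    for pct in reported[name]:
        count = implied_count(pct)
        print(name, pct, "->", count, "of", HELD_OUT_TASKS, "tasks")
```

If each percentage round-trips through a whole number of tasks, the table is internally consistent with a 52-task evaluation set.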
## Intended Use
This model is intended for **B2B sales outreach automation** where strict adherence to supply-side capacity (Bench Capacity Honesty) and brand voice is required.
## Limitations
The model is optimized for the **Tenacious Intelligence Corp** sales workflow. It may require further fine-tuning for other B2B domains with different ICP (Ideal Customer Profile) definitions.
## How to Get Started
```python
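from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model and its tokenizer from the Hugging Face Hub.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-1.5B-Instruct")
# Attach the LoRA adapter weights on top of the frozen base model.
model = PeftModel.from_pretrained(base_model, "meseretbolled/Tenacious-Qwen-DPO-Stable")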