	---
	license: cc-by-4.0
	base_model: Qwen/Qwen2.5-1.5B-Instruct
	tags:
	- sales-agent
	- alignment
	- tenacious-bench
	- dpo
	model_card_authors:
	- Meseret Bolled
	---

	# Tenacious-Qwen-DPO-Stable 🚀

	This is a LoRA adapter for Qwen/Qwen2.5-1.5B-Instruct, fine-tuned to close the "Honesty Gap" in B2B sales agents. It is trained so that a sales agent calibrates its stated confidence and avoids hallucinating engineering bench capacity.

	## Model Details
	- Developed by: Meseret Bolled
	- Model type: LoRA Adapter (PEFT)
	- Language(s): English
	- License: CC-BY-4.0
	- Finetuned from model: Qwen/Qwen2.5-1.5B-Instruct

	## Training Details
	- Training Data: Tenacious-Bench v0.1 (119 preference-aligned tasks)
	- Training Algorithm: Supervised Fine-Tuning (SFT) / Direct Preference Optimization (DPO)
	- Hyperparameters:
	  - Learning Rate: 2e-5
	  - LoRA Rank (r): 16
	  - LoRA Alpha: 32
	  - Max Steps: 150
	  - Optimizer: AdamW
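
Two quantities follow from the hyperparameters above: the LoRA update scaling, which in the standard formulation is `alpha / r`, and the approximate number of passes over the 119 training tasks implied by 150 steps. The sketch below is illustrative only. The card does not state a batch size, so `effective_batch_size` is a hypothetical parameter, and the calculation assumes each task contributes one training example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingHyperparameters:
    """Values copied from the Training Details list."""

    learning_rate: float = 2e-5
    lora_rank: int = 16
    lora_alpha: int = 32
    max_steps: int = 150
    dataset_size: int = 119  # Tenacious-Bench v0.1 tasks

    @property
    def lora_scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / r.
        return self.lora_alpha / self.lora_rank

    def approx_epochs(self, effective_batch_size: int) -> float:
        # ASSUMPTION: batch size is not given in the card; supply your own.
        # ASSUMPTION: one training example per task.
        return self.max_steps * effective_batch_size / self.dataset_size


hp = TrainingHyperparameters()
print(hp.lora_scaling)  # 2.0

for batch_size in (1, 4, 8):  # hypothetical batch sizes
    print(batch_size, round(hp.approx_epochs(batch_size), 2))
```

With rank 16 and alpha 32 the scaling factor is 2.0. The epoch count depends entirely on the batch size actually used, so the loop output is a range of possibilities, not a reported figure.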

	## Evaluation Results
	The model was evaluated on the Tenacious-Bench held-out set (52 tasks).

	| Metric | Base Model (Qwen 1.5B) | Tenacious-Stable (Trained) | Delta |
	|---|---|---|---|
	| Weighted Score | 0.24 | 0.82 | +0.58 |
	| Pass Rate | 23.1% | 82.7% | +59.6pp |
	| BCH Violations | 53.8% | 5.8% | -48.0pp |
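
As a quick sanity check on the table, the reported percentages can be converted back to whole-task counts over the 52 held-out tasks. The counts printed below are inferred from the percentages and are not reported in the card. The names `reported` and `implied_count` are illustrative.

```python
HELD_OUT_TASKS = 52

# (base, trained) percentages as reported in the evaluation table.
reported = {
    "pass_rate_pct": (23.1, 82.7),
    "bch_violations_pct": (53.8, 5.8),
}


def implied_count(percent: float, total: int = HELD_OUT_TASKS) -> int:
    """Nearest whole number of tasks consistent with a reported percentage."""
    return round(percent / 100 * total)


for metric, (base, trained) in reported.items():
    base_n, trained_n = implied_count(base), implied_count(trained)
    delta_pp = round(trained - base, 1)  # percentage-point change
    print(
        f"{metric}: {base_n}/{HELD_OUT_TASKS} -> "
        f"{trained_n}/{HELD_OUT_TASKS} ({delta_pp:+.1f} pp)"
    )
```

Under this reading, the pass rate corresponds to 12 of 52 tasks before training and 43 of 52 after, and BCH violations drop from 28 tasks to 3.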

	## Intended Use
	This model is intended for B2B sales outreach automation where strict adherence to supply-side capacity (Bench Capacity Honesty) and brand voice is required.

	## Limitations
	The model is optimized for the Tenacious Intelligence Corp sales workflow. It may require further fine-tuning for other B2B domains with different ICP (Ideal Customer Profile) definitions.

	## How to Get Started
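	```python
	from peft import PeftModel
	from transformers import AutoModelForCausalLM, AutoTokenizer

	# Load the base model and its tokenizer, then attach the LoRA adapter.
	base_id = "Qwen/Qwen2.5-1.5B-Instruct"
	tokenizer = AutoTokenizer.from_pretrained(base_id)
	base_model = AutoModelForCausalLM.from_pretrained(base_id)
	model = PeftModel.from_pretrained(base_model, "meseretbolled/Tenacious-Qwen-DPO-Stable")
	```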