Instructions to use meseretbolled/Tenacious-Qwen3-DPO-v01 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
  - Transformers

How to use meseretbolled/Tenacious-Qwen3-DPO-v01 with Transformers:

    # Load model directly
    from transformers import AutoModel
    model = AutoModel.from_pretrained("meseretbolled/Tenacious-Qwen3-DPO-v01", dtype="auto")

- Notebooks
  - Google Colab
  - Kaggle
- Local Apps
  - Unsloth Studio
How to use meseretbolled/Tenacious-Qwen3-DPO-v01 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL):

    curl -fsSL https://unsloth.ai/install.sh | sh
    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888
    # Then open http://localhost:8888 in your browser
    # Search for meseretbolled/Tenacious-Qwen3-DPO-v01 to start chatting
Install Unsloth Studio (Windows):

    irm https://unsloth.ai/install.ps1 | iex
    # Run unsloth studio
    unsloth studio -H 0.0.0.0 -p 8888
    # Then open http://localhost:8888 in your browser
    # Search for meseretbolled/Tenacious-Qwen3-DPO-v01 to start chatting
Using Hugging Face Spaces for Unsloth:

    # No setup required
    # Open https://huggingface.co/spaces/unsloth/studio in your browser
    # Search for meseretbolled/Tenacious-Qwen3-DPO-v01 to start chatting
Load model with FastModel:

    pip install unsloth

    from unsloth import FastModel

    model, tokenizer = FastModel.from_pretrained(
        model_name="meseretbolled/Tenacious-Qwen3-DPO-v01",
        max_seq_length=2048,
    )
Update README.md
README.md
CHANGED
@@ -1,22 +1,68 @@
 ---
 base_model: unsloth/Qwen3-1.7B
+language:
+- en
+license: apache-2.0
 tags:
 - text-generation-inference
 - transformers
 - unsloth
 - qwen3
 - trl
+- dpo
+- b2b-sales
+- lora
 ---
 
-- **License:** apache-2.0
-- **Finetuned from model :** unsloth/Qwen3-1.7B
+# Tenacious-Qwen3-DPO-v01
+
+A 16-bit LoRA adapter fine-tuned on [unsloth/Qwen3-1.7B](https://huggingface.co/unsloth/Qwen3-1.7B)
+via Direct Preference Optimization (DPO) for **B2B sales outreach policy compliance**.
+
+Trained as part of [Tenacious-Bench v0.1](https://github.com/Meseretbolled/Sales-Agent-Evaluation-Bench),
+a domain-specific benchmark for Tenacious-style outreach evaluation.
+
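The added description calls this repository a LoRA adapter on top of unsloth/Qwen3-1.7B. A minimal sketch of attaching it to the base model with PEFT could look like the following. It assumes the repository stores PEFT-format adapter weights (the card does not show the file layout) and that `transformers` and `peft` are installed; the helper name `load_adapter` is hypothetical.

```python
BASE_ID = "unsloth/Qwen3-1.7B"
ADAPTER_ID = "meseretbolled/Tenacious-Qwen3-DPO-v01"


def load_adapter(base_id: str = BASE_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model, then attach the LoRA adapter on top of it.

    Assumption: the adapter repo is in PEFT format. Imports are local so this
    module can be imported without the heavy dependencies installed.
    """
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()  # inference mode: disables dropout
    return model, tokenizer
```

Calling `load_adapter()` downloads both repositories on first use, so it is best run once and the returned objects reused.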
+## Evaluation Results (52 held-out tasks)
+
+| Metric | Score |
+|--------|-------|
+| Base model (Qwen3-1.7B) | 0.751 |
+| This adapter | **0.941** |
+| Delta A | **+0.1904** |
+| 95% CI (10k bootstrap) | [0.1115, 0.2788] |
+| p-value (one-tailed) | < 0.0001 |
+
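The table reports a bootstrap confidence interval and a one-tailed p-value for the improvement over the base model. The card does not say how these were computed; one common approach is a paired percentile bootstrap over per-task score differences, sketched below. The function name, the percentile method, and the toy scores are all assumptions for illustration, not the benchmark's actual procedure or data.

```python
import random


def paired_bootstrap(base_scores, tuned_scores, n_resamples=10_000, seed=0):
    """Percentile bootstrap of the mean per-task improvement (tuned - base)."""
    if len(base_scores) != len(tuned_scores):
        raise ValueError("scores must be paired per task")
    rng = random.Random(seed)
    deltas = [t - b for b, t in zip(base_scores, tuned_scores)]
    n = len(deltas)
    observed = sum(deltas) / n

    # Resample tasks with replacement and record the mean delta each time.
    means = []
    for _ in range(n_resamples):
        sample = [deltas[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()

    lower = means[int(0.025 * n_resamples)]
    upper = means[int(0.975 * n_resamples) - 1]
    # One-tailed p-value: share of resampled means showing no improvement.
    p_value = sum(m <= 0 for m in means) / n_resamples
    return observed, (lower, upper), p_value


if __name__ == "__main__":
    # Hypothetical per-task scores, NOT the benchmark's real data.
    base = [0.60, 0.70, 0.80, 0.75, 0.65]
    tuned = [0.90, 0.95, 0.90, 1.00, 0.85]
    delta, (low, high), p = paired_bootstrap(base, tuned)
    print(f"delta={delta:.4f}  95% CI=[{low:.4f}, {high:.4f}]  p={p:.4f}")
```

With 10,000 resamples the smallest nonzero p-value this estimator can produce is 0.0001, so a result where no resample shows a non-positive delta is best read as p < 0.0001 rather than exactly zero.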
+## Training Details
+
+| Setting | Value |
+|---------|-------|
+| Algorithm | DPO (Rafailov et al., NeurIPS 2023) |
+| Base model | unsloth/Qwen3-1.7B |
+| Quantization | None (16-bit LoRA, fp16) |
+| LoRA rank | r=16, alpha=32 |
+| Training pairs | 159 preference pairs |
+| Steps | 60 (3 epochs, batch size 8) |
+| Final loss | 0.1035 |
+| Hardware | Google Colab T4 (free tier) |
+| Training time | 11.6 minutes |
+| Framework | Unsloth (`PatchDPOTrainer`) + TRL `DPOTrainer` |
+
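The step count and LoRA settings in the table can be cross-checked with a little arithmetic. The sketch below assumes "batch size 8" is the effective batch size (per-device batch times gradient accumulation), that the final partial batch of each epoch is kept, and that the standard LoRA scaling of alpha / r applies; the function names are hypothetical.

```python
import math


def optimizer_steps(num_examples: int, batch_size: int, epochs: int) -> int:
    """Total steps when the final partial batch of each epoch is kept."""
    return math.ceil(num_examples / batch_size) * epochs


def lora_scale(alpha: int, rank: int) -> float:
    """Standard LoRA scaling factor applied to the low-rank update."""
    return alpha / rank


print(optimizer_steps(159, 8, 3))  # 20 steps per epoch * 3 epochs = 60
print(lora_scale(32, 16))          # 2.0
```

Under those assumptions, 159 pairs at batch size 8 give 20 steps per epoch, which matches the 60 steps reported for 3 epochs.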
+## What it learns
+
+The adapter trains the model to:
+- Avoid banned phrases (urgency language, over-commitment)
+- Ground every claim in the supplied hiring signal brief
+- Never reference a prospect's layoffs as a buying signal
+- Always include a calendar link
+- Match Tenacious tone markers (professional, signal-specific, brief)
+
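Some of the rules listed above lend themselves to simple automated checks. The following is a rough sketch of such a check, not the benchmark's actual scorer: the banned-phrase list, the calendar-link pattern, and the function name `check_outreach` are all hypothetical, and the layoff rule is approximated with a crude keyword match.

```python
import re

# Hypothetical examples of urgency / over-commitment language.
BANNED_PHRASES = ("act now", "limited time", "guaranteed results")

# Hypothetical pattern: treats Calendly or Cal.com URLs as calendar links.
CALENDAR_LINK = re.compile(
    r"https?://(?:www\.)?(?:calendly\.com|cal\.com)/\S+", re.IGNORECASE
)


def check_outreach(email_text: str) -> list[str]:
    """Return a list of policy violations found in a draft email."""
    lowered = email_text.lower()
    violations = [f"banned phrase: {p}" for p in BANNED_PHRASES if p in lowered]
    if re.search(r"\blayoffs?\b", lowered):
        violations.append("mentions layoffs")
    if not CALENDAR_LINK.search(email_text):
        violations.append("missing calendar link")
    return violations


if __name__ == "__main__":
    draft = "Saw your three open data roles. Act now to lock in a slot!"
    for problem in check_outreach(draft):
        print(problem)
```

Grounding claims in the hiring signal brief and matching tone are harder to capture with string rules, which is presumably why the card frames them as behaviors learned from preference pairs.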
+## Dataset
+
+[Tenacious-Bench v0.1](https://github.com/Meseretbolled/Sales-Agent-Evaluation-Bench):
+238 tasks, 159 DPO preference pairs used for training.
 
+## Made with Unsloth
 
+This model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth).
 
+[](https://github.com/unslothai/unsloth)