Instructions to use meseretbolled/Tenacious-Qwen3-DPO-v01 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use meseretbolled/Tenacious-Qwen3-DPO-v01 with Transformers:
# Load model directly from transformers import AutoModel model = AutoModel.from_pretrained("meseretbolled/Tenacious-Qwen3-DPO-v01", dtype="auto") - Notebooks
- Google Colab
- Kaggle
- Local Apps
- Unsloth Studio new
How to use meseretbolled/Tenacious-Qwen3-DPO-v01 with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for meseretbolled/Tenacious-Qwen3-DPO-v01 to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for meseretbolled/Tenacious-Qwen3-DPO-v01 to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for meseretbolled/Tenacious-Qwen3-DPO-v01 to start chatting
Load model with FastModel
pip install unsloth from unsloth import FastModel model, tokenizer = FastModel.from_pretrained( model_name="meseretbolled/Tenacious-Qwen3-DPO-v01", max_seq_length=2048, )
Tenacious-Qwen3-DPO-v01
A 16-bit LoRA adapter fine-tuned on unsloth/Qwen3-1.7B via Direct Preference Optimization (DPO) for B2B sales outreach policy compliance.
Trained as part of Tenacious-Bench v0.1 โ a domain-specific benchmark for Tenacious-style outreach evaluation.
Evaluation Results (52 held-out tasks)
| Metric | Score |
|---|---|
| Base model (Qwen3-1.7B) | 0.751 |
| This adapter | 0.941 |
| Delta A | +0.1904 |
| 95% CI (10k bootstrap) | [0.1115, 0.2788] |
| p-value (one-tailed) | 0.0000 |
Training Details
| Setting | Value |
|---|---|
| Algorithm | DPO (Rafailov et al., NeurIPS 2023) |
| Base model | unsloth/Qwen3-1.7B |
| Quantization | None โ 16-bit LoRA (fp16) |
| LoRA rank | r=16, alpha=32 |
| Training pairs | 159 preference pairs |
| Steps | 60 (3 epochs, batch size 8) |
| Final loss | 0.1035 |
| Hardware | Google Colab T4 (free tier) |
| Training time | 11.6 minutes |
| Framework | Unsloth + TRL PatchDPOTrainer |
What it learns
The adapter trains the model to:
- Avoid banned phrases (urgency language, over-commitment)
- Ground every claim in the supplied hiring signal brief
- Never reference a prospect's layoffs as a buying signal
- Always include a calendar link
- Match Tenacious tone markers (professional, signal-specific, brief)
Dataset
Tenacious-Bench v0.1 โ 238 tasks, 159 DPO preference pairs used for training.
Made with Unsloth
This model was trained 2x faster with Unsloth.
Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐ Ask for provider support
