Umezaki/dpo-qwen-expG-adapter
Tags: Text Generation, PEFT, Safetensors, English, dpo, lora, qwen, structeval, conversational
License: apache-2.0
DPO LoRA Adapter (ExpK) for Qwen3-4B (StructEval)
Pipeline: SFT (Exp G) + DPO
SFT adapter: merged from Exp G
DPO LoRA: r=8, alpha=8
DPO training: LR=5e-7, epochs=1, beta=0.1
max_length=2048
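
To make the hyperparameters above concrete, here is a minimal sketch in plain Python. It is not the author's training script. The `DpoRunConfig` class and `dpo_loss` function are hypothetical names introduced for illustration. The sketch assumes the conventional LoRA scaling of alpha / r (not a rank-stabilized variant) and the standard sigmoid form of the DPO loss, where beta scales the log-probability margin between the policy and a frozen reference model.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class DpoRunConfig:
    """Hypothetical container for the values listed on this card."""
    lora_r: int = 8
    lora_alpha: int = 8
    learning_rate: float = 5e-7
    num_epochs: int = 1
    beta: float = 0.1
    max_length: int = 2048

    @property
    def lora_scaling(self) -> float:
        # Assumption: conventional LoRA scaling, alpha / r.
        return self.lora_alpha / self.lora_r


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float) -> float:
    """Standard sigmoid DPO loss for one preference pair.

    Inputs are summed sequence log-probabilities under the policy
    and under the frozen reference model.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == softplus(-x), computed in a numerically stable way.
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


cfg = DpoRunConfig()
# Illustrative log-probabilities only, not measured values.
example_loss = dpo_loss(-10.0, -14.0, -11.0, -12.0, cfg.beta)
```

With r=8 and alpha=8 the scaling factor under this assumption is 1.0, so the low-rank update is applied without extra amplification. A small beta such as 0.1 makes the loss respond gently to the margin, which keeps the policy close to the reference model.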
Downloads last month: 3
Model tree for Umezaki/dpo-qwen-expG-adapter
Base model: Qwen/Qwen3-4B-Instruct-2507
Adapter (5273): this model
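
The model tree above suggests the adapter is meant to be attached to the listed base model. The following is a hedged loading sketch using the `transformers` and `peft` libraries. The helper name `load_adapter_model` is hypothetical. Because the card says the SFT adapter from Exp G was merged before DPO, it is an assumption that applying this adapter directly to the unmodified base checkpoint reproduces the trained model, so verify outputs before relying on it.

```python
BASE_MODEL_ID = "Qwen/Qwen3-4B-Instruct-2507"
ADAPTER_ID = "Umezaki/dpo-qwen-expG-adapter"


def load_adapter_model(base_id: str = BASE_MODEL_ID,
                       adapter_id: str = ADAPTER_ID):
    """Hypothetical helper: load the base model and attach the LoRA adapter.

    Heavy imports live inside the function so that importing this
    snippet does not require transformers or peft to be installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Assumption: the adapter weights apply on top of this base checkpoint.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return tokenizer, model
```

Calling `tokenizer, model = load_adapter_model()` downloads both repositories from the Hub, so it needs network access and enough memory for a 4B-parameter model.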