Qwen3-32B SFT+DPO on r/dadjokes corpus
For more details, see the original blogpost: https://nixiesearch.substack.com/p/fine-tuning-qwen3-at-home-to-respond
Running this model
The model is distributed as a QLoRA adapter that you can attach to the base Qwen3-32B with vLLM:
```shell
vllm serve Qwen/Qwen3-32B \
  --compilation-config '{"cudagraph_specialize_lora": false}' \
  --max-num-seqs 64 \
  --quantization bitsandbytes \
  --max-model-len 2048 \
  --gpu-memory-utilization 0.85 \
  --enable-lora \
  --lora-modules dadjokes=<path to adapter dir>
```
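Once the server is running, the adapter can be queried by name through vLLM's OpenAI-compatible chat completions API. A minimal sketch, assuming the default port 8000 and the `dadjokes` adapter name registered in the command above:

```python
import json
import urllib.request

# vLLM's OpenAI-compatible endpoint; adjust host/port if you changed the defaults
VLLM_URL = "http://localhost:8000/v1/chat/completions"

def build_payload(prompt: str) -> dict:
    """Build a chat-completions request targeting the LoRA adapter by name."""
    return {
        "model": "dadjokes",  # the name registered via --lora-modules
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """POST the request to the local vLLM server and return the reply text."""
    req = urllib.request.Request(
        VLLM_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(ask("Tell me a joke about scarecrows"))
```

Requests that set `"model"` to the base name `Qwen/Qwen3-32B` would bypass the adapter, so use the adapter name to get the fine-tuned behavior.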
License
Apache 2.0