Qwen3-32B SFT+DPO on r/dadjokes corpus

For more details, see the original blogpost: https://nixiesearch.substack.com/p/fine-tuning-qwen3-at-home-to-respond

Running this model

The model is distributed as a QLoRA adapter that you can attach to the base Qwen3-32B with vLLM:

vllm serve Qwen/Qwen3-32B \
  --compilation-config '{"cudagraph_specialize_lora": false}' \
  --max-num-seqs 64 \
  --quantization bitsandbytes \
  --max-model-len 2048 \
  --gpu-memory-utilization 0.85 \
  --enable-lora --lora-modules dadjokes=<path to adapter dir>
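Once the server is up, the adapter is selected by the name registered via --lora-modules (dadjokes here) through vLLM's OpenAI-compatible API. A minimal sketch of the request body, assuming the default endpoint http://localhost:8000/v1/chat/completions:

```python
import json

# Chat-completion request for the vLLM server started above.
# The "model" field picks the LoRA adapter by its registered name,
# not the base Qwen3-32B checkpoint.
payload = {
    "model": "dadjokes",  # name given to --lora-modules
    "messages": [
        {"role": "user", "content": "Why did the scarecrow win an award?"}
    ],
    "max_tokens": 128,
    "temperature": 0.7,
}

body = json.dumps(payload)
print(body)
```

POST this JSON to the /v1/chat/completions route of the running server (port and host depend on how vLLM was launched).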

License

Apache 2.0
