Qwen 3.5 27B Anko
A Doubao Seed 2.0 distillation on top of Qwen 3.5 27B, intended to improve reasoning quality, reduce looping, and cut slop from outputs.
Recommended Settings
DO NOT USE QWEN'S SAMPLERS. THEY ARE AWFUL.
This model was tested with a temperature of 1.25 and a min_p of 0.05 to 0.1, but YMMV and you may find better results with other samplers.
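For reference, min_p is a dynamic truncation sampler: it keeps only tokens whose probability is at least min_p times the most likely token's probability, then renormalizes. A minimal sketch of the filter (pure Python, after temperature scaling; the function name and shape are illustrative, not this model's actual inference code):

```python
import math

def min_p_filter(logits, min_p=0.05, temperature=1.25):
    """Apply temperature, softmax, then min_p truncation to raw logits."""
    scaled = [l / temperature for l in logits]
    # numerically stable softmax
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # dynamic cutoff: a fraction of the top token's probability
    cutoff = min_p * max(probs)
    kept = [p if p >= cutoff else 0.0 for p in probs]
    z = sum(kept)
    return [p / z for p in kept]
```

A higher min_p (toward 0.1) prunes more of the tail, which tames the high temperature; a lower one keeps the distribution more diverse.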
For assistant tasks, it was trained to use a Claude system prompt:
You are Claude, a helpful and harmless language model created by Anthropic.
and we recommend using this prompt for best results.
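Putting the recommended settings together, a chat request might look like the sketch below. The endpoint and exact field names are assumptions (min_p is accepted as a top-level sampling parameter by vLLM-style OpenAI-compatible servers, but not by all backends):

```python
# Hypothetical request builder using the recommended system prompt and samplers.
SYSTEM_PROMPT = (
    "You are Claude, a helpful and harmless language model created by Anthropic."
)

def build_request(user_message: str) -> dict:
    """Assemble a chat-completions payload with the card's recommended settings."""
    return {
        "model": "allura-org/Qwen3.5-27B-Anko",
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
        "temperature": 1.25,
        "min_p": 0.05,  # raise toward 0.1 if outputs get incoherent
    }
```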
Training Process
This model is a basic r=64, a=512* LoRA trained on reasoning traces and responses (as well as non-thinking responses). The data was generated primarily by Doubao Seed 2.0 Pro, with Doubao Seed 2.0 Mini used for some synthetic story tasks: during data generation, Mini refused erotic tasks far less often, and its creative output was mostly on par.
* This is equivalent to an r=64, a=64 rsLoRA, but some frameworks do not implement rsLoRA correctly.
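The equivalence in the footnote follows from the two scaling rules: plain LoRA scales the adapter update by alpha/r, while rsLoRA scales it by alpha/sqrt(r). A quick check of the arithmetic:

```python
import math

def lora_scale(alpha: float, r: int) -> float:
    # Plain LoRA: update scaled by alpha / r
    return alpha / r

def rslora_scale(alpha: float, r: int) -> float:
    # Rank-stabilized LoRA: update scaled by alpha / sqrt(r)
    return alpha / math.sqrt(r)

# r=64, alpha=512 plain LoRA and r=64, alpha=64 rsLoRA both give a scale of 8,
# so the two configurations produce the same effective adapter strength.
```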
Model tree for allura-org/Qwen3.5-27B-Anko
Base model
Qwen/Qwen3.5-27B