# UwU-Qwen3.5-27B-v0.1
*Experimental, early-stage fine-tune. Expect rough edges.*
Qwen3.5-27B fine-tune for creative writing, roleplay, and reasoning. ~20M tokens (~8.5K samples) of curated data. Retains the base model's vision-language (VL) capabilities.
## What This Model Is For
Tuned toward prose-style creative output rather than assistant-style responses.
- Descriptive, immersive narrative across genres and tones
- Character voice and consistency in multi-turn roleplay
- Thinking mode shaped for creative reasoning (narrative structure, character motivation, scene pacing) instead of mechanical step-by-step logic
- Vision-language input supported (inherited from Qwen3.5 VL architecture)
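Since the VL pathway is inherited from the base architecture, multimodal prompts should use the standard Qwen VL message layout, where `content` is a list of typed parts. A minimal sketch (the exact schema is an assumption carried over from earlier Qwen VL models; the image URL is a placeholder):

```python
# Hypothetical multimodal message in the Qwen VL chat layout.
# The image URL is a placeholder; check the base model's docs for the exact schema.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "image": "https://example.com/bus_stop.jpg"},
            {
                "type": "text",
                "text": "Describe this scene as the opening of a noir short story.",
            },
        ],
    }
]
```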
## Training Data
~8.5K samples, mixed:
- Creative writing. Single-turn fiction across genres (literary, noir, horror, romance, sci-fi, etc.) and tones (lyrical, gritty, humorous, sensual, etc.). High-quality curated sources.
- Multi-turn roleplay. Extended conversations with character cards. Trains context tracking and character consistency across turns, with diverse archetypes and dynamics.
- Reasoning. Filtered reasoning data to keep the base model's analytical ability intact and to steer thinking mode toward structured creative thought.
## Model Details
| Feature | Description |
|---|---|
| Base Model | Qwen/Qwen3.5-27B |
| Architecture | Qwen3.5 VL (27B Dense + MTP) |
| Precision | bf16 |
| Context Length | 131,072 tokens |
## Thinking Mode
Supports Qwen3.5's native thinking mode:
- Thinking. Reasons in a `<think>` block before responding. Useful for complex scenes or multi-character interactions.
- No-think. Responds immediately with an empty `<think>` block. Better for fast, fluid exchanges.
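Either way, the raw response may carry a leading `<think>` block that you usually want to separate from the prose before showing it to a user. A minimal sketch of that split (the tag names follow Qwen's convention; nothing here is specific to this fine-tune):

```python
import re

def split_thinking(text: str) -> tuple[str, str]:
    """Split a raw model response into (thinking, answer).

    Returns an empty thinking string when the <think> block is
    absent or empty, as in no-think mode.
    """
    match = re.match(r"\s*<think>(.*?)</think>\s*", text, flags=re.DOTALL)
    if match is None:
        return "", text.strip()
    return match.group(1).strip(), text[match.end():].strip()

thinking, answer = split_thinking(
    "<think>Set the scene first.</think>Rain hammered the shelter."
)
# thinking -> "Set the scene first."
# answer   -> "Rain hammered the shelter."
```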
## Usage
### vLLM
```bash
vllm serve Ks01/UwU-Qwen3.5-27B-v0.1 \
  --trust-remote-code \
  --max-model-len 131072 \
  --gpu-memory-utilization 0.90
```
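`vllm serve` exposes an OpenAI-compatible API, so requests go to the `/v1/chat/completions` endpoint of the server started above. A sketch of the request payload (the schema follows the OpenAI chat-completions format; host, port, and sampling values are placeholders to adjust for your deployment):

```python
import json

# Request body for vLLM's OpenAI-compatible /v1/chat/completions endpoint.
# Sampling values here are placeholders, not recommended settings.
payload = {
    "model": "Ks01/UwU-Qwen3.5-27B-v0.1",
    "messages": [
        {"role": "user", "content": "Write a two-sentence noir opening."}
    ],
    "temperature": 0.7,
    "max_tokens": 512,
}
body = json.dumps(payload)
# POST this body to e.g. http://localhost:8000/v1/chat/completions
```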
### Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Ks01/UwU-Qwen3.5-27B-v0.1",
    torch_dtype="bfloat16",
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained("Ks01/UwU-Qwen3.5-27B-v0.1")

messages = [
    {"role": "user", "content": "Write a tense reunion scene between two old friends at a rainy bus stop."}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

# do_sample=True is required for temperature to take effect
outputs = model.generate(**inputs, max_new_tokens=2048, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Limitations
- Experimental release (v0.1). Quality may be inconsistent.
- Context consistency can degrade in long multi-turn conversations.
- Thinking mode occasionally falls into analytical patterns instead of creative reasoning.
## Credits
Fine-tuned from ArliAI/Qwen-3.5-27B-Derestricted, which is itself based on Qwen/Qwen3.5-27B.
## License
Apache 2.0, inherited from Qwen3.5.