image

BlueStar v3

Qwen3.5 27B
01 Overview

Designed for RP and writing tasks.

Dunno if it's better than v2 but I like it. Main difference is just the addition of some RP reasoning data from GLM5 & K2.5.

Non thinking and thinking are both supported. If you want to use thinking, it is required to prefill the <think>\n as that is how it was trained.

02 SillyTavern Settings
Recommended Roleplay Format
ActionsIn plaintext
Dialogue"In quotes"
Thoughts*In asterisks*
Recommended Samplers
Temp0.8 - 1.0
MinP0.05 - 0.075
03 Quantizations
GGUF
iMatrix
04 Creation Process

Creation Process: SFT

SFT on approx 56 million tokens.

Same as v2 for the most part with one big difference. Chub dataset was replaced with another version that has reasoning that was trained on the last turn only. This explodes the dataset out to 56 million tokens, but means the multi-turn reasoning gets trained correctly.

Also added a subset of 200 Gryphe RP samples that were shown as having a high lexical difference from my current dataset.

Trained using Axolotl.

Axolotl Config
SFT (4×H200)
base_model: Qwen/Qwen3.5-27B
 
plugins:
  - axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
strict: false
 
datasets:
  - path: ./data/bluestar_v4_sft_2_masked_20260402_120553.jsonl
 
val_set_size: 0.03
output_dir: ./Qwen3.5-27B-v3-SFT-2
 
sequence_len: 10756
sample_packing: true
 
load_in_8bit: true
adapter: lora
lora_r: 128
lora_alpha: 128
peft_use_rslora: true
lora_target_modules:
  - q_proj
  - k_proj
  - v_proj
  - o_proj
  - down_proj
  - up_proj
  # Uncomment below to also target the linear attention projections.
  # These use separate in_proj_qkv / in_proj_z / out_proj (Qwen3.5-specific).
  - linear_attn.in_proj_qkv
  - linear_attn.in_proj_z
  - linear_attn.out_proj
 
wandb_project: Qwen3.5-27B-SFT
wandb_name: Qwen3.5-27B-v3-SFT-2
 
gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 2
optimizer: adamw_torch_8bit
lr_scheduler: cosine
learning_rate: 1.2e-5
weight_decay: 0.01
warmup_ratio: 0.05
 
bf16: auto
tf32: true
 
resume_from_checkpoint:
logging_steps: 1
flash_attention: true
 
evals_per_epoch: 4
saves_per_epoch: 4
special_tokens:
 
fsdp_config:
  fsdp_version: 2
  offload_params: false
  cpu_ram_efficient_loading: false
  auto_wrap_policy: TRANSFORMER_BASED_WRAP
  transformer_layer_cls_to_wrap: Qwen3_5DecoderLayer
  state_dict_type: FULL_STATE_DICT
  sharding_strategy: FULL_SHARD
  reshard_after_forward: true
  activation_checkpointing: true
Downloads last month
92
Safetensors
Model size
28B params
Tensor type
BF16
·
F32
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ApocalypseParty/Qwen3.5-27B-v3-SFT-2

Base model

Qwen/Qwen3.5-27B
Finetuned
(262)
this model

Datasets used to train ApocalypseParty/Qwen3.5-27B-v3-SFT-2