🚨⚠️ I HAVE REACHED HUGGING FACE'S FREE STORAGE LIMIT ⚠️🚨
I can no longer upload new models unless I can cover the cost of additional storage.
I host 70+ free models as an independent contributor and this work is unpaid.
Without your support, no more new models can be uploaded.
🎉 Patreon (Monthly) | ☕ Ko-fi (One-time)
Every contribution goes directly toward Hugging Face storage fees to keep models free for everyone.
95% fewer refusals (5/100 Uncensored vs 99/100 Original) while preserving model quality (0.0671 KL divergence).
❤️ Support My Work
Creating these models takes significant time, work and compute. If you find them useful consider supporting me:
| Platform | Link | What you get |
|---|---|---|
| 🎉 Patreon | Monthly support | Priority model requests |
| ☕ Ko-fi | One-time tip | My eternal gratitude |
Your support motivates me and goes toward improving my workflow and covering fees for storage and compute; it may even make it possible to uncensor bigger models with rented cloud GPUs.
This is a decensored version of zerofata/Q3.5-BlueStar-v2-27B, made using Heretic v1.2.0 with the Arbitrary-Rank Ablation (ARA) method.
Abliteration parameters
| Parameter | Value |
|---|---|
| start_layer_index | 9 |
| end_layer_index | 33 |
| preserve_good_behavior_weight | 0.5425 |
| steer_bad_behavior_weight | 0.0002 |
| overcorrect_relative_weight | 1.1475 |
| neighbor_count | 15 |
Targeted components
- attn.out_proj
- attn.o_proj
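At its core, this style of abliteration projects learned "refusal directions" out of the targeted projection weights. The sketch below is a simplified rank-r directional ablation in NumPy with toy values; it is not Heretic's actual ARA implementation, and the function name and shapes are illustrative assumptions:

```python
import numpy as np

def ablate_directions(W, directions, weight=1.0):
    """Remove the component of each refusal direction from weight matrix W.

    W:          (d_out, d_in) projection weight (e.g. attn.o_proj).
    directions: (r, d_out) refusal directions in the output space.
    weight:     ablation strength; 1.0 removes the component fully.
    """
    for v in directions:
        v = v / np.linalg.norm(v)            # unit-normalize the direction
        W = W - weight * np.outer(v, v) @ W  # project the direction out of W
    return W

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
v = rng.standard_normal((1, 8))
W_abl = ablate_directions(W, v)

# After full ablation, W_abl has no component left along v.
residual = (v / np.linalg.norm(v)) @ W_abl
print(np.max(np.abs(residual)))  # ~0 up to float error
```

The `preserve_good_behavior_weight` and `steer_bad_behavior_weight` parameters above suggest Heretic optimizes a trade-off rather than applying a full-strength projection, which the `weight` argument loosely stands in for.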
Performance
| Metric | This model | Original model (Q3.5-BlueStar-v2-27B) |
|---|---|---|
| KL divergence | 0.0671 | 0 (by definition) |
| Refusals | ✅ 5/100 | ❌ 99/100 |
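The KL divergence above compares the abliterated model's next-token distributions against the original's over a shared prompt set (lower means closer). A minimal, numerically stable per-position KL from two logit vectors can be sketched as follows (this is the standard definition, assumed rather than taken from Heretic's source; the logit values are toy):

```python
import numpy as np

def kl_from_logits(logits_p, logits_q):
    """KL(P || Q) for two next-token logit vectors, computed in log space."""
    # log-softmax via the logsumexp trick for numerical stability
    logp = logits_p - logits_p.max() - np.log(np.sum(np.exp(logits_p - logits_p.max())))
    logq = logits_q - logits_q.max() - np.log(np.sum(np.exp(logits_q - logits_q.max())))
    p = np.exp(logp)
    return float(np.sum(p * (logp - logq)))

orig = np.array([2.0, 1.0, 0.5])
abl  = np.array([2.0, 1.0, 0.5])
print(kl_from_logits(orig, abl))  # identical distributions -> 0.0
```

A reported value of 0.0671 averaged over many positions indicates the abliterated model's distributions stay very close to the original's.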
PIQA test results with batch size 128:
Original:
| Tasks | Version | Filter | n-shot | Metric |  | Value |  | Stderr |
|---|---|---|---|---|---|---|---|---|
| piqa | 1 | none | 0 | acc | ↑ | 0.8232 | ± | 0.0089 |
|  |  | none | 0 | acc_norm | ↑ | 0.8237 | ± | 0.0089 |
Heretic v2:
| Tasks | Version | Filter | n-shot | Metric |  | Value |  | Stderr |
|---|---|---|---|---|---|---|---|---|
| piqa | 1 | none | 0 | acc | ↑ | 0.8161 | ± | 0.0090 |
|  |  | none | 0 | acc_norm | ↑ | 0.8237 | ± | 0.0089 |
Lower refusal counts indicate fewer content restrictions, while lower KL divergence indicates closer agreement with the original model's output distribution. Higher refusal counts mean more rejections, objections, pushback, lecturing, censorship, softening, and deflection.

PIQA (Physical Interaction: Question Answering) benchmark scores measure physical commonsense reasoning. The closer the Heretic model's acc and acc_norm scores are to the original's, the better its capabilities were preserved; a drop in acc or acc_norm relative to the original model therefore means a loss of capability in the abliterated model. acc is raw accuracy (which answer choice receives the higher probability), while acc_norm is length-normalized accuracy, which corrects for answer-length bias. For this purpose acc_norm matters more: longer answers naturally receive lower probabilities (more tokens means more chances to lose probability mass), so without normalization models unfairly favor shorter answers. acc_norm normalizes the log-likelihood by answer length to correct for this.
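The difference between acc and acc_norm can be shown with one toy multiple-choice item, where a long correct answer loses on raw log-likelihood but wins after length normalization (the log-likelihoods and byte counts below are invented for illustration):

```python
# Hypothetical per-choice log-likelihoods for one multiple-choice item.
# Choice A is correct but long; choice B is a short distractor.
choices = {
    "A": {"loglik": -12.0, "num_bytes": 40},  # correct, long answer
    "B": {"loglik": -9.0,  "num_bytes": 12},  # wrong, short answer
}

# acc: pick the choice with the highest raw log-likelihood.
acc_pick = max(choices, key=lambda c: choices[c]["loglik"])

# acc_norm: normalize log-likelihood by answer length before comparing.
norm_pick = max(choices, key=lambda c: choices[c]["loglik"] / choices[c]["num_bytes"])

print(acc_pick, norm_pick)  # B A
```

Raw accuracy picks the short wrong answer (-9.0 beats -12.0), while the normalized score (-0.3 per byte vs -0.75 per byte) recovers the correct one.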
GGUF Version
GGUF quantizations are available at llmfan46/Q3.5-BlueStar-v2-27B-ultra-uncensored-heretic-v2-GGUF.
BlueStar v2
Qwen3.5 27B, designed for RP and writing tasks.
Feels like a good improvement on v1. This version aims to fix the repetition and improve the intelligence while keeping the creativity.
Both thinking and non-thinking modes are supported. To use thinking, you must prefill `<think>\n`, as that is how the model was trained.
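Prefilling means appending `<think>\n` after the assistant turn opener so generation continues inside the think block. A minimal sketch is below; the ChatML-style turn markers are an assumption, so check the model's actual chat template in `tokenizer_config.json`:

```python
def build_prompt(user_message: str, thinking: bool) -> str:
    """Build a single-turn prompt, optionally prefilled with <think>\\n.

    The <|im_start|>/<|im_end|> markers are assumed ChatML-style tokens;
    the real template may differ.
    """
    prompt = (
        "<|im_start|>user\n" + user_message + "<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    if thinking:
        # Prefill so the model continues inside the thinking block.
        prompt += "<think>\n"
    return prompt

p = build_prompt("Write a short scene.", thinking=True)
print(p.endswith("<think>\n"))  # True
```

Most serving frontends expose an equivalent option as a "start reply with" or assistant-prefill field.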
Creation Process: SFT
SFT on approx 27 million tokens.
I've confirmed the repetition comes from the RP datasets, despite the extensive filtering, human editing, rewriting, and deduping. Compared to other types of data like chat and writing, RP is just somewhat repetitive in nature. One idea to fix this is to drop the RP datasets, or use less of them. This does seem to *sort of* work, but the model performs noticeably worse at RP as a result, which makes sense, given that's the entire point of having RP data to begin with.
The current solution I'm testing is using custom loss masking with the RP datasets. Most common phrases of slop are masked out, so the model doesn't get rewarded for learning these patterns. Overused words within a conversation also get masked out in later turns.
It... seems to have worked? Repetition from my testing is greatly reduced after a few hours of using the model. It can still latch onto phrases, but I've seen much less verbatim repetition.
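The masking idea can be sketched as setting labels to the ignore index wherever a known slop n-gram occurs, so cross-entropy loss skips those tokens. This is a toy word-level illustration of the general technique, not the author's pipeline; the phrase list and label values are invented:

```python
IGNORE_INDEX = -100  # label value that cross-entropy loss ignores
SLOP = {("shiver", "ran", "down")}  # illustrative slop n-grams (assumed)

def mask_slop(tokens, labels):
    """Return a copy of labels with IGNORE_INDEX wherever a slop n-gram occurs."""
    labels = list(labels)
    for n_gram in SLOP:
        n = len(n_gram)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) == n_gram:
                for j in range(i, i + n):
                    labels[j] = IGNORE_INDEX
    return labels

toks = ["a", "shiver", "ran", "down", "her", "spine"]
labs = list(range(len(toks)))  # stand-in label ids
print(mask_slop(toks, labs))  # [0, -100, -100, -100, 4, 5]
```

A real implementation would match at the token-id level after tokenization, and the described per-conversation masking of overused words would additionally track word frequencies across turns.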
Trained using Axolotl.
Axolotl Config
base_model: Qwen/Qwen3.5-27B
plugins:
- axolotl.integrations.cut_cross_entropy.CutCrossEntropyPlugin
strict: false
datasets:
- path: ./data/bluestar_v2_sft_3_all_rp_attempt_masked_20260318_075236.jsonl
val_set_size: 0.02
output_dir: ./Qwen3.5-27B-v2-SFT-5
sequence_len: 10756
sample_packing: true
load_in_8bit: true
adapter: lora
lora_r: 128
lora_alpha: 128
peft_use_rslora: true
lora_target_modules:
- q_proj
- k_proj
- v_proj
- o_proj
- down_proj
- up_proj
  # Also target the linear attention projections.
  # These use separate in_proj_qkv / in_proj_z / out_proj (Qwen3.5-specific).
- linear_attn.in_proj_qkv
- linear_attn.in_proj_z
- linear_attn.out_proj
wandb_project: Qwen3.5-27B-SFT
wandb_name: Qwen3.5-27B-v2-SFT-5
gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 2
optimizer: adamw_torch_8bit
lr_scheduler: cosine
learning_rate: 1.2e-5
weight_decay: 0.01
warmup_ratio: 0.05
bf16: auto
tf32: true
resume_from_checkpoint:
logging_steps: 1
flash_attention: true
evals_per_epoch: 4
saves_per_epoch: 4
special_tokens:
fsdp_config:
fsdp_version: 2
offload_params: false
cpu_ram_efficient_loading: false
auto_wrap_policy: TRANSFORMER_BASED_WRAP
transformer_layer_cls_to_wrap: Qwen3_5DecoderLayer
state_dict_type: FULL_STATE_DICT
sharding_strategy: FULL_SHARD
reshard_after_forward: true
activation_checkpointing: true