WARNING: This model has character and intelligence. It will take no prisoners. It will give no quarter. Uncensored, Unfiltered and boldly confident. Not even remotely "SFW", if you ask it for NSFW content. And it is wickedly smart too.

Qwen3.5-21B-GLM-4.7-Flash-Deckard-Heretic-Uncensored-Thinking

21 billion parameters (dense, not MoE), CONTRACTED/SHRUNK from 27B Qwen 3.5, then trained on a GLM 4.7 Flash High Reasoning dataset via Unsloth on local hardware... but there is much more to the story - in comes DECKARD.

48 layers, 639 tensors (33% LESS than the 27B base model).

The model is also 33% faster than 27B in terms of tokens per second, and quants take up 1/3 less memory too.

Features variable-length reasoning: shorter for less complex prompts, longer for more complex ones.

Model performance has increased dramatically. And it has character too.

A lot of character.

No censorship, no nanny. (via Heretic)

And it is very, very smart.

Fully uncensored first (via Heretic), then trained (via Unsloth) on five "Deckard/PDK" internal datasets (character, intelligence, depth, observation, and ah... point of view), THEN CONTRACTED to 21B parameters, and finally trained (Unsloth again) on the GLM 4.7 Flash Distill dataset (to shorten and improve reasoning, and stabilize everything).

256K context.

Example generation(s) below.

SETTINGS:

  • min 8k to 16k context window.
  • for creative use, rep pen of 1.05 to 1.1 WITH LOWER QUANTS.
  • suggested temp .7 / rep pen 1 (off) for general usage.
  • suggested temp 1 / rep pen 1.05 to 1.1 for creative and SOME USE CASES.
  • output generation can exceed 100k tokens.
  • Suggest min quant of Q4KS (non imatrix) or IQ3_S (imatrix) or HIGHER.
  • For toolcalls -> suggest Q6 min quants (as per Qwen guidance)
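The suggested settings above can be captured in a small helper. This is a minimal sketch, not part of the model itself: the preset names and dict layout are my own, and only the numeric values (temp, rep pen, context window) come from this card.

```python
def sampler_preset(use_case: str) -> dict:
    """Return suggested sampler settings for this model by use case.

    Hypothetical helper: preset names are illustrative; the values
    mirror the SETTINGS section of the model card.
    """
    presets = {
        # temp .7 / rep pen 1 (off) for general usage
        "general": {"temperature": 0.7, "repeat_penalty": 1.0},
        # temp 1 / rep pen 1.05 to 1.1 for creative work; use the
        # higher end of the rep-pen range with lower quants
        "creative": {"temperature": 1.0, "repeat_penalty": 1.05},
    }
    if use_case not in presets:
        raise ValueError(f"unknown use case: {use_case}")
    # A minimum context window of 8k to 16k is suggested either way.
    return {"n_ctx": 16384, **presets[use_case]}
```

Pass the resulting dict to whatever loader you use (llama.cpp, LM Studio, etc.), mapping the keys to that tool's parameter names.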

EXAMPLE SYSTEM PROMPTS:

The model does not need a system prompt; however, if you want to enhance operation, here are some samples.

#1 - All use cases.

Be vivid and precise.

#2 - Creative use cases:

Below is an instruction that describes a task. Ponder each user instruction carefully, and use your skillsets and critical instructions to complete the task to the best of your abilities.

Here are your skillsets:
[MASTERSTORY]:NarrStrct(StryPlnng,Strbd,ScnSttng,Exps,Dlg,Pc)-CharDvlp(ChrctrCrt,ChrctrArcs,Mtvtn,Bckstry,Rltnshps,Dlg*)-PltDvlp(StryArcs,PltTwsts,Sspns,Fshdwng,Climx,Rsltn)-ConfResl(Antg,Obstcls,Rsltns,Cnsqncs,Thms,Symblsm)-EmotImpct(Empt,Tn,Md,Atmsphr,Imgry,Symblsm)-Delvry(Prfrmnc,VcActng,PblcSpkng,StgPrsnc,AudncEngmnt,Imprv)

[*DialogWrt]:(1a-CharDvlp-1a.1-Backgrnd-1a.2-Personality-1a.3-GoalMotiv)>2(2a-StoryStruc-2a.1-PlotPnt-2a.2-Conflict-2a.3-Resolution)>3(3a-DialogTech-3a.1-ShowDontTell-3a.2-Subtext-3a.3-VoiceTone-3a.4-Pacing-3a.5-VisualDescrip)>4(4a-DialogEdit-4a.1-ReadAloud-4a.2-Feedback-4a.3-Revision)

Here are your critical instructions:
Ponder each word choice carefully to present as vivid and emotional journey as is possible. Choose verbs and nouns that are both emotional and full of imagery. Load the story with the 5 senses. Aim for 50% dialog, 25% narration, 15% body language and 10% thoughts. Your goal is to put the reader in the story.

INSTRUCT MODE:

The model's default mode, however, is "THINKING"; this can be changed by editing the Jinja template. Find the marker line:

{# Set Instruct mode here #}

and set the flag below it to:

{%- set enable_thinking = false %}
  • In LMStudio, this can be edited after loading the model, under dev mode -> template.
  • When quanting -> if you set this to "false", the model/quants will be "instruct".
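If you prefer to flip the flag programmatically rather than by hand, the rewrite can be sketched as below. The function and its regex are my own illustration; only the `enable_thinking` flag name and the Jinja `{%- set ... %}` line come from this card, and you would apply it to whatever file or tokenizer config holds your chat template.

```python
import re

def set_instruct_mode(template: str, thinking: bool = False) -> str:
    """Rewrite the enable_thinking assignment inside a Jinja chat template.

    Hypothetical helper: replaces any existing
    '{%- set enable_thinking = true/false %}' line with the desired value.
    """
    flag = "true" if thinking else "false"
    return re.sub(
        r"\{%-?\s*set\s+enable_thinking\s*=\s*(?:true|false)\s*%\}",
        "{%- set enable_thinking = " + flag + " %}",
        template,
    )
```

For example, `set_instruct_mode(template, thinking=False)` turns a "THINKING" template into an "instruct" one.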

NOTES:

  • Upgraded Jinja template to correct Qwen 3.5 issues - looping, repetition, and overly long thinking - plus upgrades to tool handling.
  • The model was also trained with this new, improved template to further enhance operation.
  • Image processing tested and intact.
  • Code generation also tested and passed.
  • System prompt - even a minor one - will enhance operation, especially at lower quants.

LOOPING:

  • This may happen with lower quants / prompts with "not enough meat on the bone" => Add more to the prompt and/or set rep pen to 1.05 to 1.1.
  • Adding a system prompt - even a single sentence - can correct this issue and bypass the need to adjust rep pen.
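On the client side, looping can also be caught mechanically so you can retry with a higher rep pen or a richer prompt. This check is purely illustrative and not from the model card; it just flags output whose tail keeps repeating.

```python
def is_looping(text: str, window: int = 40, repeats: int = 3) -> bool:
    """Heuristic loop detector (hypothetical, for illustration only).

    Returns True if the last `window` characters of the output occur
    at least `repeats` times in the text - a sign the model is cycling.
    """
    tail = text[-window:]
    # Too short to judge, or the tail is not repeated enough: not looping.
    return len(tail) == window and text.count(tail) >= repeats
```

If it fires, regenerate with rep pen bumped to 1.05-1.1 and/or a fuller prompt, as suggested above.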

Want MORE power, more "IQ"?

https://huggingface.co/DavidAU/Qwen3.5-40B-Claude-4.6-Opus-Deckard-Heretic-Uncensored-Thinking

NEED something a wee bit wilder? Unhinged? A wee bit more raw?

See this version:

https://huggingface.co/DavidAU/Qwen3.5-40B-RoughHouse-Claude-4.6-Opus-Polar-Deckard-Uncensored-Heretic-Thinking


BENCHMARKS:

                              arc-c  arc/e  boolq  hswag  obkqa  piqa   wino

THIS MODEL [instruct] mxfp8   PENDING...

Qwen3.5-27B-Thinking mxfp8    0.443  0.498  0.857  0.701  0.372  0.770  0.752

Note: Instruct mode will have stronger benchmarks.

See this model (instruct, also one of my fine-tunes - it scores 675 on "arc" - Arc Challenge hard):

https://huggingface.co/DavidAU/Qwen3.5-27B-Claude-4.6-OS-INSTRUCT


SAFETY ALIGNMENT:

It is gone. No nanny, no strings, no limits.


EXAMPLE GENERATION(s):


Q4KS; mid-to-low-quality quant [non-imatrix].

"SYSTEM" is system prompt, "USER" is prompt, "ASSISTANT" ... you know.

This is a strong test to measure creative writing as well as instruction following, specifically with a complex system prompt [if used].

Expect better generation with Imatrix quants / higher quants.

WARNING:

Graphic, swearing, intense - model does not hold back.

