Qwopus3.5-27B-v3-Abliterated-TQ3_4S

TQ3_4S is a 3.5-bit Walsh-Hadamard-transform weight format that stores four scales, one per 8-weight sub-block, in each 32-weight block.
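As a back-of-the-envelope storage sketch (the scale width is an assumption here — this card does not specify how the four sub-block scales are encoded), the per-block byte count works out as:

```python
# Rough size of one TQ3_4S block: 32 weights at 3.5 bits each,
# plus four sub-block scales. fp16 (2-byte) scales are an ASSUMPTION,
# not something this model card states.
WEIGHTS_PER_BLOCK = 32
BITS_PER_WEIGHT = 3.5
SCALES_PER_BLOCK = 4
SCALE_BYTES = 2  # assumed fp16 scale

payload_bytes = WEIGHTS_PER_BLOCK * BITS_PER_WEIGHT / 8  # 14.0 bytes of packed weights
scale_bytes = SCALES_PER_BLOCK * SCALE_BYTES             # 8 bytes of scales
block_bytes = payload_bytes + scale_bytes                # 22.0 bytes per block
bits_per_weight_effective = block_bytes * 8 / WEIGHTS_PER_BLOCK
print(block_bytes, bits_per_weight_effective)
```

Under that assumption the scales add meaningful overhead: the effective rate is 5.5 bits per weight, not the nominal 3.5.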

This release is a TQ3_4S GGUF quantization of croll83/Qwopus3.5-27B-v3-Abliterated, derived from the Qwen3.5-27B family.

Quantization Source

  • HF source checkout:
    • croll83/Qwopus3.5-27B-v3-Abliterated
  • upstream family:
    • Qwen/Qwen3.5-27B
  • F16 GGUF used as the quantization source:
    • Qwopus3.5-27B-v3-Abliterated-f16.gguf

Quantized with:

./build/bin/llama-quantize --pure \
  /path/to/Qwopus3.5-27B-v3-Abliterated-f16.gguf \
  /path/to/Qwopus3.5-27B-v3-Abliterated-TQ3_4S.gguf \
  TQ3_4S \
  16
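The positional arguments to llama-quantize are the input GGUF, the output GGUF, the target type, and the thread count. After quantizing, one quick sanity check on the output file is to verify the GGUF magic bytes; the helper below is a hypothetical sketch, not part of llama.cpp:

```python
import struct

def is_gguf(path: str) -> bool:
    """Return True if the file starts with the 4-byte GGUF magic.

    Per the GGUF spec, the magic b"GGUF" is followed by the format
    version as a little-endian uint32.
    """
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            return False
        version = struct.unpack("<I", f.read(4))[0]
        print(f"GGUF version {version}")
        return True
```

This only checks the header; it does not confirm the tensors are actually TQ3_4S, which still requires a runtime with TQ3_4S support.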

Runtime Validation

Validated against the public llama.cpp-tq3 runtime:

  • runtime repo:
    • turbo-tan/llama.cpp-tq3
  • no-thinking server smoke test:
    • runtime: llama-server --reasoning off
    • prompt: Write ONLY the word ok.
    • response: ok

Example

./build/bin/llama-simple-chat \
  -m /path/to/Qwopus3.5-27B-v3-Abliterated-TQ3_4S.gguf \
  -ngl 99 -c 2048

Server example:

./build/bin/llama-server \
  -m /path/to/Qwopus3.5-27B-v3-Abliterated-TQ3_4S.gguf \
  --host 127.0.0.1 --port 8080 \
  -ngl 99 -c 8192 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --no-warmup --jinja
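With the server above running, the endpoint can be exercised with an OpenAI-compatible chat request (llama-server exposes the standard /v1/chat/completions route; the exact response text will vary):

```shell
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "Write ONLY the word ok."}],
        "max_tokens": 8
      }'
```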

Notes

  • This is a weight-only quantization release.
  • The source repo includes Qwopus3.5-27B-v3-Abliterated-mmproj.gguf.
  • Upload mmproj.gguf alongside this model if you want to preserve the same multimodal packaging.
  • Running this GGUF requires TQ3_4S runtime support from turbo-tan/llama.cpp-tq3.
