Qwopus3.5-27B-v3-Abliterated-TQ3_4S

TQ3_4S is a 3.5-bit Walsh-Hadamard-transform weight format that stores four scales, one per 8-weight sub-block, in each 32-weight block.
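As a back-of-the-envelope storage sketch (the scale width is an assumption here — this card does not specify how the four sub-block scales are encoded), the per-block byte count works out as:

```python
# Rough size of one TQ3_4S block: 32 weights at 3.5 bits each,
# plus four sub-block scales. fp16 (2-byte) scales are an ASSUMPTION,
# not something this model card states.
WEIGHTS_PER_BLOCK = 32
BITS_PER_WEIGHT = 3.5
SCALES_PER_BLOCK = 4
SCALE_BYTES = 2  # assumed fp16 scale

payload_bytes = WEIGHTS_PER_BLOCK * BITS_PER_WEIGHT / 8  # 14.0 bytes of packed weights
scale_bytes = SCALES_PER_BLOCK * SCALE_BYTES             # 8 bytes of scales
block_bytes = payload_bytes + scale_bytes                # 22.0 bytes per block
bits_per_weight_effective = block_bytes * 8 / WEIGHTS_PER_BLOCK
print(block_bytes, bits_per_weight_effective)
```

Under that assumption the scales add meaningful overhead: the effective rate is 5.5 bits per weight, not the nominal 3.5.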

This release is a TQ3_4S GGUF quantization of croll83/Qwopus3.5-27B-v3-Abliterated, derived from the Qwen3.5-27B family.

Quantization Source

  • HF source checkout:
    • croll83/Qwopus3.5-27B-v3-Abliterated
  • upstream family:
    • Qwen/Qwen3.5-27B
  • F16 GGUF used as the quantization source:
    • Qwopus3.5-27B-v3-Abliterated-f16.gguf

Quantized with:

./build/bin/llama-quantize --pure \
  /path/to/Qwopus3.5-27B-v3-Abliterated-f16.gguf \
  /path/to/Qwopus3.5-27B-v3-Abliterated-TQ3_4S.gguf \
  TQ3_4S \
  16
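The positional arguments to llama-quantize are the input GGUF, the output GGUF, the target type, and the thread count. After quantizing, one quick sanity check on the output file is to verify the GGUF magic bytes; the helper below is a hypothetical sketch, not part of llama.cpp:

```python
import struct

def is_gguf(path: str) -> bool:
    """Return True if the file starts with the 4-byte GGUF magic.

    Per the GGUF spec, the magic b"GGUF" is followed by the format
    version as a little-endian uint32.
    """
    with open(path, "rb") as f:
        if f.read(4) != b"GGUF":
            return False
        version = struct.unpack("<I", f.read(4))[0]
        print(f"GGUF version {version}")
        return True
```

This only checks the header; it does not confirm the tensors are actually TQ3_4S, which still requires a runtime with TQ3_4S support.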

Runtime Validation

Validated against the public llama.cpp-tq3 runtime:

  • runtime repo:
    • turbo-tan/llama.cpp-tq3
  • no-thinking server smoke test:
    • runtime: llama-server --reasoning off
    • prompt: Write ONLY the word ok.
    • response: ok

Example

./build/bin/llama-simple-chat \
  -m /path/to/Qwopus3.5-27B-v3-Abliterated-TQ3_4S.gguf \
  -ngl 99 -c 2048

Server example:

./build/bin/llama-server \
  -m /path/to/Qwopus3.5-27B-v3-Abliterated-TQ3_4S.gguf \
  --host 127.0.0.1 --port 8080 \
  -ngl 99 -c 8192 -np 1 \
  -ctk q8_0 -ctv q8_0 -fa on \
  --no-warmup --jinja
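With the server above running, the endpoint can be exercised with an OpenAI-compatible chat request (llama-server exposes the standard /v1/chat/completions route; the exact response text will vary):

```shell
curl http://127.0.0.1:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [{"role": "user", "content": "Write ONLY the word ok."}],
        "max_tokens": 8
      }'
```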

Notes

  • This is a weight-only quantization release.
  • The source repo includes Qwopus3.5-27B-v3-Abliterated-mmproj.gguf.
  • Upload mmproj.gguf alongside this model if you want to preserve the same multimodal packaging.
  • Running this GGUF requires TQ3_4S runtime support from turbo-tan/llama.cpp-tq3.
