Qwen3.5-4B Uncensored (HauhauCS Aggressive Variant)
This repository provides the Qwen3.5-4B Uncensored (Aggressive) model by HauhauCS, a modified version of the original Qwen3.5-4B language model designed to remove refusal behaviors while preserving the underlying reasoning and generation abilities of the base model.
The goal of this release is to offer a fully functional local language model without alignment-based refusal responses, allowing developers and researchers to experiment freely with prompts and system behaviors.
Model Overview
- Model Name: Qwen3.5-4B Uncensored (HauhauCS Aggressive)
- Base Model: Qwen3.5-4B
- Parameters: ~4 Billion
- Architecture: Transformer-based autoregressive language model
- Maintainer: HauhauCS
- License: Apache-2.0 (inherits from the Qwen base model license)
- Primary Use: Local inference, experimentation with uncensored LLM behavior, prompt engineering, conversational agents.
The model retains the original Qwen3.5 capabilities but removes safety-related refusal behavior, enabling broader prompt responses.
About the Model
This model is part of the HauhauCS uncensored series, which aims to produce models that:
- Maintain the original reasoning and language capabilities
- Avoid response refusals introduced by alignment layers
- Preserve the full functional capacity of the base model
- Provide a lossless uncensored experience for experimentation
According to the model description, the aggressive variant targets minimal refusal behavior, allowing responses to prompts that would typically be blocked in aligned models.
The base Qwen3.5-4B model itself is a compact but capable transformer model designed for efficient inference while maintaining strong performance in reasoning, coding, and conversational tasks.
Aggressive Variant
The Aggressive configuration focuses on minimizing refusal responses.
Characteristics include:
- Reduced safety-aligned filtering
- Maximum response permissiveness
- Minimal prompt refusal behavior
- Designed for research, testing, and experimentation
This variant is intended for environments where developers want maximum prompt responsiveness without guardrail intervention.
Chat Template
The model works well with the ChatML-style conversational format:
<|im_start|>system
You are a helpful AI assistant.
<|im_end|>
<|im_start|>user
{your prompt here}
<|im_end|>
<|im_start|>assistant
Most inference engines such as llama.cpp, KoboldCpp, Ollama, or vLLM can handle this format.
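When an inference engine expects a raw prompt string rather than a structured message list, the ChatML format above can be assembled by hand. The following is a minimal sketch; the helper name and default system prompt are illustrative, not part of the model release:

```python
# Minimal sketch: building a ChatML-formatted prompt string by hand.
# The function name and defaults are assumptions for illustration.

def build_chatml_prompt(user_prompt: str,
                        system_prompt: str = "You are a helpful AI assistant.") -> str:
    """Wrap a system and user message in ChatML markers and open the
    assistant turn so the model continues generation from there."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"
        f"<|im_start|>user\n{user_prompt}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_chatml_prompt("Summarize GGUF quantization in one sentence.")
print(prompt)
```

Note that the prompt deliberately ends with an open `<|im_start|>assistant` turn: the model's completion fills in the assistant message, and generation is typically stopped on the `<|im_end|>` token.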
Key Features & Capabilities
- Strong conversational ability inherited from Qwen models
- Efficient 4B parameter architecture for local inference
- Works well with GGUF quantizations for CPU inference
- Suitable for experimentation with uncensored responses
- Maintains reasoning, coding, and general knowledge capabilities
- Compatible with common LLM runtimes (Transformers, llama.cpp, vLLM)
Intended Use Cases
Possible applications include:
- Local AI assistants: personal chatbots or local AI tools
- Prompt engineering experiments: testing prompt steering without refusal behaviors
- Alignment research: studying the effects of safety layers and refusal mechanisms
- Development environments: building tools that require flexible model responses
- Offline deployments: private inference without reliance on external APIs
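For offline deployment with Ollama, a Modelfile can point at a local GGUF build and reproduce the ChatML template. This is a hedged sketch: the `FROM` path and filename are assumptions, so substitute the quantized GGUF file you actually downloaded:

```
# Hypothetical Ollama Modelfile for a local GGUF build of this model.
# The FROM path below is an assumption -- replace it with the path to
# the quantized GGUF file you downloaded.
FROM ./qwen3.5-4b-uncensored-aggressive.Q4_K_M.gguf

TEMPLATE """<|im_start|>system
{{ .System }}<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""

SYSTEM "You are a helpful AI assistant."
PARAMETER stop "<|im_end|>"
```

The `stop` parameter keeps generation from running past the end of the assistant turn, since ChatML terminates each message with `<|im_end|>`.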
Acknowledgements
Special thanks to:
- Alibaba Qwen Team for developing the Qwen3.5 base models
- HauhauCS for creating the uncensored variant and releasing it for open experimentation
- The open-source LLM ecosystem including llama.cpp, GGUF tooling, and quantization contributors
These projects make local language model experimentation accessible to developers and researchers.
Contact & Support
For questions, feedback, or issues related to this model:
- Visit the Hugging Face repository discussions page
- Open an issue in the model repository
Available Quantizations
GGUF builds are available in 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, 8-bit, and 16-bit precision.
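The bit widths listed above translate directly into rough file sizes for a ~4B-parameter model: size in bytes is approximately parameters times bits per weight divided by 8. A quick back-of-the-envelope calculator (real GGUF files run somewhat larger due to mixed-precision layers and metadata):

```python
# Rough file-size estimates for a ~4B-parameter model at the listed
# GGUF bit widths: size_bytes ~= n_params * bits_per_weight / 8.
# Actual GGUF files are somewhat larger (mixed-precision layers, metadata).

N_PARAMS = 4_000_000_000  # approximate parameter count

def approx_size_gb(bits_per_weight: float, n_params: int = N_PARAMS) -> float:
    """Return the approximate model size in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

for bits in (2, 3, 4, 5, 6, 8, 16):
    print(f"{bits:>2}-bit: ~{approx_size_gb(bits):.1f} GB")
```

By this estimate, the 4-bit build comes to roughly 2 GB and the 16-bit build to roughly 8 GB, which is why the lower-bit quantizations are the usual choice for CPU inference.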