# Gemma 3 4B Null Space Abliterated RP Writer
A creative writing and roleplay fine-tune built on jwest33/gemma-3-4b-it-null-space-abliterated, trained on a curated subset of LimaRP with certain explicit and low-quality content filtered out.
**Note:** This model will produce uncensored outputs. Use responsibly.
## Model Details
This model combines two modifications to the original Gemma 3 4B Instruct:
- Abliteration — Refusal behavior removed via null-space orthogonal projection
- LoRA Fine-tuning — Creative writing and roleplay capabilities enhanced via SFT on curated conversational data
### LoRA Training Configuration
| Parameter | Value |
|---|---|
| LoRA Rank (r) | 8 |
| LoRA Alpha | 8 |
| LoRA Dropout | 0.05 |
| Target Modules | All attention & MLP projections |
| Max Sequence Length | 4096 |
| Effective Batch Size | 8 |
| Learning Rate | 1e-4 |
| LR Scheduler | Cosine |
| Warmup Steps | 10 |
| Max Steps | 200 |
| Optimizer | AdamW 8-bit |
| Training Method | Response-only SFT |
### Target Modules

LoRA adapters are applied to all language-model attention and feed-forward projections:

- `q_proj`, `k_proj`, `v_proj`, `o_proj` (attention)
- `gate_proj`, `up_proj`, `down_proj` (MLP)
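As a sketch, the configuration table above maps onto a Hugging Face PEFT `LoraConfig` roughly as follows. The actual training script is not published with this card; the values simply mirror the table:

```python
from peft import LoraConfig

# Sketch only: values copied from the configuration table above.
lora_config = LoraConfig(
    r=8,                 # LoRA rank
    lora_alpha=8,        # scaling alpha
    lora_dropout=0.05,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",   # attention projections
        "gate_proj", "up_proj", "down_proj",      # MLP projections
    ],
    task_type="CAUSAL_LM",
)
```

Sequence length, batch size, learning-rate schedule, and the AdamW 8-bit optimizer are set in the trainer, not in the adapter config.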
## Dataset
Source: lemonilia/LimaRP
LimaRP is a roleplay-focused dataset converted from raw YAML conversations to ShareGPT format. The following preprocessing was applied:
- Conversations with certain explicit or low-quality content filtered out
- Minimum conversation length enforced (3+ turns)
- Character personas and scenarios prepended to first user message as context
- Strict user/assistant turn alternation for Gemma-3 compatibility
- Response-only training (loss computed only on assistant turns)
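The preprocessing steps above can be sketched as a small filter function. This is an illustration under simplified assumptions (ShareGPT-style role/content dicts); the actual LimaRP conversion script is not published here:

```python
def preprocess(conversation, context):
    """Prepend context to the first user turn and enforce strict
    user/assistant alternation, dropping turns that break it."""
    turns = []
    expected = "user"
    for turn in conversation:
        if turn["role"] != expected:
            continue  # skip turns that violate alternation
        turns.append(dict(turn))  # copy so the input is not mutated
        expected = "assistant" if expected == "user" else "user"
    if len(turns) < 3:
        return None  # minimum conversation length filter (3+ turns)
    turns[0]["content"] = f"[Context: {context}]\n{turns[0]['content']}"
    return turns
```

Response-only training is then handled at collation time by masking the loss on user turns, so gradients flow only through assistant tokens.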
## Usage
Roleplay mode is optional; to trigger it, use the context tag format from the LimaRP dataset by prepending character personas and scenario information to your first user message:
```
[Context: <Character A>'s Persona: <description>
<Character B>'s Persona: <description>
Scenario: <scenario description>
Take the role of <Character A>. Write <Character A>'s responses only.]
<Your message as Character B>
```
This format signals the model to respond in-character as the specified persona, continuing the roleplay scenario turn-by-turn.
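The context tag can also be assembled programmatically. `build_limarp_prompt` below is a hypothetical helper, not part of the dataset tooling; it just reproduces the template shown above:

```python
def build_limarp_prompt(persona_a, persona_b, scenario,
                        name_a, name_b, first_message):
    """Assemble the first user message in the LimaRP context-tag format.
    The model responds as name_a; the user writes as name_b."""
    context = (
        f"[Context: {name_a}'s Persona: {persona_a}\n"
        f"{name_b}'s Persona: {persona_b}\n"
        f"Scenario: {scenario}\n"
        f"Take the role of {name_a}. Write {name_a}'s responses only.]"
    )
    return f"{context}\n{first_message}"
```

The returned string is sent as the first user turn; subsequent turns need no context tag.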
## Base Model: Abliteration Details
The base model (jwest33/gemma-3-4b-it-null-space-abliterated) has refusal behavior removed via orthogonal projection with null-space constraints.
GGUF quantizations of the base model are available at jwest33/gemma-3-4b-it-null-space-abliterated-GGUF.
### Abliteration Techniques
- Winsorization: Clips outlier activations at the 99.5th percentile for cleaner refusal-direction estimation
- Null-Space Projection: Constrains weight updates to the null space of preservation activations
- Preservation Prompts: Generated via Gemma Scope 2 SAE circuit analysis
- Adaptive Weighting: Gaussian-weighted per-layer ablation strength, focusing on middle-to-later layers
- Norm Preservation: Maintains original Frobenius norms after projection
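To illustrate the null-space constraint, here is a minimal NumPy sketch: the ablation edit is multiplied by a projector onto the null space of the preservation activations, so applying the edited weights leaves those activations' outputs unchanged. This follows the AlphaEdit-style idea cited below, not the toolkit's actual implementation, and the interpretation of the 0.90 rank ratio (retaining 90% of the preserved directions) is an assumption:

```python
import numpy as np

def null_space_projector(A):
    """Orthogonal projector onto the null space of the column span of the
    preservation-activation matrix A (d_in x n). A weight edit multiplied
    by this projector annihilates those activations."""
    U, S, _ = np.linalg.svd(A, full_matrices=True)
    r = int(np.sum(S > 1e-10))        # numerical rank of A
    U_span = U[:, :r]                 # orthonormal basis for span(A)
    return np.eye(A.shape[0]) - U_span @ U_span.T

rng = np.random.default_rng(0)
A = rng.normal(size=(16, 5))          # 5 preservation activations in R^16
P = null_space_projector(A)
dW = rng.normal(size=(16, 16))        # unconstrained ablation edit
dW_safe = dW @ P                      # constrained edit: dW_safe @ A ~= 0
```

Norm preservation would then rescale the edited weight matrix back to its original Frobenius norm after the projection is applied.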
| Parameter | Value |
|---|---|
| Harmful Prompts | 5000 |
| Harmless Prompts | 637 |
| Winsorization | 99.5th percentile |
| Null-Space Constraints | rank ratio: 0.90 |
| Directional Multiplier | 1.03 |
| SAE Targeted Coverage | 1.00 |
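The winsorization step can be sketched in NumPy: activation magnitudes are clipped at the 99.5th percentile before the refusal direction is estimated. The mean-difference estimator used here (in the style of Arditi et al., cited below) is an assumption for illustration:

```python
import numpy as np

def winsorize(acts, percentile=99.5):
    """Clip activation magnitudes at the given percentile so outliers
    do not dominate the refusal-direction estimate."""
    limit = np.percentile(np.abs(acts), percentile)
    return np.clip(acts, -limit, limit)

rng = np.random.default_rng(1)
harmful = rng.normal(loc=0.5, size=(1000, 8))   # prompt counts from the table
harmless = rng.normal(loc=0.0, size=(637, 8))
harmful[0, 0] = 1e6                             # outlier that would skew the mean

# Assumed estimator: difference of winsorized means, then normalize.
refusal_dir = winsorize(harmful).mean(axis=0) - winsorize(harmless).mean(axis=0)
refusal_dir /= np.linalg.norm(refusal_dir)
```

Without clipping, the single outlier would dominate the mean and corrupt the estimated direction.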
## Credits
### Fine-tuning
- Training Framework: Unsloth
- Dataset: lemonilia/LimaRP
### Base Model & Abliteration
- Original Model: google/gemma-3-4b-it by Google
- Abliteration Toolkit: github.com/jwest33/abliterator
- SAE Analysis: Gemma Scope by Google DeepMind
## References
- Norm-Preserving Biprojected Abliteration — Jim Lai (2025)
- AlphaEdit: Null-Space Constrained Knowledge Editing — Fang et al. (ICLR 2025)
- Refusal in Language Models Is Mediated by a Single Direction — Arditi et al. (2024)
- Representation Engineering — Zou et al. (2023)
## License
This model inherits the Gemma license from the base model. Please review and comply with Google's usage terms.
## Disclaimer
This model is provided for research and educational purposes. The creators are not responsible for any misuse. Users are solely responsible for ensuring their use complies with applicable laws and ethical standards.