---
base_model: microsoft/Fara-7B
library_name: transformers
license: other
pipeline_tag: text-generation
tags:
- abliteration
- refusal-removal
- uncensored
- research
- qwen2_5_vl
- orthogonalization
---
# Fara-7B Abliterated v2
A refusal-direction-orthogonalized variant of `microsoft/Fara-7B` (Qwen2.5-VL based).
Built using:
- https://github.com/HOLYKEYZ/model-unfetter
## Method
Using harmful and harmless probe sets, residual-stream activations were extracted at each of layers 0–27, and the layer with the strongest refusal direction was selected.
Best layer:
- 13
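The direction search described above can be sketched as follows. This is a minimal illustration, not the code used to build this model: it assumes a difference-of-means direction per layer and scores each layer by the norm of that difference, neither of which is stated in this card. The function name `refusal_directions` and the synthetic activations are hypothetical.

```python
import numpy as np

def refusal_directions(harmful_acts, harmless_acts):
    """Difference-of-means direction per layer (an assumed method).

    Both inputs have shape (n_prompts, n_layers, hidden): one residual-stream
    vector per prompt and layer. Returns unit directions of shape
    (n_layers, hidden) and a per-layer score (norm of the mean difference).
    """
    diff = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
    scores = np.linalg.norm(diff, axis=-1)
    directions = diff / scores[:, None]
    return directions, scores

# Synthetic stand-in for real activations: 28 layers, toy hidden size of 16.
rng = np.random.default_rng(0)
n_layers, hidden = 28, 16
harmless = rng.normal(size=(40, n_layers, hidden))
harmful = rng.normal(size=(40, n_layers, hidden))
# Plant a strong shift at layer 13 along axis 0 so that layer should win.
harmful[:, 13, 0] += 5.0

directions, scores = refusal_directions(harmful, harmless)
best_layer = int(np.argmax(scores))
print("best layer:", best_layer)
```

With real activations, the scoring rule is a design choice. Other criteria, such as how much refusals drop when the candidate direction is ablated, are equally plausible and may select a different layer.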
Orthogonalization was applied in fp32 to:
- `embed_tokens`
- every `self_attn.o_proj`
- every `mlp.down_proj`
Total modified tensors:
- 57
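The count of 57 is consistent with one embedding matrix plus two projections in each of the 28 decoder layers (layers 0–27). The short check below enumerates them. The parameter-name strings are hypothetical placeholders, since the real prefixes depend on the checkpoint layout.

```python
N_LAYERS = 28  # layers 0-27, as listed in the Method section

# Hypothetical parameter names, for counting only.
targets = ["embed_tokens.weight"]
for i in range(N_LAYERS):
    targets.append(f"layers.{i}.self_attn.o_proj.weight")
    targets.append(f"layers.{i}.mlp.down_proj.weight")

print(len(targets))  # 1 + 2 * 28 = 57
```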
Formula (with `r` the refusal direction, taken as a unit vector):
```python
W ← W - r rᵀ W
```
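As a rough illustration of the formula, the sketch below applies the projection to toy matrices with NumPy. It assumes PyTorch-style layouts: a linear weight of shape `(hidden, in_features)`, so `r` is removed from the output side, and an embedding table of shape `(vocab, hidden)`, so `r` is removed from each row. The helper names are hypothetical, and float64 arrays stand in for the fp32 tensors mentioned above.

```python
import numpy as np

def orthogonalize_output(W, r):
    """Remove direction r from a weight whose outputs live in the residual stream.

    W has shape (hidden, in_features), as in a PyTorch nn.Linear weight.
    Computes W - r r^T W with r normalized to unit length.
    """
    r = r / np.linalg.norm(r)
    return W - np.outer(r, r @ W)

def orthogonalize_embedding(E, r):
    """Same projection for an embedding table of shape (vocab, hidden).

    Each row is a residual-stream vector, so r is projected out of the rows:
    E - (E r) r^T.
    """
    r = r / np.linalg.norm(r)
    return E - np.outer(E @ r, r)

rng = np.random.default_rng(0)
hidden, in_features, vocab = 8, 12, 20
r = rng.normal(size=hidden)
W = rng.normal(size=(hidden, in_features))
E = rng.normal(size=(vocab, hidden))

W_new = orthogonalize_output(W, r)
E_new = orthogonalize_embedding(E, r)

# After the edit, neither tensor can write anything along r
# (both values should be at floating-point noise level).
print(np.abs(r @ W_new).max(), np.abs(E_new @ r).max())
```

Applying the projection a second time leaves the weights unchanged, which is a quick sanity check that the direction has been fully removed.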
## Results
Held-out harmful evaluation set:
- Original Fara-7B: 5/160 compliance (~3.1%)
- Abliterated v2: 158/160 compliance (98.75%)
Held-out refusal probe:
- Before: 155/160 refusals
- After: 2/160 refusals
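The rates can be recomputed from the raw counts. The dictionary keys below are labels chosen for this example only.

```python
# (hits, total) pairs copied from the Results section.
results = {
    "original compliance": (5, 160),
    "abliterated compliance": (158, 160),
    "refusals before": (155, 160),
    "refusals after": (2, 160),
}

rates = {name: 100 * hits / total for name, (hits, total) in results.items()}
for name, pct in rates.items():
    print(f"{name}: {pct:.3f}%")
```

The compliance and refusal counts sum to 160 in both conditions (5 + 155 and 158 + 2), which is what one would expect if both tallies were taken over the same 160 prompts.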
## Notes
- fp32 surgery used to avoid precision issues from v1
- edits applied only to the language tower
- held-out evaluation set was separate from the layer-selection probe set
Research artifact only. Use responsibly and follow the upstream Fara/Qwen license terms.