Gliese-Qwen3.5-4B-Abliterated-Caption-NVFP4
Gliese-Qwen3.5-4B-Abliterated-Caption-NVFP4 is an NVFP4-compressed variant of prithivMLmods/Gliese-Qwen3.5-4B-Abliterated-Caption. It uses mixed precision formats (F32 · BF16 · F8_E4M3 · U8) to significantly reduce memory footprint and improve inference efficiency while maintaining strong output quality. The model preserves the original's character, combining refusal-direction analysis with abliterated training strategies to minimize internal refusal behaviors while maximizing descriptive capability and visual understanding. The result is a 4B-parameter vision-language model optimized for highly detailed captions, deep scene understanding, and rich visual descriptions, now with improved deployment efficiency.
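The memory savings from the precision formats above can be put in rough numbers. A back-of-envelope sketch (parameter count and bit-widths are approximate; activation and KV-cache memory are excluded):

```python
# Rough weight-memory estimate for a ~4B-parameter model
# at the precisions listed above.
PARAMS = 4e9  # approximate parameter count

def weight_gb(bits_per_param):
    # bytes = params * bits / 8; convert to GB
    return PARAMS * bits_per_param / 8 / 1e9

for fmt, bits in [("F32", 32), ("BF16", 16), ("F8_E4M3", 8), ("NVFP4", 4)]:
    print(f"{fmt:8s} ~{weight_gb(bits):5.1f} GB")
```

Going from BF16 (~8 GB of weights) to 4-bit storage (~2 GB) is what makes local, single-GPU deployment practical for this model.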
This model is intended for research and learning purposes only. Due to reduced internal refusal mechanisms, it may generate sensitive or unfiltered content. Users assume full responsibility for how the model is used. The authors and hosting platform disclaim any liability for generated outputs.
Key Highlights
- NVFP4 Compression – Utilizes mixed precision (F32 · BF16 · F8_E4M3 · U8) to reduce VRAM usage and accelerate inference.
- Advanced Refusal Direction Analysis – Uses targeted activation analysis to identify and mitigate refusal directions within the model's latent space.
- Abliterated Caption Training – Fine-tuned for unfiltered and highly detailed caption generation, enabling comprehensive visual descriptions.
- Optimized Visual Understanding – Enhanced to produce rich, context-aware descriptions of scenes, objects, people, and environments.
- 4B Parameter Architecture – Built on Qwen3.5-4B, delivering strong multimodal reasoning with efficient deployment.
- High-Fidelity Caption Generation – Designed for long-form, structured, and semantically rich captions suitable for dataset generation and research.
- Efficient Deployment – Ideal for caption dataset creation, multimodal research, and local inference pipelines with reduced hardware requirements.
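To illustrate the refusal-direction idea above: a minimal toy sketch, assuming refusal behavior shows up as a consistent direction in activation space that can be estimated from mean activations and projected out. The tensors here are synthetic; this is not the model's actual training procedure.

```python
import torch

def refusal_direction(h_refuse, h_comply):
    # Difference of mean activations on refusal-inducing vs. neutral
    # prompts, normalized to a unit vector.
    d = h_refuse.mean(dim=0) - h_comply.mean(dim=0)
    return d / d.norm()

def project_out(h, d):
    # Remove each activation's component along the unit direction d.
    return h - (h @ d).unsqueeze(-1) * d

torch.manual_seed(0)
axis = torch.tensor([1.0, 0.0, 0.0])
h_comply = torch.randn(8, 3)
h_refuse = torch.randn(8, 3) + 5.0 * axis  # shifted along a "refusal" axis

d = refusal_direction(h_refuse, h_comply)
h_clean = project_out(h_refuse, d)
print((h_clean @ d).abs().max().item())  # near zero: component removed
```

After projection, the activations carry essentially no component along the estimated refusal direction, which is the core of the abliteration approach.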
Quick Start with Transformers
pip install transformers==5.4.0
# or
pip install git+https://github.com/huggingface/transformers.git
from transformers import Qwen3_5ForConditionalGeneration, AutoProcessor
from PIL import Image
import torch

model = Qwen3_5ForConditionalGeneration.from_pretrained(
    "prithivMLmods/Gliese-Qwen3.5-4B-Abliterated-Caption-NVFP4",
    torch_dtype="auto",
    device_map="auto"
)
processor = AutoProcessor.from_pretrained(
    "prithivMLmods/Gliese-Qwen3.5-4B-Abliterated-Caption-NVFP4"
)

# Load the image to caption (replace with your own image path)
image = Image.open("your_image.jpg")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "Describe this image in extreme detail."}
        ],
    }
]
text = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(
    text=[text],
    images=[image],
    padding=True,
    return_tensors="pt"
).to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=512)
# Trim the prompt tokens so only the newly generated caption is decoded
generated_ids_trimmed = [
    out_ids[len(in_ids):] for in_ids, out_ids in zip(inputs.input_ids, generated_ids)
]
output_text = processor.batch_decode(
    generated_ids_trimmed,
    skip_special_tokens=True,
    clean_up_tokenization_spaces=False
)
print(output_text[0])
Intended Use
- High-Detail Image Captioning – Generating extremely descriptive captions for images.
- Dataset Generation – Creating large-scale caption datasets for multimodal training.
- Vision-Language Research – Studying multimodal reasoning and captioning behavior.
- Annotation Automation – Assisting in automatic labeling and visual description tasks.
- Efficient Multimodal Deployment – Running captioning models with reduced VRAM requirements.
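For the dataset-generation use case, the quick-start call can be wrapped in a simple loop that writes one JSON record per image. A minimal sketch; the `caption` stub stands in for the model call shown in Quick Start, and the file layout is illustrative:

```python
import json
import pathlib

def caption(image_path):
    # Stand-in for the Quick Start pipeline: load the image, build the
    # chat prompt, call model.generate, and decode the caption.
    return f"caption for {image_path.name}"

def build_dataset(image_dir, out_file):
    # Walk an image directory and write one {"image", "caption"} record
    # per line -- a common JSONL layout for caption datasets.
    with open(out_file, "w") as f:
        for p in sorted(pathlib.Path(image_dir).glob("*.jpg")):
            record = {"image": p.name, "caption": caption(p)}
            f.write(json.dumps(record) + "\n")
```

JSONL keeps the dataset streamable and append-friendly, which matters when captioning large image collections in batches.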
Limitations & Risks
Important Note: This model intentionally minimizes built-in refusal mechanisms.
- Unfiltered Outputs – May generate explicit or controversial captions depending on input images.
- User Responsibility – Outputs must be handled within legal and ethical boundaries.
- Compression Trade-offs – NVFP4 may introduce minor precision or consistency degradation in edge cases.
- Model Size Constraints – A 4B model still has limitations compared to larger multimodal systems.
Model tree for prithivMLmods/Gliese-Qwen3.5-4B-Abliterated-Caption-NVFP4
Base model: Qwen/Qwen3.5-4B-Base