Llama-3.1-8B-Instruct-heretic-GGUF

GGUF quantized versions of p-e-w/Llama-3.1-8B-Instruct-heretic from "The Bestiary" collection.

Model Description

This is a GGUF conversion of Llama-3.1-8B-Instruct-heretic, an abliterated (uncensored) version of Meta's Llama 3.1 8B Instruct model. Abliteration removes the model's refusal mechanisms, making it more willing to engage with any prompt.

Original model: p-e-w/Llama-3.1-8B-Instruct-heretic
Collection: The Bestiary by p-e-w

Quantization Formats

This repository contains 4 quantization levels:

File                                        Size    Description            Use Case
llama-3.1-8b-instruct-heretic-f16.gguf      15 GB   Full 16-bit precision  Best quality, highest memory usage
llama-3.1-8b-instruct-heretic-Q8_0.gguf     8.0 GB  8-bit quantization     High quality, good balance
llama-3.1-8b-instruct-heretic-Q5_K_M.gguf   5.4 GB  5-bit quantization     Balanced quality/size
llama-3.1-8b-instruct-heretic-Q4_K_M.gguf   4.6 GB  4-bit quantization     Smallest size, good quality

Recommended: Q4_K_M for most users (best balance of quality and size)
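As a rule of thumb, pick the highest-quality quantization that fits in your available RAM or VRAM with headroom for the KV cache and runtime buffers. A minimal Python sketch of that decision, using the file sizes from the table above (the ~2 GB overhead figure is an assumption, not a measurement):

```python
# File sizes (GB) from the quantization table above.
QUANT_SIZES_GB = {
    "f16": 15.0,
    "Q8_0": 8.0,
    "Q5_K_M": 5.4,
    "Q4_K_M": 4.6,
}

def pick_quant(available_gb: float, overhead_gb: float = 2.0) -> str:
    """Return the highest-quality quant that fits, leaving room for
    KV cache and runtime buffers (overhead_gb is a rough assumption)."""
    for name in ("f16", "Q8_0", "Q5_K_M", "Q4_K_M"):
        if QUANT_SIZES_GB[name] + overhead_gb <= available_gb:
            return name
    raise ValueError("Not enough memory for any quantization level")

print(pick_quant(8.0))  # -> Q5_K_M
```

For example, a machine with 8 GB free would land on Q5_K_M, while 16 GB or more comfortably fits Q8_0.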

Usage

With Ollama

  1. Download the GGUF file you want to use
  2. Create a Modelfile:
FROM ./llama-3.1-8b-instruct-heretic-Q4_K_M.gguf

TEMPLATE """{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""

PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
PARAMETER temperature 0.7
PARAMETER top_p 0.9
PARAMETER num_ctx 8192
  3. Import to Ollama:
ollama create llama-3.1-8b-heretic:Q4_K_M -f Modelfile
  4. Run:
ollama run llama-3.1-8b-heretic:Q4_K_M
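Once imported, the model can also be called programmatically through Ollama's local REST API. A minimal stdlib-only sketch, assuming the Ollama server is running on its default port and the model was created with the tag shown above:

```python
# Sketch of a non-streaming request to Ollama's /api/generate endpoint.
# Assumes `ollama serve` is running locally and the model tag below
# matches the one created in step 3.
import json
import urllib.request

def build_request(prompt: str, model: str = "llama-3.1-8b-heretic:Q4_K_M") -> dict:
    """Build the JSON payload for a single non-streaming generation."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str) -> str:
    payload = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# generate("Why is the sky blue?")  # requires a running Ollama server
```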

With llama.cpp

./llama-cli -m llama-3.1-8b-instruct-heretic-Q4_K_M.gguf -p "Your prompt here" -n 512

With Open WebUI

Once imported to Ollama, the model will automatically appear in the Open WebUI model dropdown.

Conversion Details

  • Converted using: llama.cpp (latest)
  • Conversion date: 2025-11-21
  • Base format: FP16 GGUF
  • Quantization method: llama-quantize
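The pipeline above can be sketched as the two standard llama.cpp steps: convert the HF checkpoint to an FP16 GGUF with convert_hf_to_gguf.py, then quantize with llama-quantize. A Python helper that builds those command lines (script paths and output names are illustrative and assume a local llama.cpp checkout; they are not taken from this repo's actual build):

```python
# Builds (but does not run) the convert + quantize command lines used
# in a typical llama.cpp GGUF pipeline. Paths are assumptions.
def conversion_commands(model_dir: str, out_prefix: str,
                        quants=("Q8_0", "Q5_K_M", "Q4_K_M")):
    """Return the command lines as argument lists, FP16 conversion first."""
    f16 = f"{out_prefix}-f16.gguf"
    cmds = [["python", "convert_hf_to_gguf.py", model_dir,
             "--outfile", f16, "--outtype", "f16"]]
    for q in quants:
        cmds.append(["./llama-quantize", f16, f"{out_prefix}-{q}.gguf", q])
    return cmds

for cmd in conversion_commands("p-e-w/Llama-3.1-8B-Instruct-heretic",
                               "llama-3.1-8b-instruct-heretic"):
    print(" ".join(cmd))
```

Each quantized file is derived from the same FP16 GGUF, which keeps the quants consistent with one another.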

Important Note

This is an uncensored model with refusal mechanisms removed. Use responsibly and in accordance with applicable laws and regulations.

License

Inherits the Llama 3.1 Community License from the base model.

Credits

  • Original model: Meta (Llama 3.1)
  • Abliteration: p-e-w (The Bestiary)
  • GGUF conversion: cybrown