Rocinante-X-12B-v1-GPTQ

GPTQ quantized version of TheDrummer/Rocinante-X-12B-v1.

Quantization Details

Parameter Value
Bits 4
Group Size 128
Desc Act False
Format Safetensors
Calibration WikiText-2 (128 samples)
Original Size ~24 GB (BF16)
Quantized Size ~5.6 GB

About the Original Model

Rocinante-X-12B-v1 is a 12B parameter model by TheDrummer, fine-tuned from Mistral-Nemo-Instruct-2407 for creative writing, roleplay, and entertainment. It prioritizes creativity and usability over pure alignment.

Key strengths:

  • Pleasant, compelling prose and storytelling
  • Strong instruction adherence
  • Handles complex themes maturely
  • Supports <thinking></thinking> tags
  • Chat template: Mistral v3 Tekken (recommended)

Usage

With vLLM

python -m vllm.entrypoints.openai.api_server \
    --model Irvollo/Rocinante-X-12B-v1-GPTQ \
    --quantization gptq \
    --dtype float16

With transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Irvollo/Rocinante-X-12B-v1-GPTQ",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Irvollo/Rocinante-X-12B-v1-GPTQ")

Credits

Downloads last month
32
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for tacodevs/Rocinante-X-12B-v1-GPTQ