Rocinante-X-12B-v1-GPTQ

GPTQ quantized version of TheDrummer/Rocinante-X-12B-v1.

Quantization Details

Parameter	Value
Bits	4
Group Size	128
Desc Act	False
Format	Safetensors
Calibration	WikiText-2 (128 samples)
Original Size	~24 GB (BF16)
Quantized Size	~5.6 GB

About the Original Model

Rocinante-X-12B-v1 is a 12B parameter model by TheDrummer, fine-tuned from Mistral-Nemo-Instruct-2407 for creative writing, roleplay, and entertainment. It prioritizes creativity and usability over pure alignment.

Key strengths:

Pleasant, compelling prose and storytelling
Strong instruction adherence
Handles complex themes maturely
Supports <thinking></thinking> tags
Chat template: Mistral v3 Tekken (recommended)

Usage

With vLLM

python -m vllm.entrypoints.openai.api_server \
    --model Irvollo/Rocinante-X-12B-v1-GPTQ \
    --quantization gptq \
    --dtype float16

With transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Irvollo/Rocinante-X-12B-v1-GPTQ",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Irvollo/Rocinante-X-12B-v1-GPTQ")

Credits

Original model by TheDrummer
Quantized by Irvollo

Downloads last month: 32

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for tacodevs/Rocinante-X-12B-v1-GPTQ

Base model

mistralai/Mistral-Nemo-Base-2407

Finetuned

mistralai/Mistral-Nemo-Instruct-2407

Finetuned

TheDrummer/Rocinante-X-12B-v1

Quantized

(9)

this model