Rocinante-X-12B-v1-GPTQ
GPTQ quantized version of TheDrummer/Rocinante-X-12B-v1.
Quantization Details
| Parameter | Value |
|---|---|
| Bits | 4 |
| Group Size | 128 |
| Desc Act | False |
| Format | Safetensors |
| Calibration | WikiText-2 (128 samples) |
| Original Size | ~24 GB (BF16) |
| Quantized Size | ~5.6 GB |
About the Original Model
Rocinante-X-12B-v1 is a 12B parameter model by TheDrummer, fine-tuned from Mistral-Nemo-Instruct-2407 for creative writing, roleplay, and entertainment. It prioritizes creativity and usability over pure alignment.
Key strengths:
- Pleasant, compelling prose and storytelling
- Strong instruction adherence
- Handles complex themes maturely
- Supports
<thinking></thinking>tags - Chat template: Mistral v3 Tekken (recommended)
Usage
With vLLM
python -m vllm.entrypoints.openai.api_server \
--model Irvollo/Rocinante-X-12B-v1-GPTQ \
--quantization gptq \
--dtype float16
With transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
"Irvollo/Rocinante-X-12B-v1-GPTQ",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Irvollo/Rocinante-X-12B-v1-GPTQ")
Credits
- Original model by TheDrummer
- Quantized by Irvollo
- Downloads last month
- 32
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support
Model tree for tacodevs/Rocinante-X-12B-v1-GPTQ
Base model
mistralai/Mistral-Nemo-Base-2407 Finetuned
mistralai/Mistral-Nemo-Instruct-2407 Finetuned
TheDrummer/Rocinante-X-12B-v1