# Model Card for ealexeev/TheDrummer-Cydonia-24B-v4.3-NVFP4
This is an NVFP4 quantization of TheDrummer/Cydonia-24B-v4.3.
## Quantization Details
Quantized with the script from https://github.com/ealexeev/llm-quantization.
Calibration dataset size: 512

Calibration data:
- HuggingFaceH4/ultrachat_200k
- allenai/c4_en
- mrcedric98/fiction_books_v8

These datasets were shuffled and mixed at a ratio of 3:2:3.
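The mixing step can be sketched as a weighted, seeded draw followed by a shuffle. This is a minimal illustration, not the actual selection logic in the quantization script; the toy stand-in datasets and the `mix_calibration` helper are hypothetical:

```python
import random

def mix_calibration(sources: dict, weights: dict, total: int, seed: int = 42) -> list:
    """Draw `total` samples from `sources` in proportion to `weights`, then shuffle."""
    rng = random.Random(seed)
    weight_sum = sum(weights.values())
    mixed = []
    for name, samples in sources.items():
        share = round(total * weights[name] / weight_sum)  # e.g. 512 * 3/8 = 192
        mixed.extend(rng.sample(samples, share))
    rng.shuffle(mixed)
    return mixed

# Toy stand-ins for the three real datasets:
sources = {
    "ultrachat": [f"chat-{i}" for i in range(1000)],
    "c4_en": [f"web-{i}" for i in range(1000)],
    "fiction_v8": [f"book-{i}" for i in range(1000)],
}
calib = mix_calibration(sources, {"ultrachat": 3, "c4_en": 2, "fiction_v8": 3}, total=512)
```

With a 3:2:3 weighting over 512 samples, this yields 192/128/192 samples per source.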
### Procedure
```shell
python ./quantize_nvfp4.py \
  --model TheDrummer/Cydonia-24B-v4.3 \
  --output ./TheDrummer/Cydonia-24B-v4.3-NVFP4 \
  --size 512 --seed 42 \
  --ultra_chat 3 --c4_en 2 --fiction_v8 3
```
I ran a grid search over calibration sample counts (32, 64, 128, 256, 512, 1024, 4096). While lower counts (128/256) improved instruction following, they significantly degraded the model's handling of nuance (Winogrande). The 512-sample version was the only one that fully recovered the base model's ambiguity-resolution capability, making it the best choice for this creative/roleplay fine-tune.
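The sweep amounts to re-running the quantization command once per calibration size. A sketch of the invocations it generates, reusing the flags from the procedure above (the per-size output directory suffix is hypothetical):

```python
SIZES = [32, 64, 128, 256, 512, 1024, 4096]

def sweep_commands(model: str = "TheDrummer/Cydonia-24B-v4.3") -> list:
    """Build one quantize_nvfp4.py invocation per calibration sample count."""
    cmds = []
    for n in SIZES:
        cmds.append([
            "python", "./quantize_nvfp4.py",
            "--model", model,
            "--output", f"./TheDrummer/Cydonia-24B-v4.3-NVFP4-{n}",
            "--size", str(n), "--seed", "42",
            "--ultra_chat", "3", "--c4_en", "2", "--fiction_v8", "3",
        ])
    return cmds

for cmd in sweep_commands():
    print(" ".join(cmd))  # run with subprocess.run(cmd, check=True) when ready
```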
## Quantization Evals
| Metric | Base Model (BF16) | NVFP4 (Quantized) | Delta |
|---|---|---|---|
| Winogrande (Ambiguity Resolution) | 77.19% | 77.27% | +0.08% |
| HellaSwag (Flow/Common Sense) | 84.19% | 83.16% | -1.03% |
| Lambada (Perplexity) | 2.78 | 2.88 | +0.10 |
| IFEval (Strict Instruction Following) | 53.97% | 53.23% | -0.74% |
| ARC Challenge (Logic/Reasoning) | 68.43% | 65.78% | -2.65% |
Note: the quantized model nearly matches the base model on nuance and perplexity (the "vibes" metrics), trading off a small amount of raw logic.
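As a quick sanity check, the Delta column follows directly from the two measurement columns (values copied from the table above; percentage deltas are in percentage points):

```python
# (metric, base BF16, NVFP4 quantized)
ROWS = [
    ("Winogrande", 77.19, 77.27),
    ("HellaSwag", 84.19, 83.16),
    ("Lambada (perplexity)", 2.78, 2.88),
    ("IFEval", 53.97, 53.23),
    ("ARC Challenge", 68.43, 65.78),
]

# Delta = quantized minus base, rounded to two decimals as in the table.
deltas = {name: round(quant - base, 2) for name, base, quant in ROWS}
```

Note that for Lambada, lower is better, so the +0.10 delta is a slight regression like the others.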
## Bias, Risks, and Limitations
This is a quantization of a creative/roleplay fine-tune. It is optimized for fiction, dialogue, and similar generation; it will not follow instructions to the letter or win your coding competition for you.
## How To Use
```shell
# Fits on a single 24GB+ GPU; 0.8 utilization leaves headroom for the KV cache.
vllm serve ealexeev/TheDrummer-Cydonia-24B-v4.3-NVFP4 \
  --tensor-parallel-size 1 \
  --gpu-memory-utilization 0.8
```
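Once vLLM is serving, the model speaks the OpenAI-compatible API. A minimal client sketch; the port assumes vLLM's default of 8000, and the prompt and sampling parameters are illustrative:

```python
import json
import urllib.request

def build_payload(prompt: str) -> dict:
    """Assemble a chat-completions request body for the served model."""
    return {
        "model": "ealexeev/TheDrummer-Cydonia-24B-v4.3-NVFP4",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
        "temperature": 0.8,  # creative model; a warmer default suits its use case
    }

def chat(prompt: str, base_url: str = "http://localhost:8000/v1") -> str:
    """Send one request to the vLLM OpenAI-compatible server and return the reply."""
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("Describe a rainy harbor town in two sentences."))
```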
## Base Model
mistralai/Mistral-Small-3.1-24B-Base-2503