Model Card for ealexeev/TheDrummer-Cydonia-24B-v4.3-NVFP4

This is an NVFP4 quantization of TheDrummer/Cydonia-24B-v4.3.

Quantization Details

Used https://github.com/ealexeev/llm-quantization script.

Calibration dataset size: 512 Calibration data:

  • HuggingFaceH4/ultrachat_200k
  • allenai/c4_en
  • mrcedric98/fiction_books_v8

These were shuffled and mixed at a ratio of 3:2:3

Procedure

python ./quantize_nvfp4.py --model TheDrummer/Cydonia-24B-v4.3 --output ./TheDrummer/Cydonia-24B-v4.3-NVFP4 --size 512 --seed 42 --ultra_chat 3 --c4_en 2 --fiction_v8 3

I ran a grid search of calibration samples (32, 64, 128, 256, 512, 1024, 4096). While lower sample counts (128/256) improved instruction following, they significantly degraded the model's ability to handle nuance (Winogrande). This 512-sample version was the only one that fully recovered the ambiguity resolution capabilities of the base model, making it the best choice for this creative/roleplay architecture.

Quantization Evals

Metric Base Model (BF16) NVFP4 (Quantized) Delta
Winogrande (Ambiguity Resolution) 77.19% 77.27% +0.08%
HellaSwag (Flow/Common Sense) 84.19% 83.16% -1.03%
Lambada (Perplexity) 2.78 2.88 +0.10
IFEval (Strict Instruction Following) 53.97% 53.23% -0.74%
ARC Challenge (Logic/Reasoning) 68.43% 65.78% -2.65%

Note: It almost matches the base model in Nuance and Perplexity (the "vibes" metrics), trading off a small amount of raw logic.

Bias, Risks, and Limitations

This is a quantization of a creative/roleplay fine-tune. It is optimized for fiction, dialogue, etc. It is not going to do exactly what it is told or win your coding competition for you.

How To Use

vllm serve ealexeev/TheDrummer-Cydonia-24B-v4.3-NVFP4 \
    --tensor-parallel-size 1 \      # Fits on 1x GPU (24GB+)
    --gpu-memory-utilization 0.8 \  # Leave room for KV Cache
Downloads last month
9
Safetensors
Model size
14B params
Tensor type
BF16
·
F32
·
F8_E4M3
·
U8
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ealexeev/TheDrummer-Cydonia-24B-v4.3-NVFP4

Datasets used to train ealexeev/TheDrummer-Cydonia-24B-v4.3-NVFP4