I would be interested in some more NVFP4 Quants

by DigitalSpellcaster - opened 26 days ago

Discussion

DigitalSpellcaster

26 days ago

Thanks for the quant. I am interested in exploring more.

Firworks

Owner 25 days ago

Are there any specific models you're interested in that don't currently have NVFP4 quants?

DigitalSpellcaster

24 days ago

First off, thank you for even entertaining this request, you're an absolute legend. I appreciate you helping me unlock the power of this Blackwell silicon (5060 Ti 16GB).

To avoid quality loss, I’d like these to be quantized directly from the source weights rather than from an existing quant. Here are the repos:

❖ 21B MoE: DavidAU/L3.2-8X4B-MOE-V2-Dark-Champion-Inst-21B-uncen-ablit

❖ 16.5B Doom: DavidAU/L3-DARKEST-PLANET-16.5B

❖ 16B Heretic: DavidAU/L3-Darkest-Planet-16B-HERETIC-Uncensored-Abliterated

For the technical side, I’m hoping for:
❖ Format: E2M1 (Standard NVFP4).
❖ Block Size: 16 (for that Blackwell sweet spot).
❖ Scaling: If you can do FP8 (E4M3) micro-scaling with an FP32 tensor scale, that would be the dream.
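For anyone following along, the layout requested above can be sketched in a few lines of plain Python. This is only an illustrative simulation under stated assumptions, not the quantization pipeline used for any of the uploaded models: the function names (`round_e4m3`, `quantize_nvfp4`, `dequantize_nvfp4`) are made up for this example, values are kept as ordinary floats instead of packed 4-bit codes, and the scales are derived from simple absolute maxima, whereas real tools may calibrate them differently.

```python
import math

# Magnitudes representable by E2M1 (1 sign bit, 2 exponent bits, 1 mantissa bit).
E2M1_VALUES = (0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0)
BLOCK = 16          # elements sharing one FP8 block scale
E4M3_MAX = 448.0    # largest finite FP8 E4M3 value


def round_e4m3(x):
    """Round a non-negative float to a nearby FP8 E4M3 value, saturating at 448."""
    if x <= 0.0:
        return 0.0
    x = min(x, E4M3_MAX)
    _, e = math.frexp(x)            # x = m * 2**e with 0.5 <= m < 1
    exp = max(e - 1, -6)            # subnormals share the minimum exponent
    step = 2.0 ** (exp - 3)         # 3 mantissa bits
    return min(round(x / step) * step, E4M3_MAX)


def quantize_nvfp4(weights):
    """Return (E2M1 values, per-block E4M3 scales, FP32-style tensor scale)."""
    amax = max(abs(w) for w in weights)
    # Assumption: tensor scale chosen so the largest block scale maps to 448.
    tensor_scale = amax / (6.0 * E4M3_MAX) if amax > 0 else 1.0
    codes, block_scales = [], []
    for start in range(0, len(weights), BLOCK):
        block = weights[start:start + BLOCK]
        block_amax = max(abs(w) for w in block)
        scale = round_e4m3(block_amax / 6.0 / tensor_scale)
        block_scales.append(scale)
        denom = scale * tensor_scale
        for w in block:
            mag = abs(w) / denom if denom > 0 else 0.0
            # Nearest representable magnitude (ties go to the smaller value here).
            q = min(E2M1_VALUES, key=lambda v: abs(v - mag))
            codes.append(math.copysign(q, w))
    return codes, block_scales, tensor_scale


def dequantize_nvfp4(codes, block_scales, tensor_scale):
    """Reconstruct approximate weights from the three stored components."""
    return [c * block_scales[i // BLOCK] * tensor_scale
            for i, c in enumerate(codes)]


weights = [i / 10 - 1.6 for i in range(32)]
codes, block_scales, tensor_scale = quantize_nvfp4(weights)
restored = dequantize_nvfp4(codes, block_scales, tensor_scale)
print(max(abs(a - b) for a, b in zip(weights, restored)))
```

The storage cost follows from the layout: each weight takes 4 bits, plus one 8-bit scale per 16 weights, so roughly 4.5 bits per weight, with the single FP32 tensor scale adding a negligible amount.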

Thanks in advance for anything you can do, and just for reading my request.

Firworks

Owner 24 days ago

I ran the two that I could.

https://huggingface.co/Firworks/L3-Darkest-Planet-16B-HERETIC-Uncensored-Abliterated-nvfp4
https://huggingface.co/Firworks/L3-DARKEST-PLANET-16.5B-nvfp4

The model card made them sound pretty specialized and strange. I wasn't sure exactly the right way to run them, but I did test both in vLLM to verify they respond with coherent text. I was getting ~20 tok/s on the DGX Spark. You should see faster speeds than that on the 5060 Ti.

I couldn't run DavidAU/L3.2-8X4B-MOE-V2-Dark-Champion-Inst-21B-uncen-ablit, as I can't find it in a non-GGUF format.

DigitalSpellcaster

21 days ago

•

edited 21 days ago

Thank you very much! Having such high-fidelity models in that specific format is going to sing on my Blackwell GPU. I really appreciate the work you've done. I understand if you couldn't find the final one, so I'll do some digging.
