I think the claim that NVFP4 is superior to INT4 still needs more evidence.
I've run some tests on Qwen3 4B comparing the two at the same scale format used on Blackwell (one FP8 E4M3 scale per 16 elements): block-scaled INT4 achieved better KL divergence, while NVFP4 did slightly better on benchmarks like PIQA and HellaSwag.
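For concreteness, here's roughly what the block-scaled INT4 scheme I tested looks like (a minimal NumPy sketch; the E4M3 rounding is approximate, ignoring subnormals and NaN encoding, and all names are mine, not from any library):

```python
import numpy as np

def round_to_e4m3(x):
    # Approximate round-to-nearest FP8 E4M3 (3 stored mantissa bits, max 448).
    # Not bit-exact: subnormals and saturation are handled only roughly.
    m, e = np.frexp(x)               # x = m * 2**e, |m| in [0.5, 1)
    m = np.round(m * 16.0) / 16.0    # keep 4 significant bits (1 implicit + 3 stored)
    return np.clip(np.ldexp(m, e), -448.0, 448.0)

def quantize_int4_blockscaled(w, block=16):
    # Symmetric INT4 (codes in [-7, 7]) with one FP8 E4M3 scale per `block`
    # elements, mirroring the Blackwell-style scale format described above.
    w = w.reshape(-1, block)
    scale = round_to_e4m3(np.abs(w).max(axis=1, keepdims=True) / 7.0)
    scale = np.where(scale == 0, 1.0, scale)   # avoid div-by-zero on all-zero blocks
    q = np.clip(np.round(w / scale), -7, 7)    # INT4 codes
    return (q * scale).reshape(-1)             # dequantized, for measuring error / KL

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
wq = quantize_int4_blockscaled(w)
print("max abs error:", np.abs(w - wq).max())
```

KL divergence between the original and quantized model's output distributions is then measured on the dequantized weights; the per-16 scale granularity is what makes this a fair comparison against NVFP4, which uses the same block size.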
To benefit from NVFP4 hardware, both multiplicands have to be in NVFP4, which can be achieved by quantizing activations on the fly. So in addition to the QAT you mentioned for the weights, the model also has to be adapted to accept NVFP4 activations.
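On-the-fly activation quantization amounts to snapping each per-16 block onto the FP4 E2M1 grid. A minimal sketch (my own illustration; the scale is kept in FP32 here for brevity, where real NVFP4 would store it as FP8 E4M3):

```python
import numpy as np

# Representable magnitudes of FP4 E2M1, the NVFP4 element format.
E2M1_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)

def quantize_nvfp4(x, block=16):
    # One scale per `block` elements, chosen so the block max maps to 6.0
    # (the E2M1 maximum), then each element rounded to the nearest grid value.
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / 6.0
    scale = np.where(scale == 0, 1.0, scale)   # all-zero blocks stay zero
    v = x / scale
    idx = np.abs(np.abs(v)[..., None] - E2M1_GRID).argmin(axis=-1)
    q = np.sign(v) * E2M1_GRID[idx]            # signed nearest E2M1 value
    return (q * scale).reshape(-1)
```

Since this runs in the forward pass on every activation tensor, the quantization error it introduces is exactly what the model needs to be adapted (or trained) to tolerate.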