Does it not work for quantized versions like RedHatAIgemma-4-31B-it-NVFP4?

#2
by Jakry - opened

I got:

INFO 05-09 03:42:13 [metrics.py:101] SpecDecoding metrics: Mean acceptance length: 1.00, Accepted throughput: 0.00 tokens/s, Drafted throughput: 140.69 tokens/s, Accepted: 0 tokens, Drafted: 1407 tokens, Per-position acceptance rate: 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, Avg Draft acceptance rate: 0.0%

The acceptance rate is zero: none of the 1407 drafted tokens were accepted.
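
For anyone who wants to check the numbers, here is a minimal sketch that parses the metrics line above and recomputes the rates. The `summarize` helper is hypothetical (it is not part of any library), and the relation "mean acceptance length = 1 + accepted tokens / number of draft rounds" is my assumption about how the metric is defined, inferred from the fact that it reads 1.00 when nothing is accepted.

```python
import re

# The metrics line from the log above, copied verbatim (prefix omitted).
LOG_LINE = (
    "SpecDecoding metrics: Mean acceptance length: 1.00, "
    "Accepted throughput: 0.00 tokens/s, Drafted throughput: 140.69 tokens/s, "
    "Accepted: 0 tokens, Drafted: 1407 tokens, "
    "Per-position acceptance rate: 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, "
    "Avg Draft acceptance rate: 0.0%"
)


def summarize(line: str) -> dict:
    """Hypothetical helper: recompute acceptance statistics from one log line."""
    accepted = int(re.search(r"Accepted: (\d+) tokens", line).group(1))
    drafted = int(re.search(r"Drafted: (\d+) tokens", line).group(1))
    per_pos = re.search(r"Per-position acceptance rate: ([\d., ]+), Avg", line).group(1)
    rates = [float(x) for x in per_pos.split(", ")]

    # One per-position entry per speculative token drafted in each round.
    num_positions = len(rates)
    # Assumption: every draft round proposes num_positions tokens.
    draft_rounds = drafted // num_positions

    return {
        "accepted": accepted,
        "drafted": drafted,
        "num_positions": num_positions,
        "draft_rounds": draft_rounds,
        "acceptance_rate": accepted / drafted if drafted else 0.0,
        # Assumption: each round yields 1 verified token plus any accepted drafts.
        "mean_acceptance_length": 1 + accepted / draft_rounds if draft_rounds else 1.0,
    }


if __name__ == "__main__":
    for key, value in summarize(LOG_LINE).items():
        print(f"{key}: {value}")
```

Under those assumptions, a mean acceptance length of 1.00 means every step emitted only the target model's own token, so the draft tokens are pure overhead in this run.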
