Hello Mistral... I missed you!
#12
by SuperbEmphasis - opened
So glad you are back! Thank you for sharing this model! I have a couple of H100s I am about to toss this onto. I am very excited! I am just waiting for the NVFP4 or FP8 quants :D
The H100 doesn't even have FP4 compute.
So...? The vLLM Marlin kernel works great for NVFP4 models on Hopper cards. You don't get the performance bump of the FP4 tensor cores, but if you need the extra VRAM for KV cache, it works well.
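As a rough back-of-the-envelope sketch of that trade-off (not a measurement): the parameter count, the two 80 GB cards, and the NVFP4 layout of one 8-bit scale per 16 weights are all assumptions made for illustration, and the helper names `weight_gib` and `kv_headroom_gib` are hypothetical.

```python
# Rough weight-memory estimate. Every number here is an illustrative assumption.

BITS_PER_WEIGHT = {
    "bf16": 16.0,
    "fp8": 8.0,
    # Assumed NVFP4 layout: 4-bit values plus one 8-bit scale per 16-weight block.
    "nvfp4": 4.0 + 8.0 / 16.0,
}


def weight_gib(num_params: float, fmt: str) -> float:
    """Approximate GiB needed to hold the weights in the given format."""
    return num_params * BITS_PER_WEIGHT[fmt] / 8 / 2**30


def kv_headroom_gib(total_vram_gib: float, num_params: float, fmt: str) -> float:
    """VRAM left for KV cache and activations once the weights are loaded."""
    return total_vram_gib - weight_gib(num_params, fmt)


if __name__ == "__main__":
    params = 100e9       # hypothetical parameter count, not this model's real size
    total_vram = 2 * 80  # assumes two 80 GB cards
    for fmt in BITS_PER_WEIGHT:
        print(
            f"{fmt:>6}: weights ~{weight_gib(params, fmt):6.1f} GiB, "
            f"headroom ~{kv_headroom_gib(total_vram, params, fmt):6.1f} GiB"
        )
```

The estimate ignores activation memory, runtime overhead, and any layers left unquantized, so treat it as a lower bound on what the weights need.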
Also, I just realized that this model is already in FP8, though I am not sure whether the license lets me run it...