Hello Mistral... I missed you!

#12

by SuperbEmphasis - opened 6 days ago

So glad you are back! Thank you for sharing this model! I have a couple of H100s I am about to toss this onto. I am very excited! I am just waiting for the NVFP4 or FP8 quants :D

szilard995

6 days ago

The H100 doesn't even have FP4 compute.

SuperbEmphasis

6 days ago • edited 6 days ago

So...? The vLLM Marlin kernel works great for NVFP4 models on Hopper cards. You don't get the performance bump of the FP4 tensor cores, but if you need the extra VRAM for KV cache, it works well.
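To put rough numbers on the VRAM point, here is a back-of-the-envelope sketch. Everything in it is an assumption for illustration: the parameter count is a hypothetical placeholder (not this model's actual size), the two cards are taken as 80 GB each and treated as GiB for simplicity, and the roughly 4.5 bits per weight for NVFP4 assumes 4-bit values plus one 8-bit scale per 16-weight block. It ignores activations, CUDA graphs, and other runtime overhead.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory taken by the weights alone, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def kv_budget_gib(total_vram_gib: float, n_params: float, bits_per_weight: float) -> float:
    """VRAM left over for KV cache after loading the weights (overhead ignored)."""
    return total_vram_gib - weight_gib(n_params, bits_per_weight)


# Hypothetical values, NOT taken from the model card.
N_PARAMS = 100e9          # assumed parameter count
TOTAL_VRAM_GIB = 2 * 80   # assumed: two 80 GB H100s

# Assumed effective storage widths per weight.
FORMATS = {
    "FP8": 8.0,
    "NVFP4 (approx.)": 4.5,  # 4-bit value + 8-bit scale shared by 16 weights
}

for name, bits in FORMATS.items():
    weights = weight_gib(N_PARAMS, bits)
    left = kv_budget_gib(TOTAL_VRAM_GIB, N_PARAMS, bits)
    print(f"{name}: weights ~{weights:.1f} GiB, ~{left:.1f} GiB left for KV cache")
```

The difference between the two "left for KV cache" figures is the headroom you gain from the smaller weights, even when the FP4 math itself is not hardware accelerated.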

Also, I just realized that this model is already in FP8. Though I am not sure if I can run it under the license...
