Really good bang for buck!

by layer4down - opened Mar 12

Mar 12

Excellent model! Very smart, with fast inference. I swapped my 30GB Q8 model for this 16GB NVFP4 one. Currently running it in LM Studio. Hoping to see a release with KV cache and parallel prediction support!
