Really good bang for buck!

#1
by layer4down - opened

Excellent model! Very smart, with fast inference. I swapped my 30GB Q8 model for this 16GB NVFP4 one. Currently running it in LM Studio. Hoping to see a release with KV cache and parallel prediction support!
