Really good bang for buck!
#1
by layer4down - opened
Excellent model! Very smart, with fast inference. I swapped my 30GB Q8 model for this 16GB NVFP4 one. Currently running it in LM Studio. Hoping to see a release with KV cache and parallel prediction support!