# Nesso-0.4B-Instruct Q6_K GGUF
Small Italian/English model (~0.4B parameters) based on mii-llm/nesso-0.4B-instruct, quantized to Q6_K using llama.cpp.
## Features
- Quantization: Q6_K (excellent balance between quality, speed, and RAM usage)
- Format: GGUF (compatible with llama.cpp, Ollama, LM Studio, KoboldCPP, etc.)
- File size: approximately 580 MB
- Base: converted from Hugging Face → GGUF f16 → quantized to Q6_K
## How to use it

Example with llama.cpp:

```bash
./llama-cli -m nesso-0.4B-Q6_K.gguf --temp 0.7 --repeat-penalty 1.15 -p "Your prompt here"
```
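Since Ollama is listed among the compatible runtimes, a minimal Modelfile sketch for running this GGUF locally could look like the following. The model name `nesso` is an arbitrary choice, and the sampling parameters simply mirror the llama.cpp example above; adjust them to taste.

```
# Modelfile (assumes the GGUF file is in the current directory)
FROM ./nesso-0.4B-Q6_K.gguf
PARAMETER temperature 0.7
PARAMETER repeat_penalty 1.15
```

Then register and run the model:

```bash
ollama create nesso -f Modelfile
ollama run nesso "Your prompt here"
```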