Nesso-0.4B-Instruct Q6_K GGUF

Small Italian/English model (~0.4B parameters) based on mii-llm/nesso-0.4B-instruct, quantized to Q6_K using llama.cpp.

Features

  • Quantization: Q6_K (excellent balance between quality, speed, and RAM usage)
  • Format: GGUF (compatible with llama.cpp, Ollama, LM Studio, KoboldCPP, etc.)
  • File size: approximately 580 MB
  • Base: converted from Hugging Face → GGUF f16 → quantized to Q6_K

How to use it

Example with llama.cpp:

./llama-cli -m nesso-0.4B-Q6_K.gguf --temp 0.7 --repeat-penalty 1.15 -p "Your prompt here"
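
The card also lists Ollama as a compatible runtime. As a sketch, the same GGUF file can be imported via a Modelfile; the file path is assumed to be the quantized file above, and the sampling parameters simply mirror the llama-cli flags in the example (they are not settings published with the model):

```
# Hypothetical Modelfile for importing the quantized GGUF into Ollama
FROM ./nesso-0.4B-Q6_K.gguf

# Mirror the sampling settings from the llama-cli example above
PARAMETER temperature 0.7
PARAMETER repeat_penalty 1.15
```

Then build and run the local model with `ollama create nesso -f Modelfile` followed by `ollama run nesso`.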