Using llama.cpp

./llama-server \
    --port 1234 \
    --host 0.0.0.0 \
    --gpu-layers 99 \
    --alias ttrpg/Mistral-7B-v0.3 \
    --ctx-size 32768 \
    --chat-template mistral-v3 \
    -hf ttrpg/Mistral-7B-v0.3-gguf:Q4_K_M
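
Once started, llama-server exposes an OpenAI-compatible HTTP API on the chosen port. As a minimal sketch (assuming the server is reachable on localhost and using the /v1/chat/completions route; the prompt itself is purely illustrative), a request against the alias configured above might look like:

# hypothetical example request; adjust host/port to match your launch command
curl http://localhost:1234/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "ttrpg/Mistral-7B-v0.3",
        "messages": [
            {"role": "user", "content": "Describe a mysterious NPC for a tavern encounter."}
        ],
        "max_tokens": 256
    }'

The "model" field matches the --alias value passed to the server and the port matches --port; change both together if you modify the launch command.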
Format: GGUF · Model size: 7B params · Architecture: llama