typhoon-si-med-thinking-4b-research-preview - GGUF

About

For a convenient overview and download list, visit our model page.

If you are unsure how to use GGUF files, refer to the llama.cpp documentation. For example, to run the q4_k_m file with llama-cli:

./llama-cli -m typhoon-si-med-thinking-4b-research-preview-q4_k_m.gguf -p "Hello!"
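The command above hard-codes the q4_k_m file. As a minimal sketch, the helpers below build the file name for any quant type and print the matching llama-cli command line. The names `gguf_name` and `llama_cmd` are hypothetical, and the file naming pattern is an assumption taken from the single file name shown above; check the download list for the actual names.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; not part of llama.cpp or of this repository.

MODEL_BASE="typhoon-si-med-thinking-4b-research-preview"
DEFAULT_PROMPT='Hello!'

# Build the GGUF file name for a quant type such as q4_k_m.
# Assumption: every file follows the pattern of the q4_k_m example.
gguf_name() {
  local quant="${1:-q4_k_m}"
  printf '%s-%s.gguf\n' "$MODEL_BASE" "$quant"
}

# Print (do not run) a llama-cli command line for the chosen quant.
# -m selects the model file, -p sets the prompt,
# -n caps the number of generated tokens.
llama_cmd() {
  local quant="${1:-q4_k_m}"
  local prompt="${2:-$DEFAULT_PROMPT}"
  local max_tokens="${3:-256}"
  printf './llama-cli -m %s -p "%s" -n %s\n' \
    "$(gguf_name "$quant")" "$prompt" "$max_tokens"
}
```

Calling `llama_cmd q8_0 "Hello!" 128` prints a command line that can be reviewed before it is run. Printing instead of executing keeps the sketch safe to try without llama.cpp installed.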

Available quantizations, sorted by size (not necessarily by quality):

Link	Type	Size/GB	Notes
GGUF	q2_k	1.67	very low quality, for testing
GGUF	q3_k_m	2.09
GGUF	q4_0	2.41
GGUF	q4_k_m	2.53	recommended, good balance
GGUF	q5_k_m	2.94
GGUF	q8_0	4.37	near-full precision
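One way to use the table is to pick the largest file that fits a given size budget. The sketch below does that with the sizes copied from the table; the function name `pick_quant` is hypothetical. It compares file sizes only, and actual memory use at run time will be higher because of the context and KV cache, so treat the result as a starting point rather than a guarantee.

```shell
#!/usr/bin/env bash
# Quant types and file sizes in GB, copied from the table above
# (ascending by size).
QUANTS="q2_k:1.67 q3_k_m:2.09 q4_0:2.41 q4_k_m:2.53 q5_k_m:2.94 q8_0:4.37"

# Print the largest quant whose file size fits within the budget (GB).
# Prints nothing and returns 1 if no file fits.
pick_quant() {
  local budget="$1" best="" entry name size
  for entry in $QUANTS; do
    name="${entry%%:*}"
    size="${entry##*:}"
    # awk does the floating-point comparison; the shell only
    # handles integers.
    if awk -v s="$size" -v b="$budget" 'BEGIN { exit !(s <= b) }'; then
      best="$name"
    fi
  done
  if [ -n "$best" ]; then
    printf '%s\n' "$best"
  else
    return 1
  fi
}
```

For example, `pick_quant 2.6` prints `q4_k_m`, the file the table marks as recommended.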

Special thanks to the llama.cpp team for their amazing work.

Model size: 4B params

Architecture: qwen3
