Vintern-1B-v3.5 GGUF ❄️

Original model: https://huggingface.co/5CD-AI/Vintern-1B-v3_5

Converted with the convert_hf_to_gguf.py script from llama.cpp build b6301 (self-compiled with CUDA Toolkit 13.0).
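
For reference, a conversion along these lines would produce the BF16 GGUF; the model path and output filename below are placeholders, not the exact invocation used for this upload:

```bash
# Sketch: convert the original HF checkpoint to a BF16 GGUF (paths are placeholders).
python convert_hf_to_gguf.py /path/to/Vintern-1B-v3_5 \
    --outtype bf16 \
    --outfile Vintern-1B-v3_5-BF16.gguf
```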

To run this model, use:

llama-server -m Vintern-1B-v3_5-BF16.gguf --mmproj mmproj-Vintern-1B-v3_5-BF16.gguf --repeat-penalty 2.5

Add -ngl 99 to offload all layers to a compatible GPU, or set --threads manually to tune CPU performance.
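
Once the server is up (default port 8080), it exposes an OpenAI-compatible /v1/chat/completions endpoint. The sketch below sends a prompt together with an inline base64 image; the host, port, image path, and prompt are placeholders, and image input in this format requires a llama-server build with multimodal support for the loaded mmproj, so verify against your build:

```bash
# Sketch: OpenAI-style chat request with an inline base64 image (host/port/paths are placeholders).
# (base64 -w0 is GNU coreutils; on macOS use `base64 -i example.jpg`.)
IMG_B64=$(base64 -w0 example.jpg)

# The prompt asks, in Vietnamese, for a brief description of the image.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Mô tả ngắn gọn bức ảnh này."},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,${IMG_B64}"}}
      ]
    }
  ],
  "max_tokens": 512
}
EOF
```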

The mmproj file was generated using the preprocessor_config.json of the original base model, https://huggingface.co/OpenGVLab/InternVL2_5-1B.
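
Recent llama.cpp conversions can export the vision projector in a similar step via a --mmproj option of convert_hf_to_gguf.py; treat the flag and the output filename below as assumptions to check against your build's --help, not as the exact command used here:

```bash
# Sketch: export only the multimodal projector (flag availability depends on the llama.cpp build).
python convert_hf_to_gguf.py /path/to/Vintern-1B-v3_5 \
    --outtype bf16 \
    --mmproj \
    --outfile mmproj-Vintern-1B-v3_5-BF16.gguf
```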
