Vintern-1B-v3.5 GGUF ❄️

Original model: https://huggingface.co/5CD-AI/Vintern-1B-v3_5

Converted with the convert_hf_to_gguf.py script from llama.cpp build b6301 (self-compiled with CUDA Toolkit 13.0).
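
For reference, a conversion along these lines would produce the BF16 GGUF; the model path and output filename below are placeholders, not the exact invocation used for this upload:

```bash
# Sketch: convert the original HF checkpoint to a BF16 GGUF (paths are placeholders).
python convert_hf_to_gguf.py /path/to/Vintern-1B-v3_5 \
    --outtype bf16 \
    --outfile Vintern-1B-v3_5-BF16.gguf
```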

To run this model, use:

llama-server -m Vintern-1B-v3_5-BF16.gguf --mmproj mmproj-Vintern-1B-v3_5-BF16.gguf --repeat-penalty 2.5

Add -ngl 99 to offload all layers to a compatible GPU, or set --threads manually to tune CPU performance.
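
Once the server is up (default port 8080), it exposes an OpenAI-compatible /v1/chat/completions endpoint. The sketch below sends a prompt together with an inline base64 image; the host, port, image path, and prompt are placeholders, and image input in this format requires a llama-server build with multimodal support for the loaded mmproj, so verify against your build:

```bash
# Sketch: OpenAI-style chat request with an inline base64 image (host/port/paths are placeholders).
# (base64 -w0 is GNU coreutils; on macOS use `base64 -i example.jpg`.)
IMG_B64=$(base64 -w0 example.jpg)

# The prompt asks, in Vietnamese, for a brief description of the image.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "Mô tả ngắn gọn bức ảnh này."},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,${IMG_B64}"}}
      ]
    }
  ],
  "max_tokens": 512
}
EOF
```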

The mmproj file was generated using the preprocessor_config.json of the original base model, https://huggingface.co/OpenGVLab/InternVL2_5-1B.
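
Recent llama.cpp conversions can export the vision projector in a similar step via a --mmproj option of convert_hf_to_gguf.py; treat the flag and the output filename below as assumptions to check against your build's --help, not as the exact command used here:

```bash
# Sketch: export only the multimodal projector (flag availability depends on the llama.cpp build).
python convert_hf_to_gguf.py /path/to/Vintern-1B-v3_5 \
    --outtype bf16 \
    --mmproj \
    --outfile mmproj-Vintern-1B-v3_5-BF16.gguf
```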
