Instructions for using kai-os/Carnice-V2-27b with libraries and local apps.

Libraries

How to use kai-os/Carnice-V2-27b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="kai-os/Carnice-V2-27b")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
# Run inference and print the generated reply
print(pipe(text=messages))

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("kai-os/Carnice-V2-27b")
model = AutoModelForImageTextToText.from_pretrained("kai-os/Carnice-V2-27b")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

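# Generate up to 40 new tokens, then decode only the tokens that follow the prompt
outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))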

Local Apps

vLLM

How to use kai-os/Carnice-V2-27b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "kai-os/Carnice-V2-27b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kai-os/Carnice-V2-27b",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
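
The same request can be sent from Python instead of curl. The sketch below uses only the standard library; the helper names `build_chat_request` and `send_chat_request` are illustrative and not part of vLLM, and the response is assumed to follow the usual OpenAI chat-completions shape (`choices[0].message.content`).

```python
import json
import urllib.request


def build_chat_request(model, prompt, image_url=None):
    """Build an OpenAI-style chat payload with a text part and an optional image part."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    return {"model": model, "messages": [{"role": "user", "content": content}]}


def send_chat_request(payload, base_url="http://localhost:8000"):
    """POST the payload to /v1/chat/completions and return the reply text.

    Assumes the server answers in the OpenAI chat-completions format.
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]
```

With the server from the previous step running, `send_chat_request(build_chat_request("kai-os/Carnice-V2-27b", "Describe this image in one sentence.", image_url))` should return the model's reply as a string. For the SGLang server described further down, pass `base_url="http://localhost:30000"`.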

Use Docker

docker model run hf.co/kai-os/Carnice-V2-27b

SGLang

How to use kai-os/Carnice-V2-27b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "kai-os/Carnice-V2-27b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kai-os/Carnice-V2-27b",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
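
The request above points at a public image URL. For a local file, one common approach is to embed the image as a base64 data URL in the `image_url.url` field. The sketch below shows that encoding; it assumes the server accepts `data:` URLs (as OpenAI-compatible servers generally do), and `image_file_to_data_url` is a hypothetical helper name, not an SGLang function.

```python
import base64
import mimetypes
from pathlib import Path


def image_file_to_data_url(path):
    """Encode a local image file as a data URL usable in an image_url field."""
    path = Path(path)
    # Guess the MIME type from the file extension, e.g. ".png" -> "image/png".
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"not a recognized image file: {path}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

The returned string can replace the `https://...` value of `"url"` in the JSON body. Large images inflate the request by roughly a third because of the base64 encoding.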

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "kai-os/Carnice-V2-27b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kai-os/Carnice-V2-27b",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use kai-os/Carnice-V2-27b with Docker Model Runner:
```
docker model run hf.co/kai-os/Carnice-V2-27b
```

Carnice-V2-27b

File size: 8,738 Bytes

31a7782

MODEL_LOADER base_model=Qwen/Qwen3.6-27B loader=AutoModelForImageTextToText
Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
`torch_dtype` is deprecated! Use `dtype` instead!
Current Python version 3.10 is below the recommended 3.11 version. It is recommended to upgrade to Python 3.11 or higher for the best experience.

Loading weights: 100%|██████████| 1184/1184 [00:04<00:00, 256.72it/s]
INFO:     Started server process [255925]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://127.0.0.1:8030 (Press CTRL+C to quit)
INFO:     127.0.0.1:53980 - "GET /health HTTP/1.1" 200 OK
INFO:     127.0.0.1:53986 - "GET /v1/models HTTP/1.1" 200 OK
The following generation flags are not valid and may be ignored: ['temperature', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
COMPLETION_GENERATION_COMPLETE prompt_tokens=6255 completion_tokens=20 elapsed=8.8s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6296 completion_tokens=16 elapsed=6.8s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6340 completion_tokens=21 elapsed=8.6s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6375 completion_tokens=28 elapsed=11.1s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6434 completion_tokens=23 elapsed=9.3s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6484 completion_tokens=30 elapsed=11.8s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6543 completion_tokens=20 elapsed=8.2s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6584 completion_tokens=30 elapsed=11.9s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6651 completion_tokens=41 elapsed=15.9s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6726 completion_tokens=23 elapsed=9.3s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6786 completion_tokens=46 elapsed=17.7s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6878 completion_tokens=19 elapsed=7.9s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6911 completion_tokens=28 elapsed=11.2s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6970 completion_tokens=20 elapsed=8.2s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=7011 completion_tokens=34 elapsed=13.4s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=7099 completion_tokens=120 elapsed=44.9s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3547 completion_tokens=35 elapsed=13.5s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3616 completion_tokens=46 elapsed=17.4s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3692 completion_tokens=20 elapsed=7.9s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3733 completion_tokens=19 elapsed=7.6s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3781 completion_tokens=27 elapsed=10.5s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3836 completion_tokens=29 elapsed=11.2s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3890 completion_tokens=29 elapsed=11.3s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3948 completion_tokens=19 elapsed=7.6s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3988 completion_tokens=19 elapsed=7.6s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4028 completion_tokens=20 elapsed=8.0s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4069 completion_tokens=19 elapsed=7.7s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4110 completion_tokens=28 elapsed=11.1s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4183 completion_tokens=46 elapsed=17.8s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4250 completion_tokens=29 elapsed=11.5s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4324 completion_tokens=61 elapsed=23.4s
INFO:     127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
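
The `COMPLETION_GENERATION_COMPLETE` lines above carry enough information to estimate end-to-end throughput. For example, the first request produced 20 completion tokens in 8.8 s, or about 2.3 tokens per second. The sketch below parses those lines and aggregates them; the names `Completion`, `parse_completions`, and `summarize` are illustrative, and the regular expression assumes the exact line format shown in this log.

```python
import re
from dataclasses import dataclass

# Matches lines such as:
# COMPLETION_GENERATION_COMPLETE prompt_tokens=6255 completion_tokens=20 elapsed=8.8s
LINE_RE = re.compile(
    r"COMPLETION_GENERATION_COMPLETE "
    r"prompt_tokens=(\d+) completion_tokens=(\d+) elapsed=([\d.]+)s"
)


@dataclass
class Completion:
    prompt_tokens: int
    completion_tokens: int
    elapsed_s: float

    @property
    def tokens_per_second(self):
        """Completion tokens divided by wall-clock time for the whole request."""
        return self.completion_tokens / self.elapsed_s


def parse_completions(log_text):
    """Extract one Completion per matching log line; other lines are ignored."""
    return [
        Completion(int(prompt), int(completion), float(elapsed))
        for prompt, completion, elapsed in LINE_RE.findall(log_text)
    ]


def summarize(completions):
    """Aggregate request count, token count, time, and overall tokens per second."""
    total_tokens = sum(c.completion_tokens for c in completions)
    total_time = sum(c.elapsed_s for c in completions)
    return {
        "requests": len(completions),
        "completion_tokens": total_tokens,
        "elapsed_s": total_time,
        "tokens_per_second": total_tokens / total_time if total_time else 0.0,
    }
```

Because `elapsed` covers the whole request, including processing of prompts that are several thousand tokens long here, the resulting figure is an end-to-end rate rather than a pure decoding speed.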