Instructions for using kai-os/Carnice-V2-27b with libraries, inference providers, notebooks, and local apps.

Libraries

How to use kai-os/Carnice-V2-27b with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="kai-os/Carnice-V2-27b")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
pipe(text=messages)

# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("kai-os/Carnice-V2-27b")
model = AutoModelForImageTextToText.from_pretrained("kai-os/Carnice-V2-27b")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"}
        ]
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use kai-os/Carnice-V2-27b with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "kai-os/Carnice-V2-27b"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kai-os/Carnice-V2-27b",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'
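
The curl request above can also be built from Python. The following is a minimal sketch using only the standard library, assuming the vLLM server from the previous step is listening on `http://localhost:8000`. The helper names `build_chat_payload` and `send_chat_request` are hypothetical and are not part of vLLM or of this model's repository.

```python
import json
import urllib.request

MODEL_ID = "kai-os/Carnice-V2-27b"


def build_chat_payload(prompt: str, image_url: str, model: str = MODEL_ID) -> dict:
    """Build the same JSON body that the curl example sends (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }


def send_chat_request(payload: dict, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to the OpenAI-compatible chat completions endpoint.

    Assumes a server is already running at base_url (hypothetical helper).
    """
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Build the request body only; no network call is made here.
payload = build_chat_payload(
    "Describe this image in one sentence.",
    "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
print(json.dumps(payload, indent=2))
```

With a server running, `send_chat_request(payload)` should return the parsed JSON response. For the SGLang server described below, pass `base_url="http://localhost:30000"` instead.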

Use Docker

docker model run hf.co/kai-os/Carnice-V2-27b

SGLang

How to use kai-os/Carnice-V2-27b with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "kai-os/Carnice-V2-27b" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kai-os/Carnice-V2-27b",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "kai-os/Carnice-V2-27b" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "kai-os/Carnice-V2-27b",
		"messages": [
			{
				"role": "user",
				"content": [
					{
						"type": "text",
						"text": "Describe this image in one sentence."
					},
					{
						"type": "image_url",
						"image_url": {
							"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"
						}
					}
				]
			}
		]
	}'

Docker Model Runner
How to use kai-os/Carnice-V2-27b with Docker Model Runner:
```
docker model run hf.co/kai-os/Carnice-V2-27b
```

Carnice-V2-27b/benchmarks/raw/logs/server_adapter.log

kai-os

Add files using upload-large-folder tool

31a7782 verified 29 days ago


	Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
	MODEL_LOADER base_model=Qwen/Qwen3.6-27B loader=AutoModelForImageTextToText
	`torch_dtype` is deprecated! Use `dtype` instead!
	Current Python version 3.10 is below the recommended 3.11 version. It is recommended to upgrade to Python 3.11 or higher for the best experience.
	Loading weights: 100%|██████████| 1184/1184 [00:04<00:00, 255.11it/s]
	LORA_ATTACHMENT_SUMMARY {"linear_attn": 240, "self_attn": 64, "total": 304}
	INFO: Started server process [255381]
	INFO: Waiting for application startup.
	INFO: Application startup complete.
	INFO: Uvicorn running on http://127.0.0.1:8030 (Press CTRL+C to quit)
	INFO: 127.0.0.1:45362 - "GET /health HTTP/1.1" 200 OK
	INFO: 127.0.0.1:45374 - "GET /v1/models HTTP/1.1" 200 OK
	The following generation flags are not valid and may be ignored: ['temperature', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6255 completion_tokens=16 elapsed=8.0s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6295 completion_tokens=20 elapsed=9.2s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6336 completion_tokens=16 elapsed=7.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6380 completion_tokens=21 elapsed=9.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6415 completion_tokens=28 elapsed=12.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6474 completion_tokens=16 elapsed=7.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6516 completion_tokens=20 elapsed=9.2s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6566 completion_tokens=20 elapsed=9.2s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6607 completion_tokens=16 elapsed=7.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6646 completion_tokens=74 elapsed=31.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6747 completion_tokens=30 elapsed=13.4s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6814 completion_tokens=30 elapsed=13.4s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6863 completion_tokens=29 elapsed=13.0s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6929 completion_tokens=29 elapsed=13.0s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=6995 completion_tokens=23 elapsed=10.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7055 completion_tokens=19 elapsed=8.9s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7088 completion_tokens=16 elapsed=7.7s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7130 completion_tokens=19 elapsed=8.9s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7163 completion_tokens=16 elapsed=7.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7200 completion_tokens=20 elapsed=9.3s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7241 completion_tokens=20 elapsed=9.3s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7282 completion_tokens=30 elapsed=13.4s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7349 completion_tokens=30 elapsed=13.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7398 completion_tokens=29 elapsed=13.1s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7464 completion_tokens=29 elapsed=13.0s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7530 completion_tokens=23 elapsed=10.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7590 completion_tokens=19 elapsed=8.9s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7623 completion_tokens=19 elapsed=8.9s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7656 completion_tokens=19 elapsed=8.9s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7704 completion_tokens=20 elapsed=9.4s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=7745 completion_tokens=20 elapsed=9.4s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3547 completion_tokens=35 elapsed=15.1s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3616 completion_tokens=19 elapsed=8.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3656 completion_tokens=19 elapsed=8.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3696 completion_tokens=19 elapsed=8.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3736 completion_tokens=19 elapsed=8.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3776 completion_tokens=19 elapsed=8.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3816 completion_tokens=19 elapsed=8.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3856 completion_tokens=19 elapsed=8.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3896 completion_tokens=19 elapsed=8.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3936 completion_tokens=19 elapsed=8.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=3976 completion_tokens=19 elapsed=8.5s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4016 completion_tokens=19 elapsed=8.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4056 completion_tokens=19 elapsed=8.6s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4096 completion_tokens=19 elapsed=8.7s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4136 completion_tokens=19 elapsed=8.7s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4176 completion_tokens=19 elapsed=8.7s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4216 completion_tokens=19 elapsed=8.7s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4256 completion_tokens=19 elapsed=8.7s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4296 completion_tokens=19 elapsed=8.7s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4336 completion_tokens=19 elapsed=8.7s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
	COMPLETION_GENERATION_COMPLETE prompt_tokens=4376 completion_tokens=19 elapsed=8.8s
	INFO: 127.0.0.1:45380 - "POST /v1/completions HTTP/1.1" 200 OK
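
The `COMPLETION_GENERATION_COMPLETE` lines above carry enough information to estimate end-to-end throughput. The following is a minimal sketch that parses those lines and divides completion tokens by elapsed seconds. The function names are hypothetical, the line format is inferred only from this log, and the elapsed time presumably includes prompt processing, so the figure is a rough end-to-end rate rather than a pure decode speed.

```python
import re

# Pattern inferred from the log lines above; the format is an assumption.
LINE_PATTERN = re.compile(
    r"COMPLETION_GENERATION_COMPLETE prompt_tokens=(\d+) "
    r"completion_tokens=(\d+) elapsed=([\d.]+)s"
)


def parse_completions(log_text: str) -> list[tuple[int, int, float]]:
    """Return (prompt_tokens, completion_tokens, elapsed_seconds) per request."""
    records = []
    for match in LINE_PATTERN.finditer(log_text):
        prompt_tokens, completion_tokens, elapsed = match.groups()
        records.append((int(prompt_tokens), int(completion_tokens), float(elapsed)))
    return records


def tokens_per_second(records: list[tuple[int, int, float]]) -> float:
    """Total completion tokens divided by total elapsed time."""
    total_tokens = sum(completion for _, completion, _ in records)
    total_elapsed = sum(elapsed for _, _, elapsed in records)
    return total_tokens / total_elapsed if total_elapsed else 0.0


# The first three completion records from the log above.
sample_log = """
COMPLETION_GENERATION_COMPLETE prompt_tokens=6255 completion_tokens=16 elapsed=8.0s
COMPLETION_GENERATION_COMPLETE prompt_tokens=6295 completion_tokens=20 elapsed=9.2s
COMPLETION_GENERATION_COMPLETE prompt_tokens=6336 completion_tokens=16 elapsed=7.6s
"""

records = parse_completions(sample_log)
rate = tokens_per_second(records)
print(f"{len(records)} requests, {rate:.2f} completion tokens/s")
```

To analyze the whole file, read `server_adapter.log` into a string and pass it to `parse_completions` in place of `sample_log`.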