File size: 8,738 Bytes
31a7782
MODEL_LOADER base_model=Qwen/Qwen3.6-27B loader=AutoModelForImageTextToText
Warning: You are sending unauthenticated requests to the HF Hub. Please set a HF_TOKEN to enable higher rate limits and faster downloads.
`torch_dtype` is deprecated! Use `dtype` instead!
Current Python version 3.10 is below the recommended 3.11 version. It is recommended to upgrade to Python 3.11 or higher for the best experience.
Loading weights:   0%|          | 0/1184 [00:00<?, ?it/s]
Loading weights:   2%|▏         | 27/1184 [00:00<00:04, 264.77it/s]
Loading weights:   5%|▍         | 54/1184 [00:00<00:05, 189.88it/s]
Loading weights:   7%|▋         | 80/1184 [00:00<00:05, 206.93it/s]
Loading weights:   9%|▊         | 102/1184 [00:00<00:06, 179.80it/s]
Loading weights:  11%|█         | 129/1184 [00:00<00:05, 203.93it/s]
Loading weights:  13%|█▎        | 151/1184 [00:00<00:05, 191.25it/s]
Loading weights:  15%|█▍        | 172/1184 [00:00<00:05, 193.59it/s]
Loading weights:  16%|█▌        | 192/1184 [00:00<00:05, 193.40it/s]
Loading weights:  18%|█▊        | 212/1184 [00:01<00:05, 182.46it/s]
Loading weights:  20%|██        | 238/1184 [00:01<00:04, 192.98it/s]
Loading weights:  22%|██▏       | 258/1184 [00:01<00:04, 185.95it/s]
Loading weights:  24%|██▎       | 279/1184 [00:01<00:04, 189.57it/s]
Loading weights:  26%|██▌       | 303/1184 [00:01<00:04, 202.59it/s]
Loading weights:  27%|██▋       | 324/1184 [00:01<00:04, 192.90it/s]
Loading weights:  29%|██▉       | 345/1184 [00:01<00:04, 193.80it/s]
Loading weights:  31%|███       | 365/1184 [00:01<00:04, 172.82it/s]
Loading weights:  33%|███▎      | 394/1184 [00:02<00:03, 202.66it/s]
Loading weights:  35%|███▌      | 416/1184 [00:02<00:04, 190.84it/s]
Loading weights:  37%|███▋      | 437/1184 [00:02<00:03, 193.55it/s]
Loading weights:  39%|███▊      | 457/1184 [00:02<00:03, 193.10it/s]
Loading weights:  40%|████      | 477/1184 [00:02<00:03, 184.19it/s]
Loading weights:  42%|████▏     | 503/1184 [00:02<00:03, 202.26it/s]
Loading weights:  44%|████▍     | 524/1184 [00:02<00:03, 178.35it/s]
Loading weights:  46%|████▌     | 545/1184 [00:02<00:03, 182.13it/s]
Loading weights:  48%|████▊     | 571/1184 [00:02<00:03, 197.72it/s]
Loading weights:  50%|█████     | 592/1184 [00:03<00:03, 195.08it/s]
Loading weights:  52%|█████▏    | 612/1184 [00:03<00:03, 181.17it/s]
Loading weights:  53%|█████▎    | 631/1184 [00:03<00:03, 174.99it/s]
Loading weights:  56%|█████▌    | 659/1184 [00:03<00:02, 198.43it/s]
Loading weights:  57%|█████▋    | 680/1184 [00:03<00:02, 188.35it/s]
Loading weights:  59%|█████▉    | 702/1184 [00:03<00:02, 193.28it/s]
Loading weights:  61%|██████    | 722/1184 [00:03<00:02, 192.58it/s]
Loading weights:  63%|██████▎   | 742/1184 [00:03<00:02, 180.42it/s]
Loading weights:  65%|██████▍   | 764/1184 [00:04<00:02, 189.85it/s]
Loading weights:  66%|██████▌   | 784/1184 [00:04<00:02, 187.73it/s]
Loading weights:  68%|██████▊   | 804/1184 [00:04<00:01, 190.25it/s]
Loading weights:  70%|██████▉   | 824/1184 [00:04<00:02, 179.08it/s]
Loading weights:  71%|███████   | 843/1184 [00:04<00:01, 172.04it/s]
Loading weights:  90%|█████████ | 1069/1184 [00:04<00:00, 733.37it/s]
Loading weights: 100%|██████████| 1184/1184 [00:04<00:00, 256.72it/s]
INFO: Started server process [255925]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://127.0.0.1:8030 (Press CTRL+C to quit)
INFO: 127.0.0.1:53980 - "GET /health HTTP/1.1" 200 OK
INFO: 127.0.0.1:53986 - "GET /v1/models HTTP/1.1" 200 OK
The following generation flags are not valid and may be ignored: ['temperature', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.
COMPLETION_GENERATION_COMPLETE prompt_tokens=6255 completion_tokens=20 elapsed=8.8s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6296 completion_tokens=16 elapsed=6.8s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6340 completion_tokens=21 elapsed=8.6s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6375 completion_tokens=28 elapsed=11.1s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6434 completion_tokens=23 elapsed=9.3s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6484 completion_tokens=30 elapsed=11.8s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6543 completion_tokens=20 elapsed=8.2s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6584 completion_tokens=30 elapsed=11.9s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6651 completion_tokens=41 elapsed=15.9s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6726 completion_tokens=23 elapsed=9.3s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6786 completion_tokens=46 elapsed=17.7s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6878 completion_tokens=19 elapsed=7.9s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6911 completion_tokens=28 elapsed=11.2s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6970 completion_tokens=20 elapsed=8.2s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=7011 completion_tokens=34 elapsed=13.4s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=7099 completion_tokens=120 elapsed=44.9s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3547 completion_tokens=35 elapsed=13.5s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3616 completion_tokens=46 elapsed=17.4s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3692 completion_tokens=20 elapsed=7.9s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3733 completion_tokens=19 elapsed=7.6s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3781 completion_tokens=27 elapsed=10.5s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3836 completion_tokens=29 elapsed=11.2s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3890 completion_tokens=29 elapsed=11.3s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3948 completion_tokens=19 elapsed=7.6s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=3988 completion_tokens=19 elapsed=7.6s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4028 completion_tokens=20 elapsed=8.0s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4069 completion_tokens=19 elapsed=7.7s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4110 completion_tokens=28 elapsed=11.1s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4183 completion_tokens=46 elapsed=17.8s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4250 completion_tokens=29 elapsed=11.5s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=4324 completion_tokens=61 elapsed=23.4s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
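
The COMPLETION_GENERATION_COMPLETE lines above each carry a prompt size, a completion size, and a wall-clock time, so a rough per-run throughput figure can be read off them. The following is a minimal sketch, assuming the line format shown in this log; the names `LINE_RE`, `summarize`, and `sample` are hypothetical helpers, not part of the server that wrote the log. Note that `elapsed` appears to cover the whole request (prompt processing plus generation), so the resulting rate is an end-to-end figure rather than a pure decode speed.

```python
import re

# Matches the custom completion lines seen in this log, e.g.
# COMPLETION_GENERATION_COMPLETE prompt_tokens=6255 completion_tokens=20 elapsed=8.8s
LINE_RE = re.compile(
    r"COMPLETION_GENERATION_COMPLETE "
    r"prompt_tokens=(\d+) completion_tokens=(\d+) elapsed=([\d.]+)s"
)


def summarize(log_text):
    """Aggregate request count, completion tokens, and elapsed time from log text."""
    rows = [(int(p), int(c), float(e)) for p, c, e in LINE_RE.findall(log_text)]
    total_completion = sum(c for _, c, _ in rows)
    total_elapsed = sum(e for _, _, e in rows)
    rate = total_completion / total_elapsed if total_elapsed else 0.0
    return {
        "requests": len(rows),
        "completion_tokens": total_completion,
        "elapsed_s": round(total_elapsed, 1),
        # End-to-end rate: elapsed includes prompt processing, not just decoding.
        "completion_tokens_per_s": round(rate, 2),
    }


# Two completion lines copied from the log, with an access-log line in between
# to show that non-matching lines are skipped.
sample = """\
COMPLETION_GENERATION_COMPLETE prompt_tokens=6255 completion_tokens=20 elapsed=8.8s
INFO: 127.0.0.1:53990 - "POST /v1/completions HTTP/1.1" 200 OK
COMPLETION_GENERATION_COMPLETE prompt_tokens=6296 completion_tokens=16 elapsed=6.8s
"""

print(summarize(sample))
```

Passing the full log text to `summarize` instead of `sample` gives the same aggregates over every request in the run.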