Tags: Image-Text-to-Text · Transformers · Diffusers · Safetensors · qwen3_vl · vision-language-model · image-decomposition · conversational
Instructions for using SynLayers/Bbox-caption-8b with libraries, inference providers, notebooks, and local apps. Follow the sections below to get started.
- Libraries
- Transformers
How to use SynLayers/Bbox-caption-8b with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("image-text-to-text", model="SynLayers/Bbox-caption-8b")
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
pipe(text=messages)
```

```python
# Load model directly
from transformers import AutoProcessor, AutoModelForImageTextToText

processor = AutoProcessor.from_pretrained("SynLayers/Bbox-caption-8b")
model = AutoModelForImageTextToText.from_pretrained("SynLayers/Bbox-caption-8b")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/p-blog/candy.JPG"},
            {"type": "text", "text": "What animal is on the candy?"},
        ],
    },
]
inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(processor.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
```

- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use SynLayers/Bbox-caption-8b with vLLM:
Install from pip and serve the model:

```shell
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "SynLayers/Bbox-caption-8b"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SynLayers/Bbox-caption-8b",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image in one sentence."},
          {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}}
        ]
      }
    ]
  }'
```

Use Docker:
```shell
docker model run hf.co/SynLayers/Bbox-caption-8b
```
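Since the server exposes an OpenAI-compatible API, it can also be called from Python. Below is a minimal sketch using only the standard library; the `build_chat_payload` helper is ours (not part of vLLM), and the port assumes the default `vllm serve` settings.

```python
# Sketch: calling the vLLM server's OpenAI-compatible chat endpoint from Python.
# The actual HTTP call is left commented out so this runs without a live server.
import json
import urllib.request

def build_chat_payload(model: str, text: str, image_url: str) -> dict:
    """Build an OpenAI-compatible chat payload with one text part and one image part."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": text},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

payload = build_chat_payload(
    "SynLayers/Bbox-caption-8b",
    "Describe this image in one sentence.",
    "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg",
)
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Once the server is running, uncomment to send the request:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same payload shape works against the SGLang server below; only the port changes.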
- SGLang
How to use SynLayers/Bbox-caption-8b with SGLang:
Install from pip and serve the model:

```shell
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "SynLayers/Bbox-caption-8b" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SynLayers/Bbox-caption-8b",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image in one sentence."},
          {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}}
        ]
      }
    ]
  }'
```

Use Docker images:
```shell
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "SynLayers/Bbox-caption-8b" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "SynLayers/Bbox-caption-8b",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "Describe this image in one sentence."},
          {"type": "image_url", "image_url": {"url": "https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg"}}
        ]
      }
    ]
  }'
```

- Docker Model Runner
How to use SynLayers/Bbox-caption-8b with Docker Model Runner:
```shell
docker model run hf.co/SynLayers/Bbox-caption-8b
```
---
language:
- en
license: other
license_name: flux-1-dev-non-commercial-license
license_link: LICENSE.md
extra_gated_prompt: By clicking "Agree", you agree to the [FluxDev Non-Commercial License Agreement](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md)
  and acknowledge the [Acceptable Use Policy](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/POLICY.md).
tags:
- text-to-image
- image-generation
- flux
---
![FLUX.1 [dev] Grid](./dev_grid.jpg)

`FLUX.1 [dev]` is a 12 billion parameter rectified flow transformer capable of generating images from text descriptions.
For more information, please read our [blog post](https://blackforestlabs.ai/announcing-black-forest-labs/).

# Key Features
1. Cutting-edge output quality, second only to our state-of-the-art model `FLUX.1 [pro]`.
2. Competitive prompt following, matching the performance of closed-source alternatives.
3. Trained using guidance distillation, making `FLUX.1 [dev]` more efficient.
4. Open weights to drive new scientific research, and empower artists to develop innovative workflows.
5. Generated outputs can be used for personal, scientific, and commercial purposes as described in the [`FLUX.1 [dev]` Non-Commercial License](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md).
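The guidance distillation mentioned in feature 3 is what makes the model cheaper to sample: standard classifier-free guidance evaluates the network twice per step (conditional and unconditional), while a guidance-distilled student takes the guidance scale as an extra input and matches the guided prediction in a single forward pass. A rough sketch in our own notation follows; the exact FLUX.1 training recipe is not published here.

```latex
% Classifier-free guidance: two teacher evaluations per sampling step,
% combined with guidance scale w.
\[
  v_\theta^{\mathrm{cfg}}(x_t, c)
  \;=\; v_\theta(x_t, \varnothing)
  \;+\; w\,\bigl(v_\theta(x_t, c) - v_\theta(x_t, \varnothing)\bigr)
\]
% Guidance distillation: a student \hat{v}_\phi conditioned on w is trained
% to reproduce the guided prediction in one forward pass.
\[
  \min_\phi \;
  \mathbb{E}_{x_t,\, c,\, w}
  \bigl\lVert \hat{v}_\phi(x_t, c, w) - v_\theta^{\mathrm{cfg}}(x_t, c) \bigr\rVert^2
\]
```

This is also why the diffusers example below passes `guidance_scale` directly to the pipeline: the scale is an input to the distilled model rather than a weight on two separate predictions.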
# Usage
We provide a reference implementation of `FLUX.1 [dev]`, as well as sampling code, in a dedicated [GitHub repository](https://github.com/black-forest-labs/flux).
Developers and creatives looking to build on top of `FLUX.1 [dev]` are encouraged to use this as a starting point.

## API Endpoints
The FLUX.1 models are also available via API from the following sources:
- [bfl.ml](https://docs.bfl.ml/) (currently `FLUX.1 [pro]`)
- [replicate.com](https://replicate.com/collections/flux)
- [fal.ai](https://fal.ai/models/fal-ai/flux/dev)
- [mystic.ai](https://www.mystic.ai/black-forest-labs/flux1-dev)

## ComfyUI
`FLUX.1 [dev]` is also available in [ComfyUI](https://github.com/comfyanonymous/ComfyUI) for local inference with a node-based workflow.
## Diffusers
To use `FLUX.1 [dev]` with the 🧨 diffusers Python library, first install or upgrade diffusers:
```shell
pip install -U diffusers
```
Then you can use `FluxPipeline` to run the model:
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.enable_model_cpu_offload()  # save some VRAM by offloading the model to CPU; remove this if you have enough GPU memory

prompt = "A cat holding a sign that says hello world"
image = pipe(
    prompt,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]
image.save("flux-dev.png")
```
To learn more, check out the [diffusers](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux) documentation.
---
# Limitations
- This model is not intended or able to provide factual information.
- As a statistical model this checkpoint might amplify existing societal biases.
- The model may fail to generate output that matches the prompts.
- Prompt following is heavily influenced by the prompting style.

# Out-of-Scope Use
The model and its derivatives may not be used:
- In any way that violates any applicable national, federal, state, local or international law or regulation.
- For the purpose of exploiting, harming or attempting to exploit or harm minors in any way; including but not limited to the solicitation, creation, acquisition, or dissemination of child exploitative content.
- To generate or disseminate verifiably false information and/or content with the purpose of harming others.
- To generate or disseminate personal identifiable information that can be used to harm an individual.
- To harass, abuse, threaten, stalk, or bully individuals or groups of individuals.
- To create non-consensual nudity or illegal pornographic content.
- For fully automated decision making that adversely impacts an individual's legal rights or otherwise creates or modifies a binding, enforceable obligation.
- Generating or facilitating large-scale disinformation campaigns.

# License
This model falls under the [`FLUX.1 [dev]` Non-Commercial License](https://huggingface.co/black-forest-labs/FLUX.1-dev/blob/main/LICENSE.md).