# SD-XL 1.0-base GGUF Model Card
Quantized versions of stabilityai/stable-diffusion-xl-base-1.0 in GGUF format, for use with [stable-diffusion.cpp](https://github.com/leejet/stable-diffusion.cpp). At the time of publishing, no ready-made GGUF weights of SDXL for the sd.cpp runtime were available — so here we are.
*Sample generation: "A lovely cat" · seed 357925 · Q8_0 · 1024×1024*
## Available Quantizations
| File | Type | Description |
|---|---|---|
| `sd_xl_base_1.0_0_bf16.gguf` | BF16 | Best quality, largest size |
| `sd_xl_base_1.0_0_Q8_0.gguf` | Q8_0 | Great balance of quality and size ✅ recommended |
| `sd_xl_base_1.0_0_Q4_K.gguf` | Q4_K | Smaller size, good quality |
| `sd_xl_base_1.0_0_Q4_0.gguf` | Q4_0 | Smallest size |
## Quick Start

### 1. Download the model
```bash
# Recommended — Q8_0
wget https://huggingface.co/kostakoff/stable-diffusion-xl-base-1.0-GGUF/resolve/main/sd_xl_base_1.0_0_Q8_0.gguf

# Other quantizations:
# wget https://huggingface.co/kostakoff/stable-diffusion-xl-base-1.0-GGUF/resolve/main/sd_xl_base_1.0_0_bf16.gguf
# wget https://huggingface.co/kostakoff/stable-diffusion-xl-base-1.0-GGUF/resolve/main/sd_xl_base_1.0_0_Q4_K.gguf
# wget https://huggingface.co/kostakoff/stable-diffusion-xl-base-1.0-GGUF/resolve/main/sd_xl_base_1.0_0_Q4_0.gguf
```
### 2. Build stable-diffusion.cpp

Requirements: CUDA-capable GPU, CMake ≥ 3.18, CUDA Toolkit.
```bash
git clone https://github.com/leejet/stable-diffusion.cpp
cd stable-diffusion.cpp
git submodule init
git submodule update
mkdir build && cd build
cmake .. -DSD_CUDA=ON
cmake --build . --config Release
```
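A quick way to confirm the build succeeded is to check that the two binaries were produced under `build/bin` (a minimal sketch; it assumes you run it from the `stable-diffusion.cpp` checkout directory):

```bash
# Sanity check: the build should have produced the server and CLI
# binaries under build/bin.
for bin in sd-server sd-cli; do
  if [ -x "build/bin/$bin" ]; then
    echo "$bin: ok"
  else
    echo "$bin: not found (check the cmake output for errors)"
  fi
done
```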
Version used for conversion and testing: `stable-diffusion.cpp` `master-520-d950627` (commit `d950627`).
### 3. Start the server
```bash
export CUDA_VISIBLE_DEVICES=0
./stable-diffusion.cpp/build/bin/sd-server \
  -m ./sd_xl_base_1.0_0_Q8_0.gguf \
  --vae-on-cpu \
  --listen-ip 0.0.0.0 \
  --listen-port 8081
```
> ⚠️ The `--vae-on-cpu` flag is required! The VAE decoder consumes up to 10 GB of VRAM when decoding the latent representation into the final image. Offloading the VAE to the CPU makes it possible to run the model on most consumer GPUs.
### 4. Generate an image
```bash
curl -s http://127.0.0.1:8081/v1/images/generations \
  -H "Content-Type: application/json" \
  -d '{
    "model": "sdxl",
    "prompt": "A lovely cat<sd_cpp_extra_args>{\"seed\": 357925}</sd_cpp_extra_args>",
    "n": 1,
    "size": "1024x1024",
    "response_format": "b64_json"
  }' | jq -r '.data[0].b64_json' | base64 --decode > out.png
```

Extra parameters are passed via a `<sd_cpp_extra_args>` tag: a JSON snippet embedded directly in the prompt field.
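Because that JSON lives inside the prompt string, escaping it by hand inside a `-d '…'` literal is error-prone. One way to build the request body safely is with `jq` (already used above to extract the image); the prompt and seed below are just the example values from this card:

```bash
# Build the request body with jq so the JSON embedded in the prompt
# is escaped correctly.
extra='{"seed": 357925}'
body=$(jq -n --arg prompt "A lovely cat<sd_cpp_extra_args>${extra}</sd_cpp_extra_args>" \
  '{model: "sdxl", prompt: $prompt, n: 1, size: "1024x1024", response_format: "b64_json"}')
echo "$body"
```

The resulting `$body` can then be sent with `curl -s http://127.0.0.1:8081/v1/images/generations -H "Content-Type: application/json" -d "$body"` exactly as in the example above.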
## How the weights were created

Converted from the original `sd_xl_base_1.0_0.9vae.safetensors` weights using the built-in `sd-cli` conversion tool:
```bash
# Q8_0
./stable-diffusion.cpp/build/bin/sd-cli -M convert \
  -m ~/llm/models/sdxl/sd_xl_base_1.0_0.9vae.safetensors \
  -o ./sd_xl_base_1.0_0_Q8_0.gguf -v --type q8_0

# BF16
./stable-diffusion.cpp/build/bin/sd-cli -M convert \
  -m ~/llm/models/sdxl/sd_xl_base_1.0_0.9vae.safetensors \
  -o ./sd_xl_base_1.0_0_bf16.gguf -v --type bf16

# Q4_0
./stable-diffusion.cpp/build/bin/sd-cli -M convert \
  -m ~/llm/models/sdxl/sd_xl_base_1.0_0.9vae.safetensors \
  -o ./sd_xl_base_1.0_0_Q4_0.gguf -v --type q4_0

# Q4_K
./stable-diffusion.cpp/build/bin/sd-cli -M convert \
  -m ~/llm/models/sdxl/sd_xl_base_1.0_0.9vae.safetensors \
  -o ./sd_xl_base_1.0_0_Q4_K.gguf -v --type q4_K
```
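The four conversions follow the same pattern, so they can also be scripted as a loop. The sketch below only echoes each command so it can be inspected first (drop the `echo` to actually run them); the `type:suffix` pairs map the `--type` values to the output-file suffixes used above:

```bash
SRC=~/llm/models/sdxl/sd_xl_base_1.0_0.9vae.safetensors
# type:suffix pairs matching the file names published in this repo
for spec in q8_0:Q8_0 bf16:bf16 q4_0:Q4_0 q4_K:Q4_K; do
  type=${spec%%:*}
  suffix=${spec#*:}
  echo ./stable-diffusion.cpp/build/bin/sd-cli -M convert \
    -m "$SRC" -o "./sd_xl_base_1.0_0_${suffix}.gguf" -v --type "$type"
done
```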
## License

This model inherits the license of the original model: CreativeML Open RAIL++-M.