can you please provide an example?
Could you please provide a running example code with diffusers to use this model? Thank you so much.
Sure, I'm working on a diffusers recipes repo here, but I still haven't had the time to update it with this model, so here's a code example in the meantime:
import torch
from diffusers import QwenImagePipeline, QwenImageTransformer2DModel
from transformers import Qwen2_5_VLForConditionalGeneration

torch_dtype = torch.bfloat16

# Load the quantized models one at a time and park them on the CPU to free VRAM
transformer = QwenImageTransformer2DModel.from_pretrained(
    "OzzyGT/Qwen-Image-2512-bnb-4bit-transformer", torch_dtype=torch_dtype, device_map="cpu"
)
text_encoder = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "OzzyGT/Qwen-Image-2512-bnb-4bit-text-encoder", torch_dtype=torch_dtype, device_map="cpu"
)

pipe = QwenImagePipeline.from_pretrained(
    "Qwen/Qwen-Image-2512", transformer=transformer, text_encoder=text_encoder, torch_dtype=torch_dtype
)
# Offloading moves each model to the GPU only while it is needed during inference
pipe.enable_model_cpu_offload()

prompt = """A photograph that captures a young woman on a city rooftop, with a hazy city skyline in the background. She has long, dark hair that naturally drapes over her shoulders and is wearing a simple tank top. Her posture is relaxed, with her hands resting on the railing in front of her, leaning slightly forward as she looks directly into the camera. The sunlight, coming from behind her at an angle, creates a soft backlight effect that casts a warm golden halo around the edges of her hair and shoulders. This light also produces a slight lens flare, adding a dreamy quality to the image. The city buildings in the background are blurred by the backlight, emphasizing the main subject. The overall tone is warm, evoking a sense of tranquility and a hint of melancholy."""

# Chinese negative prompt; roughly: "low resolution, low quality, deformed limbs,
# deformed fingers, oversaturated, waxy look, faces without detail, overly smooth,
# AI look. Chaotic composition. Blurry, distorted text."
negative_prompt = "低分辨率,低画质,肢体畸形,手指畸形,画面过饱和,蜡像感,人脸无细节,过度光滑,画面具有AI感。构图混乱。文字模糊,扭曲。"

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=1664,
    height=928,
    num_inference_steps=28,
    true_cfg_scale=4.0,
    generator=torch.Generator(device="cuda").manual_seed(42),
).images[0]
image.save("qwen-image_output.png")
Does it really work in CPU mode?
You're assuming something wrong: that's the loading, not the inference. Using pipe.enable_model_cpu_offload() moves each model to the GPU only when it is needed at inference.
No, what I meant is that bitsandbytes needs a GPU, so the whole code base won't run on CPU only (without CUDA). All good.
Not sure I understand your issue, but glad it's all good.
Just in case: bnb loads the models to the GPU, but here I'm assuming that if you're using quantization it's because you don't have enough VRAM, so the way to overcome this is to load one model (which will load to the GPU) and move it to the "cpu" to free the VRAM, and then load the other in the same way. After you do this, the CPU offloading works and is in charge of moving the respective models between "cpu" and "gpu" when needed.
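As a generic sketch of that pattern (toy nn.Linear modules standing in for the real models; free_vram_after_load is a hypothetical helper for illustration, not a diffusers API):

```python
import torch

def free_vram_after_load(model: torch.nn.Module) -> torch.nn.Module:
    """Park a freshly loaded model on the CPU so the next load has free VRAM."""
    model.to("cpu")
    if torch.cuda.is_available():
        torch.cuda.empty_cache()  # return the cached VRAM to the driver
    return model

# Toy stand-ins for the quantized transformer and text encoder
transformer = free_vram_after_load(torch.nn.Linear(16, 16))
text_encoder = free_vram_after_load(torch.nn.Linear(16, 16))
print(next(transformer.parameters()).device)  # cpu
```

Once both models sit on the CPU, enable_model_cpu_offload() takes over and shuttles each one to the GPU only for its step of the pipeline.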
Thank you so much!!
OzzyGT/Qwen-Image-2512-bnb-4bit-transformer: how did you quantize that? I don't see the AI Toolkit watermark; did you just use bnb to get the quant, or was a specific framework used?
I used diffusers and bnb, no additional framework.