Lots of CLIP (and also unet?) keys missing/failed to load, though the output is totally fine...

#2
by ReloadProcz103 - opened

In ComfyUI Terminal:

got prompt
Found quantization metadata version 1
Detected mixed precision quantization
Using mixed precision operations
model weight dtype torch.bfloat16, manual cast: torch.bfloat16
model_type FLUX
unet unexpected: [(lots of keys)]
VAE load device: cuda:0, offload device: cpu, dtype: torch.bfloat16
no CLIP/text encoder weights in checkpoint, the text encoder model will not be loaded.
clip missing: ['gemma3_12b.logit_scale',...,(lots of keys missing because failed to load?)]

However, the output is totally fine. Is there any way to make those annoying unet/clip "keys missing / not loaded" warnings go away?

Owner

Not entirely sure. The Gemma 3 model in this repo has the exact same weight names as the one from the original Comfy repo; the only difference is that the attention and MLP weight layers are stored in fp8_e4m3fn precision. If you're using the normal Comfy workflow with this fp8 model instead of the fp16 one, the warnings will probably show up with the fp16 version as well.
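You can verify the "same names, different precision" claim yourself without loading any tensors, since the safetensors format keeps a JSON header of key names, dtypes, and shapes at the front of the file. A minimal sketch, with two made-up in-memory headers standing in for the fp16 and fp8_e4m3fn checkpoints (the key name and shapes are illustrative, not taken from the real files):

```python
import json
import struct

def read_safetensors_header(blob: bytes) -> dict:
    """Parse the JSON header of a safetensors blob: first 8 bytes are a
    little-endian u64 giving the header length, then the JSON itself."""
    (header_len,) = struct.unpack("<Q", blob[:8])
    return json.loads(blob[8 : 8 + header_len].decode())

def make_blob(header: dict) -> bytes:
    """Build a header-only stand-in blob (no tensor payload needed here)."""
    raw = json.dumps(header).encode()
    return struct.pack("<Q", len(raw)) + raw

# Hypothetical headers: one fp16 checkpoint, one fp8_e4m3fn checkpoint.
fp16 = make_blob({"attn.q.weight": {"dtype": "F16", "shape": [4, 4], "data_offsets": [0, 32]}})
fp8 = make_blob({"attn.q.weight": {"dtype": "F8_E4M3", "shape": [4, 4], "data_offsets": [0, 16]}})

h16, h8 = read_safetensors_header(fp16), read_safetensors_header(fp8)
# Identical key names, so the loader matches weights the same way in both;
# only the storage dtype of the attention/MLP layers differs.
assert h16.keys() == h8.keys()
print({k: (h16[k]["dtype"], h8[k]["dtype"]) for k in h16})
```

On a real file you'd read the first `8 + header_len` bytes from disk instead of building the blob in memory; the same parsing applies.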

The checkpoint loader node works by opening the file and searching for the available submodels (diffusion model, CLIP, VAE). The LTX 2 checkpoint includes the VAE but not the text encoder, so the loader warns about that. This might be resolved in the future with a combined model (with CLIP included in the checkpoint) or with separate models (the way most people run other recent models like Wan, Flux, etc.).
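The submodel search described above can be sketched as grouping checkpoint keys by common prefix. This is a hedged illustration, not ComfyUI's actual loader code, and the prefix strings here are assumptions for the example:

```python
def detect_submodels(keys):
    """Group checkpoint keys by illustrative submodel prefixes."""
    prefixes = {
        "diffusion_model": ("model.diffusion_model.",),
        "clip": ("text_encoders.", "cond_stage_model."),
        "vae": ("first_stage_model.", "vae."),
    }
    found = {name: [] for name in prefixes}
    for key in keys:
        for name, pres in prefixes.items():
            if key.startswith(pres):  # startswith accepts a tuple of prefixes
                found[name].append(key)
    # Only report submodels that actually have keys in the file.
    return {name: ks for name, ks in found.items() if ks}

# Example: a checkpoint carrying a diffusion model and a VAE but no text
# encoder, like the LTX 2 case above -- no "clip" entry comes back, which
# is what triggers the "no CLIP/text encoder weights" warning.
keys = [
    "model.diffusion_model.blocks.0.attn.weight",
    "vae.decoder.conv_in.weight",
]
print(detect_submodels(keys))
```

In other words, the warnings are just the loader reporting which submodels it could and couldn't find in the file, not a sign the loaded weights are wrong.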
