Please correct config.json
#8
by Geonmo - opened
The `projection_dim` field is required but missing from the current config.json. As a result, `CLIPVisionModelWithProjection` and `CLIPTextModelWithProjection` cannot be used with this model.
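A minimal sketch of the intended fix (a fragment, not the full config; the value 1280 is inferred from the checkpoint's `visual_projection.weight` shape of (1280, 1664), and the elided fields stay as they are in the repo):

```json
{
  "model_type": "clip",
  "projection_dim": 1280
}
```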
Geonmo changed pull request title from Update config.json to Please correct config.json
import torch
from transformers import CLIPVisionModelWithProjection
model_name = 'laion/CLIP-ViT-bigG-14-laion2B-39B-b160k'
clip_vision_model = CLIPVisionModelWithProjection.from_pretrained(model_name, torch_dtype=torch.float16)
then the following error occurs:
`text_config_dict` is provided which will be used to initialize `CLIPTextConfig`. The value `text_config["id2label"]` will be overriden.
Loading checkpoint shards: 100%|████████████████████████████████| 2/2 [00:07<00:00, 3.51s/it]
RuntimeError: Error(s) in loading state_dict for CLIPVisionModelWithProjection:
size mismatch for visual_projection.weight: copying a param with shape torch.Size([1280, 1664]) from checkpoint, the shape in current model is torch.Size([512, 1664]).
You may consider adding `ignore_mismatched_sizes=True` in the model `from_pretrained` method.
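Note that `ignore_mismatched_sizes=True` would only silence the error by randomly re-initializing the projection layer, so it is not a real workaround. The root cause can be seen from the config defaults alone: when config.json omits `projection_dim`, `CLIPVisionConfig` falls back to its default of 512, which conflicts with the checkpoint's 1280-dim projection weights. A small sketch (no model download needed; the 1664/1280 values are taken from the error message above):

```python
from transformers import CLIPVisionConfig

# When config.json omits projection_dim, the config falls back to the
# library default of 512 -> visual_projection gets shape (512, hidden_size),
# which mismatches the checkpoint's (1280, 1664) weights.
default_cfg = CLIPVisionConfig()
print(default_cfg.projection_dim)  # 512

# With projection_dim set explicitly (as the corrected config.json should),
# the projection head is built with the shape the checkpoint expects.
fixed_cfg = CLIPVisionConfig(hidden_size=1664, projection_dim=1280)
print(fixed_cfg.projection_dim)  # 1280
```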
fixed
rwightman changed pull request status to closed