KeyError when trying to use the model
I am trying to follow the instructions on Hugging Face to load the model:
from transformers import AutoTokenizer, AutoModelForVision2Seq
tokenizer = AutoTokenizer.from_pretrained("dh-unibe/trocr-kurrent")
model = AutoModelForVision2Seq.from_pretrained("dh-unibe/trocr-kurrent")
But I always get:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/processing_auto.py", line 264, in from_pretrained
return processor_class.from_pretrained(
File "/usr/local/lib/python3.10/dist-packages/transformers/processing_utils.py", line 184, in from_pretrained
args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/transformers/processing_utils.py", line 228, in _get_arguments_from_pretrained
args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/tokenization_auto.py", line 674, in from_pretrained
tokenizer_class_py, tokenizer_class_fast = TOKENIZER_MAPPING[type(config)]
File "/usr/local/lib/python3.10/dist-packages/transformers/models/auto/auto_factory.py", line 599, in __getitem__
raise KeyError(key)
KeyError: <class 'transformers.models.vision_encoder_decoder.configuration_vision_encoder_decoder.VisionEncoderDecoderConfig'>
Am I doing something wrong?
Sorry, it does not work, and it's not your fault!
It's a version problem: the tokenizer/processor part of this model is too old, and files are missing. I tried adding tokenizer_config.json, but vocab.json is still missing, and I'm not sure it makes sense to copy it from another model (like dh-unibe/trocr-kurrent-XVI-XVII).
Downgrading transformers (the model was trained with version 4.26.0), torch, or Python did not help either.
You can try using the processor from the base model (or from kurrent-XVI-XVII). The VisionEncoderDecoderModel itself loads fine:
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("dh-unibe/trocr-kurrent")
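For completeness, a minimal inference sketch using this workaround could look like the following (the image path is hypothetical; substitute your own text-line image, and note this downloads both repos from the Hub):

```python
from PIL import Image
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

# Workaround: borrow the processor from the base model, since the
# tokenizer files in dh-unibe/trocr-kurrent are incomplete.
processor = TrOCRProcessor.from_pretrained("microsoft/trocr-base-handwritten")
model = VisionEncoderDecoderModel.from_pretrained("dh-unibe/trocr-kurrent")

# "line.png" is a placeholder for a single cropped text-line image.
image = Image.open("line.png").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

generated_ids = model.generate(pixel_values)
text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(text)
```

Whether the base model's tokenizer matches the vocabulary this checkpoint was fine-tuned with is exactly the open question, so the decoded output should be treated with caution.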
To be honest, I'm not sure how much this affects the inference results.
In any case, we'll have to train a newer version with the training material we have.
Thanks for your reply!
Yeah, I figured it was an issue with the library versions; I tried transformers 4.19, 4.26, and 4.55, all with the same result.
I will take a look at your proposed workaround this weekend, maybe the results are good enough for me!