Info
This is a heavily fine-tuned variant of LFM2-VL-1.6B for OCR'ing Chinese text in images.
Expected input format
This model expects a single-turn conversation in which the user message contains a specific text instruction followed by the image, e.g.:
messages = [
    {
        'role': 'user',
        'content': [
            {
                'type': 'text',
                'text': "OCR the Chinese text in this image without any explanation.",
            },
            {
                'type': 'image_url',
                'image_url': {
                    'url': image_to_base64(image),
                }
            }
        ]
    }
]
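The snippet above calls `image_to_base64`, which this card does not define. The following is a minimal sketch of what such a helper might look like. It assumes that the `url` field should hold a base64 data URL and that `image` is either already-encoded image bytes or a PIL-style object with a `save(fileobj, format=...)` method. The `mime_type` parameter is a hypothetical addition for illustration, not part of the original code.

```python
import base64
import io


def image_to_base64(image, mime_type="image/png"):
    """Return a base64 data URL for `image`.

    Assumed implementation: the model card does not show the original helper.
    Accepts encoded image bytes, or a PIL-style object exposing
    save(fileobj, format=...).
    """
    if isinstance(image, (bytes, bytearray)):
        # The caller already holds the encoded file contents.
        raw = bytes(image)
    else:
        # Serialize a PIL-style image into an in-memory buffer.
        buffer = io.BytesIO()
        # Derive the format name (e.g. "PNG") from the MIME subtype.
        image.save(buffer, format=mime_type.split("/")[1].upper())
        raw = buffer.getvalue()
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

With a helper like this in scope, the `messages` list above can be built from an image loaded in memory. Whether the model's serving stack accepts data URLs in this exact form is an assumption to verify against the runtime you use.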
This model was NOT trained to OCR entire pages of text as-is. For best results, pass in an image containing a single line of text.