Info

This is a heavily fine-tuned variant of LFM2-VL-1.6B for performing OCR on Chinese text in images.

Expected input format

This model expects a single-turn conversation in which the user supplies a specific text instruction followed by the image, e.g.:

messages = [
    {
        'role': 'user',
        'content': [
            { 
                'type': 'text',
                'text': "OCR the Chinese text in this image without any explanation.",
            },
            {
                'type': 'image_url',
                'image_url': {
                    'url': image_to_base64(image),
                }
            }
        ]
    }
]
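The snippet above calls `image_to_base64`, which this card does not define. A minimal sketch of such a helper follows. It assumes that `image` is the raw bytes of an encoded image file and that the consuming runtime accepts a base64 data URL in the `url` field. Both are assumptions, not something the card confirms.

```python
import base64


def image_to_base64(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Hypothetical helper: encode raw image bytes as a base64 data URL.

    Assumes the caller passes the bytes of an already-encoded image file
    (PNG by default) rather than a decoded image object.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

With this sketch, `image` in the message above would be obtained by reading the file in binary mode, for example `open(path, "rb").read()`, where `path` points at the cropped line image.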

This model was NOT trained to OCR entire pages of text as-is. For best results, pass in an image containing a single line of text.
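One possible way to get single-line crops from a full page is a horizontal projection profile, sketched below. This is not the author's preprocessing. It assumes horizontally written text on a clean light background, with the page already available as a grayscale grid (a list of rows of 0-255 values). The function names and the threshold are illustrative.

```python
def rows_with_ink(pixels, threshold=128):
    """Mark each pixel row True if it contains at least one dark pixel."""
    return [any(value < threshold for value in row) for row in pixels]


def split_text_lines(ink_rows):
    """Return (start, end) row ranges, end exclusive, for each run of inked rows."""
    ranges = []
    start = None
    for y, has_ink in enumerate(ink_rows):
        if has_ink and start is None:
            start = y  # a new text line begins
        elif not has_ink and start is not None:
            ranges.append((start, y))  # blank row closes the current line
            start = None
    if start is not None:
        ranges.append((start, len(ink_rows)))  # line runs to the bottom edge
    return ranges
```

Each returned range can then be used to crop one strip from the page, and each strip sent to the model as its own request. Vertically written text, skewed scans, or noisy backgrounds would need a different approach.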

Model details

Format: GGUF
Model size: 1B params
Architecture: lfm2