GOT-OCR 2.0 — Hindi Fine-Tuned (CPU-Compatible)

This is a fine-tuned version of GOT-OCR 2.0 specifically trained for Hindi text extraction from images.

Key Features

🇮🇳 Hindi OCR: Fine-tuned on 80K synthetic Hindi line images paired with their text
💻 CPU Compatible: All CUDA dependencies removed, runs on CPU
🔀 LoRA Merged: Standalone model, no adapter loading needed
🎨 LoRA Config: rank=8, alpha=32, dropout=0.05
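
To make "LoRA Merged" concrete, here is a minimal sketch of what merging an adapter into a base weight generally means: the low-rank update B·A is scaled by alpha/rank and added to the frozen weight, so no adapter is needed at load time. Only rank=8 and alpha=32 come from this card; the toy matrices and the helper names `matmul` and `merge_lora` are illustrative assumptions, not the code used to build this model. Dropout (0.05) applies only during training and plays no role in the merged weights.

```python
# Illustrative LoRA merge on a tiny layer (pure Python, no dependencies).
# Only RANK and ALPHA come from the model card; all matrices are made up.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def merge_lora(weight, lora_b, lora_a, rank, alpha):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]

RANK, ALPHA = 8, 32

# Toy 2x2 frozen weight; B is 2x8 and A is 8x2 (hypothetical values).
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[0.25] * RANK, [0.0] * RANK]
lora_a = [[0.5, 0.0] for _ in range(RANK)]

merged = merge_lora(weight, lora_b, lora_a, RANK, ALPHA)
print(ALPHA / RANK)  # scaling factor applied to the low-rank update: 4.0
print(merged)        # [[5.0, 0.0], [0.0, 1.0]]
```

With these settings the low-rank update is scaled by 4 before it is folded into the base weight.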

Usage

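from transformers import AutoModel, AutoTokenizer
import torch

model_name = "Solo448/GOT-2.0-hindi_CPU"

# trust_remote_code=True is needed because model.chat() comes from the model's own code.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Load the weights as float32 for CPU inference and switch to evaluation mode.
model = AutoModel.from_pretrained(
    model_name,
    trust_remote_code=True,
    low_cpu_mem_usage=True,
    torch_dtype=torch.float32
).eval()

# Run plain-text OCR on a single image.
result = model.chat(tokenizer, "path/to/hindi_image.png", ocr_type="ocr")
print(result)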
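
For more than one image, the call above can be wrapped in a small loop. The sketch below is one possible way to do that; `ocr_directory`, the `IMAGE_SUFFIXES` set, and the `scans/` folder are hypothetical names introduced for illustration. It takes the OCR call as a plain function so that it stays independent of how the model was loaded.

```python
from pathlib import Path

# Assumed set of input formats; adjust to whatever your images use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}

def ocr_directory(run_ocr, folder):
    """Apply run_ocr(path_string) to every image file in folder, in sorted order.

    Returns a dict mapping file name to the text returned by run_ocr.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = run_ocr(str(path))
    return results

# With `model` and `tokenizer` loaded as in the Usage snippet:
# texts = ocr_directory(
#     lambda p: model.chat(tokenizer, p, ocr_type="ocr"),
#     "scans/",  # hypothetical folder of Hindi line images
# )
# for name, text in texts.items():
#     print(name, text)
```

Images are processed one at a time, which keeps memory use predictable on CPU.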

Training Details

Base Model: stepfun-ai/GOT-OCR2_0
Training Framework: ms-swift (ModelScope Swift)
Fine-Tuning Method: LoRA (rank=8, alpha=32)
Dataset: Hindi OCR Synthetic Line Image-Text Pair
Training Steps: 4000
Hardware: Kaggle T4 GPU
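
The batch size is not stated here, so the number of passes over the dataset cannot be read off directly. As a rough aid, the sketch below relates the 4000 training steps to the roughly 80K pairs for a few hypothetical effective batch sizes; the batch sizes are assumptions for illustration only, and "80K" is treated as exactly 80,000.

```python
DATASET_SIZE = 80_000   # image-text pairs, from the model card ("80K", taken as exact)
TRAINING_STEPS = 4_000  # from the model card

def epochs_seen(effective_batch_size):
    """Approximate number of passes over the dataset for a given effective batch size."""
    return TRAINING_STEPS * effective_batch_size / DATASET_SIZE

# Hypothetical effective batch sizes (per-device batch x gradient accumulation).
for batch in (1, 4, 16, 20):
    print(f"batch={batch:>2}  epochs={epochs_seen(batch):.2f}")
```

Under these assumptions, an effective batch size of 20 would correspond to one full pass over the data.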

Limitations

Primarily optimized for printed Hindi text (not handwritten)
CPU inference is slower than GPU inference (roughly 10-30 s per image on CPU)
Best results on clean, high-contrast document images
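
Per-image latency depends heavily on the CPU and the image, so it is worth measuring on your own machine. A minimal timing sketch follows; `timed` is a hypothetical helper name, and the commented call assumes `model` and `tokenizer` were loaded as in the Usage section.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# With `model` and `tokenizer` loaded as in the Usage snippet:
# text, seconds = timed(model.chat, tokenizer, "path/to/hindi_image.png", ocr_type="ocr")
# print(f"OCR took {seconds:.1f} s")
```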

Safetensors

Model size: 0.6B params
Tensor type: BF16
