GOT-OCR 2.0: Hindi Fine-Tuned (CPU-Compatible)

This is a fine-tuned version of GOT-OCR 2.0 specifically trained for Hindi text extraction from images.

Key Features

  • 🇮🇳 Hindi OCR: Fine-tuned on 80K synthetic Hindi line image-text pairs
  • 💻 CPU Compatible: All CUDA dependencies removed, runs on CPU
  • 🔀 LoRA Merged: Standalone model, no adapter loading needed
  • 🎨 LoRA Config: rank=8, alpha=32, dropout=0.05
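
To illustrate what "LoRA Merged" means, here is a minimal sketch of the standard LoRA merge rule, W_merged = W + (alpha / rank) * (B @ A), in plain Python on toy matrices. The helper names (`matmul`, `merge_lora`) and the matrix values are hypothetical and are not taken from this model's training code. Only rank=8 and alpha=32 come from the configuration listed here, which gives a scaling factor of 32 / 8 = 4.

```python
# Toy illustration of folding a LoRA update into a base weight matrix.
# Standard LoRA rule: W_merged = W + (alpha / rank) * (B @ A)
# Helper names and matrix values are illustrative, not from the model's code.

RANK = 8                 # from the model card
ALPHA = 32               # from the model card
SCALING = ALPHA / RANK   # 4.0


def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def merge_lora(w, b, a, scaling=SCALING):
    """Return w + scaling * (b @ a) without modifying the inputs."""
    delta = matmul(b, a)
    return [
        [w[i][j] + scaling * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


# A 2x2 base weight with a toy adapter. The adapter has a single inner
# dimension for brevity but reuses the card's scaling factor of 4.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]    # shape (out_features, r)
a = [[0.5, 0.25]]     # shape (r, in_features)

merged = merge_lora(w, b, a)
print(merged)
```

Once the update is folded into the weights this way, the adapter matrices are no longer needed at load time, which is why a merged checkpoint can be loaded like an ordinary model.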

Usage

from transformers import AutoModel, AutoTokenizer
import torch

model_name = "Solo448/GOT-2-hindi_CPU"
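tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
# Load in float32 for CPU inference; no CUDA device is required.
model = AutoModel.from_pretrained(
    model_name,
    trust_remote_code=True,  # the model's custom code ships with the repository
    low_cpu_mem_usage=True,
    torch_dtype=torch.float32,
).eval()

# ocr_type="ocr" requests plain-text output.
result = model.chat(tokenizer, "path/to/hindi_image.png", ocr_type="ocr")
print(result)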
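
For more than one image, the single call above can be wrapped in a small loop. The following is a sketch only: `ocr_directory`, `IMAGE_SUFFIXES`, and the directory path are hypothetical names introduced for illustration and are not part of the model's API. The OCR call is passed in as a plain function so the helper does not depend on how the model was loaded.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable

# Hypothetical default; adjust to the image formats you actually use.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg")


def ocr_directory(
    ocr_fn: Callable[[str], str],
    directory: str,
    suffixes: Iterable[str] = IMAGE_SUFFIXES,
) -> Dict[str, str]:
    """Run ocr_fn on every image file in directory; map file name -> text."""
    allowed = {s.lower() for s in suffixes}
    results = {}
    # Sort for a stable, reproducible processing order.
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in allowed:
            results[path.name] = ocr_fn(str(path))
    return results


# With `model` and `tokenizer` loaded as in the snippet above:
# texts = ocr_directory(
#     lambda p: model.chat(tokenizer, p, ocr_type="ocr"),
#     "path/to/images",
# )
# for name, text in texts.items():
#     print(name, text)
```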

Training Details

  • Base Model: stepfun-ai/GOT-OCR2_0
  • Training Framework: ms-swift (ModelScope Swift)
  • Fine-Tuning Method: LoRA (rank=8, alpha=32)
  • Dataset: Hindi OCR Synthetic Line Image-Text Pair
  • Training Steps: 4000
  • Hardware: Kaggle T4 GPU

Limitations

  • Primarily optimized for printed Hindi text (not handwritten)
  • CPU inference is slower than GPU inference (roughly 10-30 s per image on CPU)
  • Best results on clean, high-contrast document images
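
Since the per-image latency quoted above will vary with hardware and image size, it may be worth measuring it on your own machine. Below is a minimal sketch using only the standard library. `time_calls` and `summarize` are hypothetical helper names, and the OCR call is passed in as a function so the timing code stays independent of the model.

```python
import time
from statistics import mean
from typing import Callable, Dict, List


def time_calls(fn: Callable[[str], str], image_paths: List[str]) -> List[float]:
    """Return the wall-clock seconds fn takes on each path, in order."""
    durations = []
    for path in image_paths:
        start = time.perf_counter()
        fn(path)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Report the mean and worst-case latency in seconds."""
    return {"mean": mean(durations), "max": max(durations)}


# With `model` and `tokenizer` loaded as in the Usage section:
# secs = time_calls(
#     lambda p: model.chat(tokenizer, p, ocr_type="ocr"),
#     ["path/to/hindi_image.png"],
# )
# print(summarize(secs))
```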