Urdu Poetry TrOCR (Iqbal Edition)
This model is a fine-tuned version of TrOCR (Transformer-based Optical Character Recognition) specifically optimized for Urdu Nastaliq script. It was trained on a specialized dataset of poetry by Allama Iqbal to master the complex ligatures, overlapping characters, and right-to-left (RTL) flow of classical Urdu calligraphy.
🚀 Model Evolution & Performance
This final version (V2) substantially improves the handling of Urdu cursive script. By optimizing the vision-to-text alignment, it addresses several common OCR failure modes:
- Reading Direction: Correctly processes RTL text flow.
- Word Continuity: Eliminates "split words" and random character insertions.
- Poetic Coherence: Transcribes full couplets with high linguistic accuracy.
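The improvements listed above can be checked on your own samples with a character-level metric. The sketch below computes character error rate (CER) from a plain Levenshtein distance. It is an illustration only: this card does not state which metric was used to evaluate the model, and the function names `edit_distance` and `character_error_rate` are hypothetical helpers, not part of the model's API.

```python
import unicodedata


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """CER = edit distance / reference length, after NFC normalization.

    Assumption: NFC normalization is applied so that visually identical
    strings with different code point sequences are compared fairly.
    """
    ref = unicodedata.normalize("NFC", reference)
    hyp = unicodedata.normalize("NFC", hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


if __name__ == "__main__":
    reference = "خودی کو کر بلند"
    # A "split word" error: one stray space inserted inside the first word.
    hypothesis = reference[:2] + " " + reference[2:]
    print(f"CER: {character_error_rate(reference, hypothesis):.3f}")
```

A single stray space inside a word counts as one insertion, so "split word" errors show up directly in the score. Because the comparison runs on logical (stored) character order, a transcription emitted in the wrong reading direction would also score poorly.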
📊 Visual Performance Gallery (Sample Results)
🛠️ Usage & Implementation
For best results, we recommend the following inference configuration.
Python Example
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image
import torch

# Load the fine-tuned model
processor = TrOCRProcessor.from_pretrained("Khurram123/urdu-poetry-trocr-iqbal")
model = VisionEncoderDecoderModel.from_pretrained("Khurram123/urdu-poetry-trocr-iqbal")

# Load image and prepare pixels
image = Image.open("sample_poetry.jpg").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

# Optimized generation parameters for Urdu Nastaliq
generated_ids = model.generate(
    pixel_values,
    max_length=128,
    num_beams=7,             # Higher beams for complex ligature search
    repetition_penalty=3.0,  # Prevents character looping
    length_penalty=1.5,      # Encourages completion of full poetic lines
    early_stopping=False,
)

# Decode output
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(f"OCR Result: {transcription}")
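
To transcribe several line images at once, the same generation settings can be reused in a small batching loop. This is a minimal sketch rather than an official utility: `batched`, `transcribe_paths`, `GENERATION_KWARGS`, and the default `batch_size` of 4 are hypothetical names and values chosen for illustration. It assumes `processor` and `model` have already been loaded as in the example above.

```python
from typing import Iterator, List, Sequence

# Same generation settings as the single-image example in this card.
GENERATION_KWARGS = dict(
    max_length=128,
    num_beams=7,
    repetition_penalty=3.0,
    length_penalty=1.5,
    early_stopping=False,
)


def batched(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def transcribe_paths(paths: Sequence[str], processor, model,
                     batch_size: int = 4) -> List[str]:
    """Transcribe image files in batches; returns one string per path."""
    # Imported here so the batching helper above has no heavy dependencies.
    import torch
    from PIL import Image

    results: List[str] = []
    for batch in batched(paths, batch_size):
        images = [Image.open(p).convert("RGB") for p in batch]
        pixel_values = processor(images=images, return_tensors="pt").pixel_values
        pixel_values = pixel_values.to(model.device)
        with torch.no_grad():
            generated_ids = model.generate(pixel_values, **GENERATION_KWARGS)
        results.extend(
            processor.batch_decode(generated_ids, skip_special_tokens=True)
        )
    return results
```

With beam search at `num_beams=7`, memory use grows quickly with batch size, so a small `batch_size` is a cautious starting point on limited hardware.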