Paper: DeepSeek-OCR 2: Visual Causal Flow (arXiv:2601.20552)
LoRA adapter fine-tuned from deepseek-ai/DeepSeek-OCR-2 for Urdu OCR on a small subset of PuristanLabs1/urdu-ocr-1M.
- Base model: deepseek-ai/DeepSeek-OCR-2
- nastaliq
- 800 / 80 samples

This is a small adapter-focused experiment meant to improve Urdu transcription quality without uploading a full model checkpoint.
This repo contains only the LoRA adapter weights, not a full checkpoint. You can load it directly: PEFT reads the base model name from the adapter config and loads the base model automatically before attaching the adapter.
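import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

MODEL_ID = "kingabzpro/deepseek-ocr-2-urdu-ocr-1m-lora"

# trust_remote_code is needed because the base model ships custom modeling code.
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# Loads the base model named in the adapter config, then attaches the adapter.
# flash_attention_2 requires the flash-attn package to be installed.
model = AutoPeftModelForCausalLM.from_pretrained(
    MODEL_ID,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,
    use_safetensors=True,
    _attn_implementation="flash_attention_2",
)

# Switch to inference mode and move to the GPU (a CUDA device is required).
model = model.eval().cuda()
model.config.use_cache = True

prompt = "<image>\nFree OCR. "

# infer() comes from the base model's remote code; it runs OCR on one image file.
result = model.infer(
    tokenizer,
    prompt=prompt,
    image_file="sample.png",
    output_path="ocr-output",
    base_size=1024,
    image_size=768,
    crop_mode=True,
    save_results=False,
    eval_mode=True,
)
print(result)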
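To make the adapter-only loading a little more concrete, here is a minimal sketch that parses a stand-in `adapter_config.json` and inspects the two fields most relevant to this card: `base_model_name_or_path`, which PEFT uses to locate the base model, and `target_modules`. The JSON below is a hypothetical excerpt assembled from values stated in this card, not a copy of the repo's actual file, and the module paths are illustrative only. The suffix-matching helper `is_targeted` is a hypothetical name that approximates how PEFT selects layers when `target_modules` is given as a list.

```python
import json

# Hypothetical excerpt of an adapter_config.json, assembled from values in this card.
# The real file in the repo contains more fields (rank, alpha, dropout, ...).
ADAPTER_CONFIG_JSON = """
{
  "peft_type": "LORA",
  "base_model_name_or_path": "deepseek-ai/DeepSeek-OCR-2",
  "target_modules": ["q_proj", "kv_a_proj_with_mqa", "kv_b_proj", "o_proj",
                     "gate_proj", "up_proj", "down_proj"]
}
"""


def is_targeted(module_path: str, target_modules: list[str]) -> bool:
    """Return True if the path equals a target name or ends with '.<target>'."""
    return any(
        module_path == target or module_path.endswith("." + target)
        for target in target_modules
    )


config = json.loads(ADAPTER_CONFIG_JSON)
print("Base model:", config["base_model_name_or_path"])

# Illustrative module paths; the real model's layer names may differ.
example_paths = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.down_proj",
    "model.layers.0.input_layernorm",
]
for path in example_paths:
    print(path, "->", is_targeted(path, config["target_modules"]))
```

In this sketch the attention and MLP projections match, while the layer norm does not, so only the projection layers would receive LoRA weights.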
1e-4100.01

LoRA target modules:

- q_proj
- kv_a_proj_with_mqa
- kv_b_proj
- o_proj
- gate_proj
- up_proj
- down_proj

Two example comparisons from the validation subset:
| Sample | Base CER | Finetuned CER |
|---|---|---|
| 21 | 0.6290 | 0.0806 |
| 53 | 1.5385 | 0.3846 |
Detailed examples:
sample_index: 21
reference : آنے والے شخص نے اپنا تعارف کرواتے ہوئے کہا:”میرا نام شہزاد ہے۔
before : ۱- وله شخص از پشت رفت و آمد و به "میرادام" بشارد.
after : آئے والے شخص نے اپنا تعارف کرواتے ہوئے کہا: ”میں ہام شہزاد ہے۔
sample_index: 53
reference : آپﷺ نے فرمایا کہ اے انجشہ!
before : 1 - في كتابة النص، هل تسبيماً ماكسراً؟ اكتب ثابتاً!
after : آپ ﷺ نے فسر ملیا کرا اے اچھٹ !
CER dropped substantially on both examples, although the fine-tuned outputs still contain character errors. This is a small-sample run and should not be treated as a full benchmark.
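For readers who want to reproduce numbers like these, the sketch below computes character error rate (CER) as the Levenshtein edit distance divided by the reference length. This card does not state the exact CER implementation or any text normalization used, so treat this as an assumed definition; `levenshtein` and `cer` are illustrative helper names. It also shows why a CER above 1.0 (as in the base model's 1.5385) is possible: the hypothesis can require more edits than the reference has characters.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return levenshtein(reference, hypothesis) / len(reference)


print(cer("kitten", "sitting"))  # 0.5 (3 edits over 6 reference characters)
print(cer("ab", "abcdef"))       # 2.0 (4 insertions over 2 reference characters)
```

Note that Python's `len` counts Unicode code points, so results on Urdu text can differ from tools that normalize the text or count grapheme clusters.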
800/80 samples

Please cite the base model and dataset.
@article{wei2026deepseek,
title={DeepSeek-OCR 2: Visual Causal Flow},
author={Wei, Haoran and Sun, Yaofeng and Li, Yukun},
journal={arXiv preprint arXiv:2601.20552},
year={2026}
}