UVDoc
Introduction
Text image rectification applies a geometric transformation to a document image to correct distortion, skew, perspective deformation, and similar problems, so that subsequent text recognition is more accurate.
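To make the idea of a geometric correction concrete, here is a minimal pure-Python sketch of backward-map resampling. It is an illustration only, not UVDoc's implementation: the `rectify` helper, the toy image, and the shear mapping are all hypothetical, and a real model predicts a dense mapping and uses interpolation rather than nearest-neighbour lookup.

```python
def rectify(image, backward_map):
    """Resample `image` (a list of pixel rows) using a backward map.

    `backward_map(x, y)` returns the (source_x, source_y) position in the
    distorted image whose value should fill output pixel (x, y).
    This helper is hypothetical and only illustrates the principle.
    """
    height, width = len(image), len(image[0])
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            src_x, src_y = backward_map(x, y)
            # Nearest-neighbour sampling; out-of-range sources stay 0.
            sx, sy = round(src_x), round(src_y)
            if 0 <= sx < width and 0 <= sy < height:
                output[y][x] = image[sy][sx]
    return output


# Toy input: a vertical stroke that has been sheared one pixel per row.
sheared = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]

# Undo the shear: output pixel (x, y) is read from (x + y, y).
straightened = rectify(sheared, lambda x, y: (x + y, y))

for row in straightened:
    print(row)
```

After resampling, the stroke occupies the first column of every row, which is the "straightened" version of the toy input.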
| Model | CER |
|---|---|
| UVDoc | 0.179 |
Note: CER (character error rate, lower is better) was measured on the DocUNet benchmark dataset.
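For readers unfamiliar with the metric, the sketch below shows the usual definition of character error rate: the Levenshtein edit distance between a reference string and a recognized string, divided by the reference length. This card does not state the exact evaluation protocol behind the number above (OCR engine, text normalization), so the `edit_distance` and `cer` functions here are illustrative assumptions, not the evaluation script.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings (insert/delete/substitute)."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance normalized by reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# "m" misread as "rn": one substitution plus one insertion over 8 characters.
print(cer("document", "docurnent"))
```

Under this definition, a CER of 0.179 would correspond to roughly 18 character edits per 100 reference characters.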
Model Usage
import requests
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Load the model and its matching image processor from the Hub.
model_path = "PaddlePaddle/UVDoc_safetensors"
model = AutoModel.from_pretrained(model_path, device_map="auto")
image_processor = AutoImageProcessor.from_pretrained(model_path)

# Download the demo image of a distorted document.
image_url = "https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/doc_test.jpg"
image = Image.open(requests.get(image_url, stream=True).raw)

# Preprocess the image and move the tensors to the model's device.
inputs = image_processor(images=image, return_tensors="pt").to(model.device)
outputs = model(**inputs)

# Convert the raw model output into the rectified document image(s).
result = image_processor.post_process_document_rectification(outputs.last_hidden_state, inputs["original_images"])
print(result)