UVDoc

Introduction

Text image rectification applies a geometric transformation to an image to correct document distortion, skew, perspective deformation, and similar problems, so that subsequent text recognition is more accurate.
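As a rough illustration of what "geometric transformation" means here, the sketch below undoes a simple skew on a toy image by backward warping: each output pixel is filled from a source coordinate given by a mapping function. This is not UVDoc's method, which is learned. The `remap` helper and the hand-written shear mapping are hypothetical and only illustrate the idea.

```python
def remap(image, mapping, fill=0):
    """Backward-warp a 2D list: output[y][x] = image at mapping(x, y)."""
    height, width = len(image), len(image[0])
    out = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx, sy = mapping(x, y)
            sx, sy = round(sx), round(sy)  # nearest-neighbour sampling
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = image[sy][sx]
    return out


# Toy "document": a vertical stroke that has been sheared
# one pixel to the right per row.
skewed = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

# To undo the shear, output pixel (x, y) reads from source (x + y, y).
rectified = remap(skewed, lambda x, y: (x + y, y))

for row in rectified:
    print(row)
```

After remapping, the stroke sits in a single column. A rectification model plays the role of the mapping function, except that the mapping is estimated from the image instead of being written by hand.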

Model	CER
UVDoc	0.179

Note: CER (character error rate, lower is better) is measured on the DocUNet benchmark dataset.
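For readers unfamiliar with the metric, CER is commonly defined as the character-level edit distance between recognized text and reference text, divided by the reference length. The sketch below is a minimal version of that definition. The OCR engine and text normalization behind the reported number are not stated here, so `edit_distance` and `cer` are illustrative helpers, not the benchmark's evaluation code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# One substituted character out of eight reference characters.
print(cer("document", "docunent"))
```

In a rectification benchmark, the hypothesis text would come from running OCR on the rectified image, so a lower CER suggests the rectified image is easier to read.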

Model Usage

import requests
from PIL import Image
from transformers import AutoImageProcessor, AutoModel

# Load the model and its matching image processor from the Hub.
model_path = "PaddlePaddle/UVDoc_safetensors"
model = AutoModel.from_pretrained(model_path, device_map="auto")
image_processor = AutoImageProcessor.from_pretrained(model_path)

# Download a sample distorted document image.
image = Image.open(requests.get("https://paddle-model-ecology.bj.bcebos.com/paddlex/imgs/demo_image/doc_test.jpg", stream=True).raw)

# Preprocess the image and move the tensors to the model's device.
inputs = image_processor(images=image, return_tensors="pt").to(model.device)
outputs = model(**inputs)

# Convert the model output into the rectified image, using the original image as reference.
result = image_processor.post_process_document_rectification(outputs.last_hidden_state, inputs["original_images"])
print(result)
