| --- |
| language: |
| - ko |
| pipeline_tag: image-to-text |
| --- |
| |
| # **deplot_kr** |
| |
| deplot_kr is a Korean image-to-data (text) model based on Google's Pix2Struct architecture. It takes a chart image and predicts the underlying data table in text form. |
| It was fine-tuned from [DePlot](https://huggingface.co/google/deplot) on a dataset of 300,000 Korean chart image-text pairs. |
| |
| ## **How to use** |
| |
| To run a prediction, pass a chart image to the model. |
| The model predicts the chart's data table from the image and returns it as text. |
| |
| ```python |
| from transformers import Pix2StructForConditionalGeneration, AutoProcessor |
| from PIL import Image |
| |
| # Load the processor and the fine-tuned model from the Hugging Face Hub |
| processor = AutoProcessor.from_pretrained("brainventures/deplot_kr") |
| model = Pix2StructForConditionalGeneration.from_pretrained("brainventures/deplot_kr") |
| |
| image_path = "IMAGE_PATH"  # path to a chart image |
| image = Image.open(image_path) |
| |
| # Convert the image into flattened patches and an attention mask |
| inputs = processor(images=image, return_tensors="pt") |
| # Generate the token ids of the predicted data table |
| pred = model.generate(**inputs, max_length=1024) |
| # Decode the token ids back into text |
| print(processor.batch_decode(pred, skip_special_tokens=True)[0]) |
| ``` |
| |
| **Model Input Image** |
|  |
| |
| **Model Output - Prediction** |
| |
| 대상: |
| 제목: 2011-2021 보건복지 분야 일자리의 <unk>증 |
| 유형: 단일형 일반 세로 <unk>대형 |
| 보건(천 명) | 복지(천 명) |
| 1분위 | 29.7 | 178.4 |
| 2분위 | 70.8 | 97.3 |
| 3분위 | 86.4 | 61.3 |
| 4분위 | 28.2 | 16.0 |
| 5분위 | 52.3 | 0.9 |
| |
| |
| |
| ### **Preprocessing** |
| |
| According to [Liu et al. (2023)](https://arxiv.org/pdf/2212.10505.pdf)... |
| |
| - Markdown format |
| - `|` : separates cells (columns) |
| - `\n` : separates rows |
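
The format above can be read back into rows and cells with plain string operations. The following is a minimal sketch, not part of the original model card: the function name `parse_deplot_table` and the sample string are hypothetical, and it assumes that any line without a `|` character (such as a title or chart-type line) is metadata rather than a table row.

```python
def parse_deplot_table(text):
    """Split model output into metadata lines and table rows.

    Assumes rows are separated by newlines and cells by '|'.
    Lines without a '|' are treated as metadata (an assumed convention).
    """
    metadata, rows = [], []
    for line in text.split("\n"):
        line = line.strip()
        if not line:
            continue
        if "|" not in line:
            metadata.append(line)
            continue
        # Drop optional leading/trailing pipes, then split into cells.
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        rows.append(cells)
    return metadata, rows


# Hypothetical sample laid out like a model prediction.
sample = "title: example chart\n| A | B\nQ1 | 29.7 | 178.4\nQ2 | 70.8 | 97.3"
metadata, rows = parse_deplot_table(sample)
print(metadata)
print(rows)
```

The cells are returned as strings, so numeric columns still need an explicit `float(...)` conversion before further analysis.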
| |
| |
| ### **Train** |
| |
| The model was trained in a TPU environment. |
| - num_warmup_steps : 1,000 |
| - num_training_steps : 40,000 |
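
The two values above are step counts for a learning-rate schedule. The card does not state which schedule shape or peak learning rate was used, so the sketch below is only an illustration: it assumes linear warmup followed by linear decay, and both `lr_at_step` and `peak_lr` are hypothetical names and values, not the author's settings.

```python
def lr_at_step(step, peak_lr=1e-4, num_warmup_steps=1_000, num_training_steps=40_000):
    """Learning rate at a given step under linear warmup, then linear decay.

    Assumption: the schedule shape and peak_lr are illustrative only;
    only the two step counts come from the model card.
    """
    if step < num_warmup_steps:
        # Ramp up linearly from 0 to peak_lr during warmup.
        return peak_lr * step / num_warmup_steps
    # Decay linearly from peak_lr to 0 over the remaining steps.
    remaining = max(0, num_training_steps - step)
    return peak_lr * remaining / (num_training_steps - num_warmup_steps)


for step in (0, 500, 1_000, 20_500, 40_000):
    print(step, lr_at_step(step))
```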
| |
| ## **Evaluation Results** |
| |
| This model achieves the following results: |
| |
| | Metric | Score (%) | |
| |:---|---:| |
| | RNSS (Relative Number Set Similarity) | 98.1615 | |
| | RMS (Relative Mapping Similarity) Precision | 83.1615 | |
| | RMS Recall | 26.3549 | |
| | RMS F1 Score | 31.5633 | |
| |
| ## Contact |
| |
| For questions and comments, please use the discussion tab or email gloria@brainventur.com |