Tachiwin-OCR 1.5 GGUF 🦑

for the Indigenous Languages of Mexico

This is a PaddleOCR-VL fine-tune specialized in the 68 Indigenous languages of Mexico and their diverse character and glyph repertoires, a world first for technology access and linguistic rights.

Inference

You can perform inference using the PaddleOCR pipeline or the transformers library.

Option A: Using PaddleOCR

from paddleocr import PaddleOCRVL

# Load the fine-tuned model; vl_rec_model_dir should point to your local
# copy of the downloaded weights (placeholder path below).
pipeline = PaddleOCRVL(
    vl_rec_model_name="tachiwin/Tachiwin-OCR-1.5",
    vl_rec_model_dir="path/to/tachiwin_model",
)

# Predict on an image
output = pipeline.predict("test.png")

for res in output:
    res.print()
    res.save_to_json(save_path="output")
    res.save_to_markdown(save_path="output")

Option B: Using Transformers

from PIL import Image
import torch
from transformers import AutoModelForCausalLM, AutoProcessor

MODEL = "tachiwin/Tachiwin-OCR-1.5"
image_path = "my_image.png"

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

image = Image.open(image_path).convert("RGB")

model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16
).to(DEVICE).eval()
processor = AutoProcessor.from_pretrained(MODEL, trust_remote_code=True)

messages = [
    {"role": "user", "content": [
        {"type": "image", "image": image},
        {"type": "text", "text": "OCR:"},
    ]}
]

inputs = processor.apply_chat_template(
    messages, 
    tokenize=True, 
    add_generation_prompt=True, 	
    return_dict=True,
    return_tensors="pt"
).to(DEVICE)

outputs = model.generate(**inputs, max_new_tokens=1024, min_new_tokens=1)
# Decode only the newly generated tokens, skipping the prompt.
generated_ids = outputs[:, inputs["input_ids"].shape[1]:]
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(generated_text)

📊 Benchmark Results

Tachiwin-OCR 1.5 was evaluated against the base PaddleOCR-VL 1.5 model on a diverse subset of Indigenous-language samples. Fine-tuning yields dramatic improvements in both character- and word-level recognition accuracy, far surpassing the gains seen in version 1.0.

Summary Metrics

| Metric | Base Model (Raw) | Tachiwin-OCR 1.5 (Fine-tuned) | Improvement |
|---|---|---|---|
| Character Error Rate (CER) | 17.65% | 2.03% | 88.5% (relative reduction) |
| Word Error Rate (WER) | 38.59% | 3.60% | 90.7% (relative reduction) |
| OCR Accuracy (1 − CER) | 82.35% | 97.97% | +15.61 pp (absolute) |
| Word Accuracy (1 − WER) | 61.41% | 96.40% | +34.99 pp (absolute) |
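The improvement column follows directly from the raw and fine-tuned error rates; a quick sketch of the arithmetic in pure Python, using only the figures reported above:

```python
def relative_reduction(raw: float, finetuned: float) -> float:
    # Fraction of the original error rate that was eliminated, in percent.
    return (raw - finetuned) / raw * 100

# CER and WER figures from the summary table above.
cer_raw, cer_ft = 17.65, 2.03
wer_raw, wer_ft = 38.59, 3.60

print(round(relative_reduction(cer_raw, cer_ft), 1))  # 88.5
print(round(relative_reduction(wer_raw, wer_ft), 1))  # 90.7

# Absolute WER drop equals the absolute word-accuracy gain, in percentage points.
print(round(wer_raw - wer_ft, 2))  # 34.99
```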

Version Comparison: 1.0 → 1.5

| Metric | Tachiwin-OCR v1.0 | Tachiwin-OCR v1.5 | Δ Change |
|---|---|---|---|
| CER | 6.80% | 2.03% | −4.77 pp |
| WER | 17.36% | 3.60% | −13.76 pp |
| Accuracy (1 − CER) | 93.20% | 97.97% | +4.77 pp |
| Word Accuracy (1 − WER) | 82.64% | 96.40% | +13.76 pp |
| Relative CER Reduction | 10.4% | 88.5% | +78.1 pp |
| Relative WER Reduction | 31.0% | 90.7% | +59.7 pp |

Detailed Comparison β€” v1.5 Sample Results

Results across 21 language samples. Languages with tonal or complex diacritic systems show the most dramatic improvements:

| # | Language (Code) | Raw CER | FT CER | Raw WER | FT WER | CER Improvement |
|---|---|---|---|---|---|---|
| 0 | zpo (Zapotec) | 0.24% | 0.00% | 1.12% | 0.00% | +0.24% |
| 1 | maz (Central Mazahua) | 0.41% | 0.00% | 2.27% | 0.00% | +0.41% |
| 2 | zao (Zapotec) | 6.18% | 3.49% | 23.61% | 12.50% | +2.69% |
| 3 | mat (Matlatzinca) | 6.51% | 0.00% | 42.55% | 0.00% | +6.51% |
| 4 | amu (Amuzgo) | 85.52% | 0.00% | 89.13% | 0.00% | +85.52% |
| 5 | mxp (Mixe) | 15.91% | 11.87% | 54.90% | 9.80% | +4.04% |
| 6 | yaq (Yaqui) | 1.82% | 0.00% | 3.12% | 0.00% | +1.82% |
| 7 | poe (Popoloca) | 6.78% | 3.39% | 62.50% | 12.50% | +3.39% |
| 8 | zpc (Zapotec) | 9.43% | 2.05% | 42.11% | 13.16% | +7.38% |
| 9 | sei (Seri) | 1.89% | 0.00% | 10.61% | 0.00% | +1.89% |
| 10 | lac (Lacandon) | 9.80% | 0.00% | 42.31% | 0.00% | +9.80% |
| 11 | zao (Zapotec) | 93.01% | 0.00% | 100.00% | 0.00% | +93.01% |
| 12 | mxt (Mixtec) | 6.70% | 0.00% | 19.18% | 0.00% | +6.70% |
| 13 | huv (San Marcos Huistepec Zapotec) | 1.41% | 0.00% | 10.34% | 0.00% | +1.41% |
| 14 | tee (Huehuetla Tepehua) | 3.03% | 0.00% | 17.33% | 0.00% | +3.03% |
| 15 | tzh (Tzeltal) | 2.67% | 0.00% | 15.91% | 0.00% | +2.67% |
| 16 | mto (Totontepec Mixe) | 93.12% | 32.47% | 100.00% | 39.71% | +60.65% |
| 17 | amu (Amuzgo) | 14.96% | 2.36% | 52.46% | 1.64% | +12.60% |
| 18 | mih (Chayuco Mixtec) | 3.76% | 0.00% | 9.52% | 0.00% | +3.76% |
| 19 | zpm (Mixtec) | 6.98% | 0.00% | 32.73% | 0.00% | +6.98% |
| 20 | toc (Tojolabal) | 11.32% | 0.00% | 57.14% | 0.00% | +11.32% |
| — | AVERAGE | 17.65% | 2.03% | 38.59% | 3.60% | +15.61% |
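For reference, CER and WER are standard edit-distance metrics: the Levenshtein distance between model output and ground truth, normalized by reference length, computed over characters for CER and over whitespace tokens for WER. A minimal sketch of the conventional definitions (not the evaluation script used here; the example strings are invented):

```python
def levenshtein(a, b) -> int:
    # Classic dynamic-programming edit distance (insert/delete/substitute).
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def cer(reference: str, hypothesis: str) -> float:
    # Character Error Rate: character-level edits / reference length.
    return levenshtein(reference, hypothesis) / len(reference)

def wer(reference: str, hypothesis: str) -> float:
    # Word Error Rate: same distance computed over whitespace tokens.
    ref, hyp = reference.split(), hypothesis.split()
    return levenshtein(ref, hyp) / len(ref)

# Invented example: one wrong character out of ten gives a CER of 10%.
print(cer("tachiwin a", "tachiwin e"))  # 0.1
```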

Key Findings

  • Unprecedented Accuracy Gains: 15 of the 21 samples achieved a fine-tuned CER of 0.00%, meaning perfect character-level recognition on those samples, a result not seen in v1.0.

  • Hardest Cases Tackled: Samples such as Amuzgo (amu, sample 4) and Zapotec (zao, sample 11) started with raw CERs of 85.52% and 93.01% and were reduced to 0.00% after fine-tuning, improvements of over 85 and 93 percentage points respectively.

  • Remaining Challenges: mto (Totontepec Mixe) remains the most difficult language in the set, with a fine-tuned CER of 32.47%. That is still a 65% relative improvement over its raw baseline, but it indicates further work is needed for highly complex orthographies.

  • Word-Level Leap: WER dropped from 38.59% to just 3.60%, a 34.99 percentage point absolute improvement (versus only 7.81 pp in v1.0), demonstrating a qualitative leap in the model's ability to reconstruct full word forms in these language families.

  • Robustness: The model continues to show high resilience against the synthetic distortions applied during the data generation phase.

Tachiwin (from the Totonac word for "language") is dedicated to bridging the digital divide for the Indigenous languages of Mexico through AI technology.

  • Developed by: Tachiwin

  • License: apache-2.0

  • Fine-tuned from model: PaddlePaddle/PaddleOCR-VL-1.5

This paddleocr_vl model was trained 2× faster with Unsloth.

  • Model size: 0.5B params

  • Format: GGUF

  • Architecture: paddleocr
