Uploaded model

  • Developed by: Ak137
  • License: apache-2.0
  • Finetuned from model: unsloth/Qwen3.5-0.8B

This Qwen3.5 model was trained 2x faster with Unsloth.

๐Ÿ›๏ธ Historical Spanish OCR LoRA

LoRA adapter for Qwen3.5-0.8B fine-tuned on historical Spanish manuscript images.

Results

| Split                   | CER    |
|-------------------------|--------|
| Validation (baseline)   | 0.1414 |
| Validation (fine-tuned) | 0.0559 |
| Test (fine-tuned)       | 0.0309 |
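For reference, CER (character error rate) is the Levenshtein edit distance between prediction and reference transcription, divided by the reference length. A minimal sketch (the exact evaluation script is not part of this card):

```python
def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance over characters.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def cer(prediction: str, reference: str) -> float:
    # Character error rate: edit distance normalised by reference length.
    return levenshtein(prediction, reference) / max(len(reference), 1)
```

A CER of 0.0309 therefore means roughly 3 character errors per 100 reference characters.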

Training configuration

| Param                | Value                     |
|----------------------|---------------------------|
| Base model           | Qwen3.5-0.8B              |
| LoRA r / α / dropout | 16 / 32 / 0.0             |
| Learning rate        | 1e-4 (cosine schedule)    |
| Epochs               | 5                         |
| Effective batch size | 24                        |
| Max image dim        | 2048 px                   |
| Data augmentation    | True                      |
| Train / Val / Test   | ~2000 / 500 / 500 samples |
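The LoRA hyperparameters above map directly onto a standard PEFT config. A sketch, assuming common attention-projection target modules (the actual target modules and batch split used in training are not stated on this card):

```python
from peft import LoraConfig

# Matches the table: r=16, alpha=32, dropout=0.0.
# target_modules is an assumption, not taken from the training run.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)

# Effective batch size 24 could be, e.g., per-device batch 4
# with gradient accumulation 6 (4 * 6 = 24) -- also an assumption.
```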

Text normalisation applied

  1. Long s ſ → s
  2. ç/Ç → z/Z
  3. Nasal tilde abbreviations: ã→an, õ→on, ẽ→en, ũ→un, ĩ→in
  4. D-with-stroke đ → de
  5. Stress accents stripped (except ñ)
  6. Multiple spaces collapsed
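The six steps above can be sketched as a single function. This is an illustrative reimplementation, not the exact preprocessing script used in training (it assumes precomposed Unicode input; NFD-decomposed input would need normalising first):

```python
import re
import unicodedata

# Step 3: nasal tilde abbreviations.
NASAL = {"ã": "an", "õ": "on", "ẽ": "en", "ũ": "un", "ĩ": "in"}

def normalise(text: str) -> str:
    text = text.replace("ſ", "s")                      # 1. long s
    text = text.replace("ç", "z").replace("Ç", "Z")    # 2. cedilla
    for abbr, full in NASAL.items():                   # 3. nasal abbreviations
        text = text.replace(abbr, full)
    text = text.replace("đ", "de")                     # 4. d-with-stroke
    out = []                                           # 5. strip accents, keep ñ
    for ch in text:
        if ch in ("ñ", "Ñ"):
            out.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        out.append("".join(c for c in decomposed
                           if unicodedata.category(c) != "Mn"))
    text = "".join(out)
    return re.sub(r" +", " ", text)                    # 6. collapse spaces
```

Note the ordering matters: nasal abbreviations (step 3) must be expanded before accent stripping (step 5), or the tildes would be lost.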

Inference

```python
from unsloth import FastVisionModel
from PIL import Image

model, tokenizer = FastVisionModel.from_pretrained(
    "Ak137/qwen3.5-0.8B-spanish-ocr-lora", load_in_4bit=False
)
FastVisionModel.for_inference(model)

# Downscale so the longest side matches the 2048 px training limit
image = Image.open("manuscript.jpg")
w, h = image.size
if max(w, h) > 2048:
    scale = 2048 / max(w, h)
    image = image.resize((int(w * scale), int(h * scale)))

messages = [{"role": "user", "content": [
    {"type": "text", "text": "Transcribe the text in this historical Spanish manuscript image."},
    {"type": "image"},
]}]
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(image, input_text, add_special_tokens=False, return_tensors="pt").to("cuda")
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
