Uploaded model
- Developed by: Ak137
- License: apache-2.0
- Finetuned from model: unsloth/Qwen3.5-0.8B

This qwen3_5 model was trained 2x faster with Unsloth.
Historical Spanish OCR LoRA
LoRA adapter for Qwen3.5-0.8B fine-tuned on historical Spanish manuscript images.
Results
| Split | CER |
|---|---|
| Validation (baseline) | 0.1414 |
| Validation (fine-tuned) | 0.0559 |
| Test (fine-tuned) | 0.0309 |
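CER here is the character error rate: the character-level Levenshtein (edit) distance between the model transcription and the reference, divided by the reference length. A minimal pure-Python sketch of the metric (the function name `cer` is illustrative, not the exact evaluation script used for the table above):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: edit distance / reference length."""
    m, n = len(reference), len(hypothesis)
    # Dynamic-programming Levenshtein distance, one row at a time
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(prev[j] + 1,        # deletion
                          curr[j - 1] + 1,    # insertion
                          prev[j - 1] + cost) # substitution
        prev = curr
    return prev[n] / max(m, 1)
```

A validation CER of 0.0559 therefore means roughly 5.6 character edits per 100 reference characters.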
Training configuration
| Param | Value |
|---|---|
| Base model | Qwen3.5-0.8B |
| LoRA r / α / dropout | 16 / 32 / 0.0 |
| Learning rate | 1e-4 (cosine schedule) |
| Epochs | 5 |
| Effective batch size | 24 |
| Max image dim | 2048 px |
| Data augmentation | True |
| Train / Val / Test | ~2000 / 500 / 500 samples |
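The hyperparameters above can be wired into an Unsloth LoRA setup roughly as follows. This is a configuration sketch under the assumption of Unsloth's standard `FastVisionModel.get_peft_model` API, not the exact training script used for this adapter:

```python
from unsloth import FastVisionModel

# Load the base model listed in the table above
model, tokenizer = FastVisionModel.from_pretrained(
    "unsloth/Qwen3.5-0.8B", load_in_4bit=False
)

# Attach LoRA with the hyperparameters from the table
model = FastVisionModel.get_peft_model(
    model,
    r=16,            # LoRA rank
    lora_alpha=32,   # LoRA alpha
    lora_dropout=0.0,
)

# Remaining settings from the table: learning rate 1e-4 with a cosine
# schedule, 5 epochs, and an effective batch size of 24 (per-device
# batch size times gradient accumulation steps).
```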
Text normalisation applied
- Long-s: ſ → s
- Cedilla: ç/Ç → z/Z
- Nasal tilde abbreviations: ã → an, õ → on, ẽ → en, ũ → un, ĩ → in
- D-with-stroke: đ → de
- Stress accents stripped (except ñ)
- Multiple spaces collapsed
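These rules can be expressed as a small normalisation function. A minimal sketch (the function name `normalise` and the exact rule ordering are assumptions; apply the same normalisation to references before computing CER):

```python
import re
import unicodedata

# Character substitutions from the rules above
SUBS = {
    "ſ": "s",                                # long-s
    "ç": "z", "Ç": "Z",                      # cedilla
    "ã": "an", "õ": "on", "ẽ": "en",         # nasal tilde abbreviations
    "ũ": "un", "ĩ": "in",
    "đ": "de",                               # d-with-stroke
}

def normalise(text: str) -> str:
    for src, dst in SUBS.items():
        text = text.replace(src, dst)
    # Strip combining stress accents, but keep ñ/Ñ intact
    chars = []
    for ch in text:
        if ch in "ñÑ":
            chars.append(ch)
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        chars.append("".join(c for c in decomposed
                             if unicodedata.category(c) != "Mn"))
    # Collapse runs of spaces
    return re.sub(r" {2,}", " ", "".join(chars))
```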
Inference
```python
from unsloth import FastVisionModel
from PIL import Image

# Load the LoRA adapter on top of the base model
model, tokenizer = FastVisionModel.from_pretrained(
    "Ak137/qwen3.5-0.8B-spanish-ocr-lora", load_in_4bit=False
)
FastVisionModel.for_inference(model)

# Downscale so the longest side matches the 2048 px training limit
image = Image.open("manuscript.jpg")
w, h = image.size
if max(w, h) > 2048:
    scale = 2048 / max(w, h)
    image = image.resize((int(w * scale), int(h * scale)))

messages = [{"role": "user", "content": [
    {"type": "text", "text": "Transcribe the text in this historical Spanish manuscript image."},
    {"type": "image"},
]}]
input_text = tokenizer.apply_chat_template(messages, add_generation_prompt=True)
inputs = tokenizer(image, input_text, add_special_tokens=False, return_tensors="pt").to("cuda")

out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
# Decode only the newly generated tokens
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```