Croatian Glagolitic HTR Model (Puigcerver CRNN)
A Handwritten Text Recognition (HTR) model for 14th–15th century Croatian Glagolitic manuscripts, based on the CNN + BiLSTM + CTC architecture introduced in Puigcerver (2017) and used as the backbone of PyLaia and Transkribus.
Important: This model reads Glagolitic handwriting and outputs Latin script transliteration, not Glagolitic Unicode characters. It handles ligatures and resolves the most common abbreviations.
This is a clean-room PyTorch reimplementation of that published architecture (PyLaia-inspired). It does not use the PyLaia Python package and is not loadable by it — training and inference run via plain PyTorch (see Usage below).
Model Details
- Architecture: CNN encoder [12, 24, 48, 48 filters] + 3-layer Bidirectional LSTM (256 units) + CTC decoder (Puigcerver 2017)
- Input: Grayscale line images, normalized to 128 px height with aspect ratio preserved
- Output: Latin script transliteration of Croatian Glagolitic text
- Vocabulary: 76 symbols (symbols.txt)
- Framework: Pure PyTorch; a clean-room reimplementation of the Puigcerver (2017) architecture (PyLaia-inspired). The PyLaia package is not required.
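The architecture listed above can be sketched in plain PyTorch roughly as follows. This is an illustration under assumptions, not the code shipped in polyscriptor. Only the filter counts, the LSTM depth and width, the 128 px input height, and the 76-symbol vocabulary come from the list. The class name `CRNNSketch`, the 3×3 kernels, batch normalization, LeakyReLU, pooling after the first two convolutions, and the extra output class for the CTC blank are all guesses.

```python
import torch
from torch import nn


class CRNNSketch(nn.Module):
    """Hypothetical CNN + BiLSTM + CTC-head layout (not the released model code)."""

    def __init__(self, num_symbols=76, img_height=128,
                 filters=(12, 24, 48, 48), lstm_units=256, lstm_layers=3):
        super().__init__()
        layers = []
        in_channels = 1  # grayscale input
        for i, out_channels in enumerate(filters):
            layers += [
                nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_channels),
                nn.LeakyReLU(inplace=True),
            ]
            if i < 2:  # assumption: 2x2 pooling after the first two conv blocks
                layers.append(nn.MaxPool2d(2))
            in_channels = out_channels
        self.cnn = nn.Sequential(*layers)

        feature_height = img_height // 4  # two poolings halve the height twice
        self.rnn = nn.LSTM(
            input_size=in_channels * feature_height,
            hidden_size=lstm_units,
            num_layers=lstm_layers,
            bidirectional=True,
            batch_first=True,
        )
        # assumption: one extra class for the CTC blank
        self.head = nn.Linear(2 * lstm_units, num_symbols + 1)

    def forward(self, images):
        # images: (batch, 1, height, width)
        features = self.cnn(images)                      # (B, C, H', W')
        b, c, h, w = features.shape
        # treat each image column as one time step
        sequence = features.permute(0, 3, 1, 2).reshape(b, w, c * h)
        outputs, _ = self.rnn(sequence)                  # (B, W', 2 * lstm_units)
        return self.head(outputs).log_softmax(dim=-1)    # per-frame log-probabilities
```

With these assumed poolings, a 128 × 200 px line image yields 50 time steps, each scored over 77 classes. The per-frame log-probabilities are what a CTC loss (for example `torch.nn.CTCLoss`) or a CTC decoder would consume.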
Performance
| Metric | Value |
|---|---|
| Validation CER | 5.33% |
| Training epochs | 42 |
| Training lines | 23,203 |
| Validation lines | 1,361 |
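For readers unfamiliar with the metric: character error rate (CER) is commonly defined as the character-level edit distance between the model output and the reference, divided by the number of reference characters. The sketch below follows that common definition. How the 5.33% above was aggregated (per line or over the whole validation set, and with which text normalization) is not stated here, so the corpus-level pooling and the function names are assumptions.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_char != hyp_char)
            current.append(min(previous[j] + 1,      # deletion
                               current[j - 1] + 1,   # insertion
                               substitution))
        previous = current
    return previous[-1]


def corpus_cer(pairs):
    """CER over (reference, hypothesis) pairs, pooled across all lines."""
    pairs = list(pairs)
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    reference_chars = sum(len(ref) for ref, _ in pairs)
    return errors / reference_chars
```

Pooling errors over all lines weights long lines more heavily than averaging per-line CER values would; the two conventions can give slightly different numbers on the same data.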
Training Data
Trained on Glagolitic handwriting images transcribed and exported from Transkribus (see the corresponding Transkribus model page). The dataset covers 14th–15th century Croatian Glagolitic handwriting.
Source manuscripts:
- Cod. Vind. Slav. 3 (Breviary of Vid of Omišalj)
- II. beramski brevijar
Ground truth data was kindly provided by Sanja Zubčić (Rijeka) and Jagoda and Guido Kappel (Vienna). Model trained and curated by Achim Rabus (Slavic Department, University of Freiburg). The Transkribus collection comprises 531 training pages and 31 validation pages (~31,035 lines in total). This CRNN-CTC model was trained on 23,203 lines and validated on 1,361 lines from this export, i.e. 24,564 of the exported lines.
Usage
Requirements
pip install torch torchvision pillow
Inference
Download best_model.pt, symbols.txt, and model_config.json from this repository, then use the inference script from polyscriptor:
from inference_pylaia_native import PyLaiaInference
from PIL import Image

# Load model
model = PyLaiaInference(
    checkpoint_path="best_model.pt",
    syms_path="symbols.txt",
)

# Transcribe a line image
image = Image.open("line_image.jpg")
text = model.transcribe(image)
print(text)  # Output: Latin script transliteration
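What happens between the network output and the returned string is not shown here. A CTC model emits one score vector per image column, and the simplest way to turn these into text is greedy (best-path) decoding: take the top class per frame, collapse consecutive repeats, then drop blanks. The sketch below illustrates that technique in pure Python. The function name, the blank at index 0, and the toy symbol table are assumptions, and the actual script may decode differently.

```python
def ctc_greedy_decode(frame_scores, symbols, blank=0):
    """Best-path CTC decoding.

    frame_scores: one sequence of per-class scores for each time step.
    symbols: list mapping class index to output string.
    blank: index of the CTC blank class (assumed to be 0 here).
    """
    # 1. pick the highest-scoring class in every frame
    best_path = [max(range(len(frame)), key=frame.__getitem__)
                 for frame in frame_scores]

    # 2. collapse consecutive repeats, 3. drop blanks
    decoded = []
    previous = None
    for index in best_path:
        if index != previous and index != blank:
            decoded.append(symbols[index])
        previous = index
    return "".join(decoded)
```

A blank between two identical predictions is what allows doubled letters to survive the collapsing step: the frame sequence `a a <blank> a b` decodes to `aab`, not `ab`.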
Note: Input should be a single text line image, not a full page. Preprocessing (grayscale conversion, height normalization, aspect ratio preservation) is handled automatically by inference_pylaia_native.py.
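To make the preprocessing steps concrete, here is a minimal sketch of grayscale conversion and aspect-preserving height normalization with Pillow. It is not the code in inference_pylaia_native.py: the helper names are hypothetical, and the real script may use a different interpolation method, padding, or pixel scaling.

```python
def scaled_size(width, height, target_height=128):
    """Return (new_width, target_height), keeping the aspect ratio."""
    scale = target_height / height
    return max(1, round(width * scale)), target_height


def normalize_line(image, target_height=128):
    """Convert a PIL image to grayscale and rescale it to the target height.

    `image` is expected to be a PIL.Image.Image, e.g. from Image.open(...).
    """
    gray = image.convert("L")  # "L" is 8-bit grayscale
    return gray.resize(scaled_size(gray.width, gray.height, target_height))
```

For example, a 1000 × 64 px line crop would be scaled up to 2000 × 128 px, while a 300 × 256 px crop would shrink to 150 × 128 px.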
For full-page inference with automatic line segmentation, use batch_processing.py:
python batch_processing.py \
    --engine crnn-ctc \
    --model-path best_model.pt \
    --input-folder images/ \
    --output-folder output/
GUI Usage
polyscriptor also ships graphical interfaces that handle full-page processing without requiring pre-segmented line images:
Interactive single-page GUI — loads raw page images, performs automatic line segmentation, and can export results as PAGE XML:
python transcription_gui_plugin.py
Batch processing GUI — processes entire folders; auto-detects existing PAGE XML files (e.g. from Transkribus) and uses them for segmentation when available:
python polyscriptor_batch_gui.py
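For orientation, PAGE XML stores each text line as a `TextLine` element whose `Coords` child carries a `points` attribute with the line polygon as space-separated `x,y` pairs. The sketch below reads those polygons with the standard library only. It illustrates the file format and is not polyscriptor's own parser; the function names are hypothetical, and namespaces are ignored so that different PAGE schema versions are handled alike.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def read_line_polygons(page_xml_text):
    """Return [(line_id, [(x, y), ...]), ...] in document order."""
    root = ET.fromstring(page_xml_text)
    lines = []
    for element in root.iter():
        if local_name(element.tag) != "TextLine":
            continue
        # only look at direct children, so region-level Coords are skipped
        coords = next((child for child in element
                       if local_name(child.tag) == "Coords"), None)
        if coords is None or not coords.get("points"):
            continue
        polygon = [tuple(int(value) for value in point.split(","))
                   for point in coords.get("points").split()]
        lines.append((element.get("id"), polygon))
    return lines


def bounding_box(polygon):
    """Axis-aligned box (left, top, right, bottom) around a line polygon."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)
```

The bounding box can be passed to Pillow's `Image.crop` to cut a line image out of the page, although a rectangular crop will include parts of neighbouring lines when the polygon is slanted or curved.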
Intended Use
- Transcription of 14th–15th century Croatian Glagolitic manuscripts
- Digital humanities research on medieval Croatian texts
Limitations
- Trained on two manuscript sources (Cod. Vind. Slav. 3 and II. beramski brevijar); may underperform on other hands or periods
- Output is Latin script transliteration, not Glagolitic Unicode
- Full-page segmentation quality depends on the segmentation method used upstream
Citation
If you use this model in your research, please cite the architecture paper, the publication describing the training data and recognition system, and this model:
@article{rabus2022glagolitic,
title = {Handwritten Text Recognition for Croatian Glagolitic},
author = {Rabus, Achim},
journal = {Slovo: časopis Staroslavenskoga instituta u Zagrebu},
volume = {72},
pages = {181--192},
year = {2022},
doi = {10.31745/s.72.5},
url = {https://doi.org/10.31745/s.72.5}
}
@inproceedings{puigcerver2017multidimensional,
title = {Are Multidimensional Recurrent Layers Really Necessary for Handwritten Text Recognition?},
author = {Puigcerver, Joan},
booktitle = {Proceedings of the 14th IAPR International Conference on Document Analysis and Recognition (ICDAR)},
year = {2017},
url = {https://www.jpuigcerver.net/pubs/jpuigcerver_icdar2017.pdf}
}
@misc{rabus2026polyscriptor,
title = {Polyscriptor: Multi-Engine HTR Training \& Comparison Tool},
author = {Rabus, Achim},
year = {2026},
url = {https://github.com/achimrabus/polyscriptor}
}