| --- |
| library_name: onnxruntime |
| pipeline_tag: image-classification |
| tags: |
| - onnx |
| - image-classification |
| - medieval-manuscripts |
| - illumination-detection |
| - mobilenet |
| - mobilevit |
| - glam |
| - iiif |
| - cultural-heritage |
| - digital-humanities |
| - medieval-folio |
| - medieval |
| - medieval-illuminations |
| - MobileNet |
| - MobileVit |
| license: apache-2.0 |
| datasets: |
| - ENC-PSL/medieval-folio-illumination-bin-dataset |
| base_model: |
| - timm/mobilenetv3_small_100.lamb_in1k |
| - timm/mobilenetv3_large_100.ra_in1k |
| - timm/mobilenetv2_100.ra_in1k |
| - apple/mobilevitv2-1.0-imagenet1k-256 |
| --- |
| |
| # BSICLE — Binary System for Illuminated Folio Classification with Lightweight Engines |
|
|
| BSICLE (pronounced bé-si-cle; /be.zikl/) is a family of lightweight binary classifiers for detecting **illuminated folios in medieval manuscripts**. These models were developed at the [École nationale des chartes – PSL](https://www.chartes.psl.eu/) in the context of the [O.D.I.L. project](https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations). |
|
|
| These models classify manuscript pages as: |
|
|
| - **illuminated** (miniatures, historiated initials, decorated pages, etc.) |
| - **non-illuminated** (plain text folios, printer marks, tables, covers, blank folios, etc.) |
|
|
| # Use cases |
|
|
| The models are optimized to run **locally (CPU) or in the browser**, using **edge-compatible architectures** (MobileNet, MobileViT) and **ONNX inference**, for example to build **IIIF filter pipelines** or **specialized corpora**. |
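
As an illustration of the IIIF filtering use case, the sketch below walks a manifest and keeps only the canvases a classifier scores as illuminated. It assumes the IIIF Presentation API 2.x layout (`sequences` > `canvases` > `images` > `resource` > `@id`). The helper names, the `predict` callable, the `example.org` URLs, and the `0.5` default threshold are all hypothetical; in practice `predict` would wrap one of the inference snippets from the Usage section and the threshold would come from `inference_config.json`.

```python
from typing import Callable, Iterator, List, Tuple


def iter_canvas_images(manifest: dict) -> Iterator[Tuple[str, str]]:
    """Yield (canvas label, image URL) pairs from an IIIF Presentation 2.x manifest."""
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                url = image.get("resource", {}).get("@id")
                if url:
                    yield str(canvas.get("label", "")), url


def filter_illuminated(
    manifest: dict,
    predict: Callable[[str], float],  # hypothetical: image URL -> P(illuminated)
    threshold: float = 0.5,           # assumption; see inference_config.json
) -> List[Tuple[str, str]]:
    """Keep the canvases whose illuminated probability reaches the threshold."""
    return [
        (label, url)
        for label, url in iter_canvas_images(manifest)
        if predict(url) >= threshold
    ]


# Toy manifest and made-up scores, for demonstration only.
manifest = {
    "sequences": [{"canvases": [
        {"label": "f. 1r", "images": [{"resource": {
            "@id": "https://example.org/iiif/f1r/full/full/0/default.jpg"}}]},
        {"label": "f. 1v", "images": [{"resource": {
            "@id": "https://example.org/iiif/f1v/full/full/0/default.jpg"}}]},
    ]}]
}
fake_scores = {
    "https://example.org/iiif/f1r/full/full/0/default.jpg": 0.97,
    "https://example.org/iiif/f1v/full/full/0/default.jpg": 0.03,
}

kept = filter_illuminated(manifest, lambda url: fake_scores[url])
print(kept)
```

Passing the classifier in as a callable keeps the manifest traversal independent of the inference backend (ONNX or PyTorch).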
|
|
| Try the demo web application on [Hugging Face Spaces](https://huggingface.co/spaces/ENC-PSL/Medieval-Illumination-Detector). |
|
|
| # Models & Results |
|
|
| The fine-tuned models available in this repository are based on the following architectures: |
|
|
| - [MobileNetV2](https://huggingface.co/timm/mobilenetv2_100.ra_in1k) |
| - [MobileNetV3](https://huggingface.co/timm/mobilenetv3_small_100.lamb_in1k) (small and large versions) |
| - [MobileViT v2](https://huggingface.co/apple/mobilevitv2-1.0-imagenet1k-256) |
|
|
| | Architecture | Validation Accuracy | Test Accuracy | |
| |--------------|--------------------:|--------------:| |
| | MobileNetV2 | 0.995 | 0.982 | |
| | MobileNetV3 Small | 0.991 | 0.968 | |
| | MobileNetV3 Large | 1.000 | 0.986 | |
| | MobileViT v2 | 0.995 | 0.977 | |
|
|
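| > These results should be interpreted with care. The models reach very high scores on the current splits, but the validation and test sets are small (223 images each, see the Dataset section), so the scores may be partly dataset-dependent and may not carry over to other collections. |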
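
To put numbers on this caveat, the sketch below computes a 95% Wilson score interval for each reported test accuracy. It assumes the test split holds 223 images (111 illuminated and 112 non-illuminated, as listed in the Dataset section). The helper name `wilson_interval` and the choice of the Wilson method are illustrative and not part of the training framework.

```python
import math


def wilson_interval(accuracy, n, z=1.96):
    """Return the (low, high) Wilson score interval for a proportion."""
    denom = 1 + z**2 / n
    center = (accuracy + z**2 / (2 * n)) / denom
    half = z * math.sqrt(accuracy * (1 - accuracy) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


N_TEST = 111 + 112  # test split size, from the Dataset section

# Test accuracies copied from the results table
test_accuracy = {
    "MobileNetV2": 0.982,
    "MobileNetV3 Small": 0.968,
    "MobileNetV3 Large": 0.986,
    "MobileViT v2": 0.977,
}

for name, acc in test_accuracy.items():
    low, high = wilson_interval(acc, N_TEST)
    print(f"{name}: {acc:.3f} (95% CI {low:.3f} to {high:.3f})")
```

With 223 test images, each interval spans a few percentage points and the intervals of the four architectures overlap, so the ranking between them should not be over-read.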
|
|
| # Labels |
|
|
| | Label ID | Label | |
| |---------|------| |
| | 0 | non_illuminated | |
| | 1 | illuminated | |
| |
| # Dataset |
| |
| Training data comes from: [ENC-PSL/odil-medieval-folio-illumination-bin-dataset](https://huggingface.co/datasets/ENC-PSL/odil-medieval-folio-illumination-bin-dataset) |
| |
| ## Distribution of data |
| |
| - illuminated |
|   - train: 519 |
|   - dev: 112 |
|   - test: 111 |
| - non_illuminated |
|   - train: 519 |
|   - dev: 111 |
|   - test: 112 |
|
|
| > **Data augmentation.** During training, data augmentation was applied to the training split only in order to improve robustness and reduce overfitting. |
| > The augmentation pipeline included random horizontal flips, small random rotations up to 5°, and light color jittering with brightness `0.12`, contrast `0.12`, saturation `0.08`, and hue `0.02`. |
| > Validation and test images were evaluated without augmentation. |
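
The augmentation described above could be expressed with torchvision roughly as follows. This is a sketch, not the actual img-clf-framework code: the flip probability (torchvision's default of `0.5`), the helper names `build_train_transform` and `build_eval_transform`, and the `img_size`, `mean`, and `std` defaults (standard ImageNet values) are assumptions. The real preprocessing values ship in each model's `preprocess.json`.

```python
# Augmentation settings taken from the description above.
AUGMENT = {
    "hflip_p": 0.5,          # assumption: the probability is not stated in this card
    "rotation_degrees": 5,   # angles are sampled in [-5, +5] degrees
    "color_jitter": {
        "brightness": 0.12,
        "contrast": 0.12,
        "saturation": 0.08,
        "hue": 0.02,
    },
}

# Assumed defaults (ImageNet statistics); prefer the values in preprocess.json.
IMAGENET_MEAN = (0.485, 0.456, 0.406)
IMAGENET_STD = (0.229, 0.224, 0.225)


def build_train_transform(img_size=224, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Training-split transform: resize, augment, then normalize."""
    from torchvision import transforms  # lazy import: only needed when called

    return transforms.Compose([
        transforms.Resize((img_size, img_size)),
        transforms.RandomHorizontalFlip(p=AUGMENT["hflip_p"]),
        transforms.RandomRotation(degrees=AUGMENT["rotation_degrees"]),
        transforms.ColorJitter(**AUGMENT["color_jitter"]),
        transforms.ToTensor(),
        transforms.Normalize(mean, std),
    ])


def build_eval_transform(img_size=224, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Validation/test transform: same preprocessing, no augmentation."""
    from torchvision import transforms  # lazy import: only needed when called

    return transforms.Compose([
        transforms.Resize((img_size, img_size)),
        transforms.ToTensor(),
        transforms.Normalize(mean, std),
    ])
```

Keeping the evaluation transform free of random operations matches the statement that validation and test images were evaluated without augmentation.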
|
|
| # What counts as "illuminated"? |
|
|
| ### Positive (illuminated) |
|
|
| Examples include: |
|
|
| - miniatures |
| - historiated initials |
| - decorative initials |
| - scientific diagrams |
| - maps |
| - decorated manuscript pages |
|
|
|
|
| ### Negative (non-illuminated) |
|
|
| Examples include: |
|
|
| - plain text folios |
| - marginal decorations without images |
| - printer marks |
| - tables |
| - covers |
| - blank folios |
| - rubricated text without illumination |
|
|
| ### Examples |
|
|
| | Illuminated | Not Illuminated | |
| |:-----------:|:---------------:| |
| | <img src="assets/illuminated/1.png" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/1.jpeg" width="150" height="150" style="object-fit:cover;"> | |
| | <img src="assets/illuminated/2.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/2.jpg" width="150" height="150" style="object-fit:cover;"> | |
| | <img src="assets/illuminated/3.png" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/3.jpg" width="150" height="150" style="object-fit:cover;"> | |
| | <img src="assets/illuminated/4.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/4.jpg" width="150" height="150" style="object-fit:cover;"> | |
| | <img src="assets/illuminated/5.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/5.png" width="150" height="150" style="object-fit:cover;"> | |
| | <img src="assets/illuminated/6.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/6.jpg" width="150" height="150" style="object-fit:cover;"> | |
|
| # Usage |
|
|
| ## Python — ONNX local |
|
|
| ```bash |
| pip install onnxruntime pillow numpy |
| ``` |
|
|
| ```python |
| import json |
| import numpy as np |
| import onnxruntime as ort |
| from PIL import Image |
| from pathlib import Path |
| |
| run = Path("./mobilenet_v3_large") |
| |
| cfg = json.loads((run / "inference_config.json").read_text()) |
| pre = json.loads((run / "preprocess.json").read_text()) |
| |
| # Resize to the model's input size and scale pixel values to [0, 1] |
| img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"])) |
| x = np.asarray(img).astype("float32") / 255.0 |
| # Normalize each channel, then reorder HWC -> NCHW with a batch dimension |
| x = (x - np.array(pre["mean"])) / np.array(pre["std"]) |
| x = x.transpose(2, 0, 1)[None].astype("float32") |
| |
| sess = ort.InferenceSession(str(run / "onnx/model.onnx")) |
| logits = sess.run(None, {cfg["input_name"]: x})[0][0] |
| |
| # Numerically stable softmax over the class logits |
| probs = np.exp(logits - logits.max()) |
| probs = probs / probs.sum() |
| |
| p_illu = float(probs[cfg["positive_index"]]) |
| label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination" |
| |
| print(label, p_illu) |
| ``` |
|
|
| ## Python — ONNX from Hugging Face |
|
|
| ```bash |
| pip install huggingface_hub onnxruntime pillow numpy |
| ``` |
|
|
| ```python |
| from huggingface_hub import snapshot_download |
| from pathlib import Path |
| |
| repo = "lterriel/medieval-illumination-bin-classifier" |
| run_name = "final_mobilenetv3_large" |
| local_dir = Path(snapshot_download( |
| repo_id=repo, |
| allow_patterns=[ |
| f"{run_name}/onnx/model.onnx", |
| f"{run_name}/preprocess.json", |
| f"{run_name}/inference_config.json", |
| ], |
| )) / run_name |
| ``` |
|
|
| Then use the same ONNX code as above, replacing: |
|
|
| ```python |
| run = Path("./mobilenet_v3_large") |
| ``` |
|
|
| with: |
|
|
| ```python |
| run = local_dir |
| ``` |
|
|
| ## Python — PyTorch / non-ONNX local |
|
|
| ```bash |
| pip install torch torchvision pillow numpy |
| ``` |
|
|
| ```python |
| import json |
| import torch |
| import numpy as np |
| from PIL import Image |
| from pathlib import Path |
| from torchvision import models |
| |
| run = Path("./mobilenet_v3_large") |
| |
| cfg = json.loads((run / "inference_config.json").read_text()) |
| pre = json.loads((run / "preprocess.json").read_text()) |
| |
| model = models.mobilenet_v3_large(weights=None) |
| model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2) |
| model.load_state_dict(torch.load(run / "checkpoints/best.pt", map_location="cpu")) |
| model.eval() |
| |
| img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"])) |
| x = np.asarray(img).astype("float32") / 255.0 |
| x = (x - np.array(pre["mean"])) / np.array(pre["std"]) |
| x = torch.tensor(x.transpose(2, 0, 1)[None]).float() |
| |
| with torch.no_grad(): |
|     logits = model(x) |
|     probs = torch.softmax(logits, dim=1)[0] |
| |
| p_illu = float(probs[cfg["positive_index"]]) |
| label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination" |
| |
| print(label, p_illu) |
| ``` |
|
|
| For another torchvision architecture, replace the model constructor: |
|
|
| - MobileNetV2 |
|
|
| ```python |
| model = models.mobilenet_v2(weights=None) |
| model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2) |
| ``` |
|
|
| - MobileNetV3 (small) |
|
|
| ```python |
| model = models.mobilenet_v3_small(weights=None) |
| model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2) |
| ``` |
|
|
| ## Python — PyTorch / non-ONNX from Hugging Face |
|
|
| ```bash |
| pip install huggingface_hub torch torchvision pillow numpy |
| ``` |
|
|
| ```python |
| from huggingface_hub import snapshot_download |
| from pathlib import Path |
| |
| repo = "lterriel/medieval-illumination-bin-classifier" |
| run_name = "final_mobilenetv3_large" |
| |
| run = Path(snapshot_download( |
| repo_id=repo, |
| allow_patterns=[ |
| f"{run_name}/checkpoints/best.pt", |
| f"{run_name}/preprocess.json", |
| f"{run_name}/inference_config.json", |
| ], |
| )) / run_name |
| ``` |
|
|
| Then use the same PyTorch code as above. |
|
|
| ## JS (HF - ONNX) |
|
|
| ```html |
| <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script> |
| <input type="file" id="file" accept="image/*"> |
| <pre id="out"></pre> |
| |
| <script type="module"> |
| const run = "https://huggingface.co/lterriel/medieval-illumination-bin-classifier/resolve/main/final_mobilenetv3_large"; |
| |
| const cfg = await fetch(`${run}/inference_config.json`).then(r => r.json()); |
| const pre = await fetch(`${run}/preprocess.json`).then(r => r.json()); |
| const sess = await ort.InferenceSession.create(`${run}/onnx/model.onnx`); |
| |
| function softmax(a) { |
|   const m = Math.max(...a); |
|   const e = a.map(x => Math.exp(x - m)); |
|   const s = e.reduce((x, y) => x + y, 0); |
|   return e.map(x => x / s); |
| } |
| |
| async function imageToTensor(file) { |
|   const img = new Image(); |
|   img.src = URL.createObjectURL(file); |
|   await img.decode(); |
| |
|   const size = pre.img_size; |
|   const canvas = document.createElement("canvas"); |
|   canvas.width = size; |
|   canvas.height = size; |
| |
|   const ctx = canvas.getContext("2d"); |
|   ctx.drawImage(img, 0, 0, size, size); |
| |
|   const data = ctx.getImageData(0, 0, size, size).data; |
|   const x = new Float32Array(1 * 3 * size * size); |
| |
|   for (let i = 0, p = 0; i < data.length; i += 4, p++) { |
|     x[p] = (data[i] / 255 - pre.mean[0]) / pre.std[0]; |
|     x[size * size + p] = (data[i + 1] / 255 - pre.mean[1]) / pre.std[1]; |
|     x[2 * size * size + p] = (data[i + 2] / 255 - pre.mean[2]) / pre.std[2]; |
|   } |
| |
|   return new ort.Tensor("float32", x, [1, 3, size, size]); |
| } |
| |
| document.querySelector("#file").onchange = async (e) => { |
|   const tensor = await imageToTensor(e.target.files[0]); |
|   const res = await sess.run({ [cfg.input_name]: tensor }); |
| |
|   const logits = Array.from(res[cfg.output_name].data); |
|   const probs = softmax(logits); |
| |
|   const pIllu = probs[cfg.positive_index]; |
|   const label = pIllu >= cfg.threshold ? cfg.positive_label : "non_illumination"; |
| |
|   document.querySelector("#out").textContent = JSON.stringify({ |
|     label, |
|     p_illumination: pIllu, |
|     probs |
|   }, null, 2); |
| }; |
| </script> |
| ``` |
|
|
| # Training tools |
|
|
| All models were fine-tuned with img-clf-framework, a training framework for binary image classification pipelines. See the [training repository here](). |
|
|
| # Citation |
|
|
| If you use these models in your research, please cite: |
|
|
| ```bibtex |
| @software{terriel_bsicle_2026, |
| AUTHOR = {Terriel, Lucas and Jolivet, Vincent}, |
| TITLE = {{BSICLE}: Binary System for Illuminated Folio Classification with Lightweight Engines}, |
| YEAR = {2026}, |
| PUBLISHER = {Hugging Face}, |
| INSTITUTION = {{École nationale des chartes -- PSL}}, |
| URL = {https://huggingface.co/ENC-PSL/medieval-illumination-bin-classifier}, |
| NOTE = {Family of lightweight binary image classification models for detecting illuminated folios in medieval manuscripts, developed in the context of the O.D.I.L. project}, |
| LICENSE = {apache-2.0}, |
| VERSION = {0.0.1} |
| } |
| ``` |
|
|
| # Funding |
|
|
| <div style="display: flex; align-items: center; justify-content: center; gap: 20px; max-width: 800px; margin: 0 auto;"> |
| <img src="./assets/odil-logo.png" width="180" alt="Logo ODIL" style="flex: 0 0 auto;"> |
|
|
| <p style="text-align: justify; margin: 0;"> |
| These models were developed at |
| <a href="https://www.chartes.psl.eu/" target="_blank" rel="noopener"> |
| École nationale des chartes – PSL |
| </a> |
| in the context of the |
| <a href="https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations" target="_blank" rel="noopener"> |
| O.D.I.L. project |
| </a>. |
| </p> |
| </div> |