---
library_name: onnxruntime
pipeline_tag: image-classification
tags:
- onnx
- image-classification
- medieval-manuscripts
- illumination-detection
- mobilenet
- mobilevit
- glam
- iiif
- cultural-heritage
- digital-humanities
- medieval-folio
- medieval
- medieval-illuminations
- MobileNet
- MobileVit
license: apache-2.0
datasets:
- ENC-PSL/medieval-folio-illumination-bin-dataset
base_model:
- timm/mobilenetv3_small_100.lamb_in1k
- timm/mobilenetv3_large_100.ra_in1k
- timm/mobilenetv2_100.ra_in1k
- apple/mobilevitv2-1.0-imagenet1k-256
---
# BSICLE — Binary System for Illuminated Folio Classification with Lightweight Engines
BSICLE (pronounced bé-si-cle; /be.zikl/) is a family of lightweight binary models for classifying **illuminated folios in medieval manuscripts**. These models were developed at the [École nationale des chartes – PSL](https://www.chartes.psl.eu/) in the context of the [O.D.I.L. project](https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations).
These models classify manuscript pages as:
- **illuminated** (miniatures, historiated initials, decorated pages etc.)
- **non-illuminated** (plain text folio, printer marks, tables, cover, blank folios etc.)
# Use cases
The models are optimized to run **locally (CPU) or in the browser**, using **edge-compatible architectures** (MobileNet, MobileViT) and **ONNX inference**, for example to build **IIIF filter pipelines** or **specialized corpora**.
Try the demo web application on [Hugging Face Spaces](https://huggingface.co/spaces/ENC-PSL/Medieval-Illumination-Detector).
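
As an illustration of the IIIF filtering use case, the sketch below collects one image URL per canvas from a IIIF manifest, asking the image server for a copy already resized to the model input. It is a minimal sketch under assumptions this README does not state: the manifest follows the IIIF Presentation API 3 layout (canvases, then annotation pages, then painting annotations), each image body exposes an image service, and the size of 224 pixels is a placeholder for the `img_size` value in `preprocess.json`. The function name `canvas_image_urls` and the example manifest are hypothetical.

```python
from typing import Iterator


def canvas_image_urls(manifest: dict, size: int = 224) -> Iterator[tuple[str, str]]:
    """Yield (canvas_id, image_url) pairs for each painted image in a manifest.

    Assumes a IIIF Presentation API 3 manifest. `size` is a placeholder for
    the model input size (read it from preprocess.json in practice).
    """
    for canvas in manifest.get("items", []):
        for page in canvas.get("items", []):
            for anno in page.get("items", []):
                body = anno.get("body", {})
                if not isinstance(body, dict):
                    continue  # e.g. a list of choices: out of scope for this sketch
                services = body.get("service", [])
                if isinstance(services, dict):
                    services = [services]
                if not services:
                    continue  # no image service: skip rather than guess a URL
                # Image API 3 services use "id", Image API 2 services use "@id".
                base = services[0].get("id") or services[0].get("@id")
                if base:
                    # Image API pattern: {base}/{region}/{size}/{rotation}/{quality}.{format}
                    url = f"{base.rstrip('/')}/full/{size},{size}/0/default.jpg"
                    yield canvas["id"], url


# Hypothetical one-canvas manifest, for illustration only.
example_manifest = {
    "id": "https://example.org/manifest.json",
    "type": "Manifest",
    "items": [{
        "id": "https://example.org/canvas/1",
        "type": "Canvas",
        "items": [{
            "type": "AnnotationPage",
            "items": [{
                "type": "Annotation",
                "motivation": "painting",
                "body": {
                    "type": "Image",
                    "service": [{"id": "https://example.org/iiif/f1", "type": "ImageService3"}],
                },
            }],
        }],
    }],
}

urls = list(canvas_image_urls(example_manifest))
print(urls)
```

The `w,h` size form stretches the page to a square, which mirrors the plain `resize` call used in the Python examples of this README. Each URL can then be downloaded and passed to one of the classifiers, keeping only the canvases predicted as illuminated.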
# Models & Results
The fine-tuned models available in this repository are based on the following architectures:
- [MobileNetV2](https://huggingface.co/timm/mobilenetv2_100.ra_in1k)
- [MobileNetV3](https://huggingface.co/timm/mobilenetv3_small_100.lamb_in1k) (small and large versions)
- [MobileViT v2](https://huggingface.co/apple/mobilevitv2-1.0-imagenet1k-256)
| Architecture | Validation Accuracy | Test Accuracy |
|--------------|--------------------:|--------------:|
| MobileNetV2 | 0.995 | 0.982 |
| MobileNetV3 Small | 0.991 | 0.968 |
| MobileNetV3 Large | 1.0 | 0.986 |
| MobileViT v2 | 0.995 | 0.977 |
> These results should be interpreted with care. Although the models reach very high scores on the current splits, the task may be partially dataset-dependent.
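
To give a sense of how much uncertainty a test split of this size carries, the sketch below computes a 95% Wilson score interval for a reported accuracy. It assumes the test split holds 223 images (111 illuminated plus 112 non-illuminated, as listed in the Dataset section) and that a test accuracy of 0.986 corresponds to 220 correct predictions. The correct count is inferred from rounding and is not reported by the authors. `wilson_interval` is an illustrative helper, not part of this repository.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is about 95%)."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Assumption: 220 of 223 test images correct (about 0.986, MobileNetV3 Large).
low, high = wilson_interval(correct=220, total=223)
print(f"95% interval: [{low:.3f}, {high:.3f}]")
```

Under these assumptions the interval spans roughly 0.96 to 0.995, so the differences of one or two points between architectures in the table fall within the sampling noise of a split this small.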
# Labels
| Label ID | Label |
|---------|------|
| 0 | non_illuminated |
| 1 | illuminated |
# Dataset
Training data comes from: [ENC-PSL/odil-medieval-folio-illumination-bin-dataset](https://huggingface.co/datasets/ENC-PSL/odil-medieval-folio-illumination-bin-dataset)
## Distribution of data
- illuminated
  - train: 519
  - dev: 112
  - test: 111
- non_illuminated
  - train: 519
  - dev: 111
  - test: 112
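
The counts above can be summed to check class balance and split proportions. The short sketch below does only that arithmetic; the dictionary layout and variable names are illustrative.

```python
# Image counts per class and split, copied from the list above.
counts = {
    "illuminated": {"train": 519, "dev": 112, "test": 111},
    "non_illuminated": {"train": 519, "dev": 111, "test": 112},
}

total = sum(sum(splits.values()) for splits in counts.values())
per_class = {label: sum(splits.values()) for label, splits in counts.items()}
per_split = {
    split: sum(counts[label][split] for label in counts)
    for split in ("train", "dev", "test")
}

print(total)      # 1484 images overall
print(per_class)  # 742 images in each class
for split, n in per_split.items():
    print(f"{split}: {n} images ({n / total:.1%})")
```

The two classes are balanced at 742 images each, and the splits follow an approximately 70/15/15 ratio (1038 train, 223 dev, 223 test).
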
> **Data augmentation.** During training, data augmentation was applied to the training split only in order to improve robustness and reduce overfitting.
> The augmentation pipeline included random horizontal flips, small random rotations up to 5°, and light color jittering with brightness `0.12`, contrast `0.12`, saturation `0.08`, and hue `0.02`.
> Validation and test images were evaluated without augmentation.
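
To make these magnitudes concrete, the sketch below draws one set of augmentation parameters from the ranges described above. It assumes torchvision-style semantics, which this README does not confirm: a jitter value `v` for brightness, contrast, or saturation means a multiplicative factor drawn uniformly from `[1 - v, 1 + v]`, the hue value is an additive shift drawn from `[-v, v]`, and the horizontal flip is applied with probability 0.5. The function name `sample_augmentation` is hypothetical.

```python
import random


def sample_augmentation(rng: random.Random) -> dict:
    """Draw one set of augmentation parameters (torchvision-style semantics assumed)."""
    return {
        "hflip": rng.random() < 0.5,                    # assumed flip probability
        "rotation_deg": rng.uniform(-5.0, 5.0),         # small rotations up to 5 degrees
        "brightness": rng.uniform(1 - 0.12, 1 + 0.12),  # multiplicative factor
        "contrast": rng.uniform(1 - 0.12, 1 + 0.12),    # multiplicative factor
        "saturation": rng.uniform(1 - 0.08, 1 + 0.08),  # multiplicative factor
        "hue_shift": rng.uniform(-0.02, 0.02),          # additive shift
    }


rng = random.Random(0)  # fixed seed so the draw is reproducible
params = sample_augmentation(rng)
print(params)
```

Read this way, a brightness setting of `0.12` changes pixel intensity by at most 12% in either direction, which is consistent with the "light" jittering described above.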
# What counts as "illuminated"?
### Positive (illuminated)
Examples include:
- miniatures
- historiated initials
- decorative initials
- scientific diagrams
- maps
- decorated manuscript pages
### Negative (non-illuminated)
Examples include:
- plain text folios
- marginal decorations without images
- printer marks
- tables
- cover
- blank folios
- rubricated text without illumination
### Examples
| Illuminated | Not Illuminated |
|:-----------:|:---------------:|
| <img src="assets/illuminated/1.png" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/1.jpeg" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/2.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/2.jpg" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/3.png" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/3.jpg" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/4.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/4.jpg" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/5.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/5.png" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/6.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/6.jpg" width="150" height="150" style="object-fit:cover;"> |
# Usage
## Python — ONNX local
```bash
pip install onnxruntime pillow numpy
```
```python
import json
import numpy as np
import onnxruntime as ort
from PIL import Image
from pathlib import Path
run = Path("./mobilenet_v3_large")
cfg = json.loads((run / "inference_config.json").read_text())
pre = json.loads((run / "preprocess.json").read_text())
img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"]))
x = np.asarray(img).astype("float32") / 255.0
x = (x - np.array(pre["mean"])) / np.array(pre["std"])
x = x.transpose(2, 0, 1)[None].astype("float32")
sess = ort.InferenceSession(str(run / "onnx/model.onnx"))
logits = sess.run(None, {cfg["input_name"]: x})[0][0]
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()
p_illu = float(probs[cfg["positive_index"]])
label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination"
print(label, p_illu)
```
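
The snippet above reads several keys from the two JSON files. A small check like the following can surface a missing or malformed entry before inference starts. The key names are the ones used in this README's examples; the helper name `check_run_config` and the example values are hypothetical and do not reflect the actual contents of the shipped files.

```python
REQUIRED_CFG_KEYS = {"input_name", "positive_index", "positive_label", "threshold"}
REQUIRED_PRE_KEYS = {"img_size", "mean", "std"}


def check_run_config(cfg: dict, pre: dict) -> None:
    """Raise ValueError if a key used by the inference examples is missing."""
    missing_cfg = REQUIRED_CFG_KEYS - set(cfg)
    missing_pre = REQUIRED_PRE_KEYS - set(pre)
    if missing_cfg or missing_pre:
        raise ValueError(
            f"missing keys: inference_config={sorted(missing_cfg)}, "
            f"preprocess={sorted(missing_pre)}"
        )
    if len(pre["mean"]) != 3 or len(pre["std"]) != 3:
        raise ValueError("mean and std must have one value per RGB channel")
    if not 0.0 <= cfg["threshold"] <= 1.0:
        raise ValueError("threshold must be a probability in [0, 1]")


# Hypothetical values, for illustration only.
example_cfg = {"input_name": "input", "positive_index": 1,
               "positive_label": "illuminated", "threshold": 0.5}
example_pre = {"img_size": 224, "mean": [0.485, 0.456, 0.406],
               "std": [0.229, 0.224, 0.225]}
check_run_config(example_cfg, example_pre)  # returns None when everything is present
```

In the inference code, the call would go right after the two `json.loads` lines, as `check_run_config(cfg, pre)`.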
## Python — ONNX from Hugging Face
```bash
pip install huggingface_hub onnxruntime pillow numpy
```
```python
from huggingface_hub import snapshot_download
from pathlib import Path
repo = "lterriel/medieval-illumination-bin-classifier"
run_name = "final_mobilenetv3_large"
local_dir = Path(snapshot_download(
repo_id=repo,
allow_patterns=[
f"{run_name}/onnx/model.onnx",
f"{run_name}/preprocess.json",
f"{run_name}/inference_config.json",
],
)) / run_name
```
Then use the same ONNX code as above, replacing:
```python
run = Path("./mobilenet_v3_large")
```
with:
```python
run = local_dir
```
## Python — PyTorch / non-ONNX local
```bash
pip install torch torchvision pillow numpy
```
```python
import json
import torch
import numpy as np
from PIL import Image
from pathlib import Path
from torchvision import models
run = Path("./mobilenet_v3_large")
cfg = json.loads((run / "inference_config.json").read_text())
pre = json.loads((run / "preprocess.json").read_text())
model = models.mobilenet_v3_large(weights=None)
model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
model.load_state_dict(torch.load(run / "checkpoints/best.pt", map_location="cpu"))
model.eval()
img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"]))
x = np.asarray(img).astype("float32") / 255.0
x = (x - np.array(pre["mean"])) / np.array(pre["std"])
x = torch.tensor(x.transpose(2, 0, 1)[None]).float()
with torch.no_grad():
    logits = model(x)
probs = torch.softmax(logits, dim=1)[0]
p_illu = float(probs[cfg["positive_index"]])
label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination"
print(label, p_illu)
```
For another torchvision architecture, replace the model constructor:
- MobileNetV2
```python
model = models.mobilenet_v2(weights=None)
model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
```
- MobileNetV3 (small)
```python
model = models.mobilenet_v3_small(weights=None)
model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
```
## Python — PyTorch / non-ONNX from Hugging Face
```bash
pip install huggingface_hub torch torchvision pillow numpy
```
```python
from huggingface_hub import snapshot_download
from pathlib import Path
repo = "lterriel/medieval-illumination-bin-classifier"
run_name = "final_mobilenetv3_large"
run = Path(snapshot_download(
repo_id=repo,
allow_patterns=[
f"{run_name}/checkpoints/best.pt",
f"{run_name}/preprocess.json",
f"{run_name}/inference_config.json",
],
)) / run_name
```
Then use the same PyTorch code as above.
## JS (HF - ONNX)
```html
<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
<input type="file" id="file" accept="image/*">
<pre id="out"></pre>
<script type="module">
const run = "https://huggingface.co/lterriel/medieval-illumination-bin-classifier/resolve/main/final_mobilenetv3_large";
const cfg = await fetch(`${run}/inference_config.json`).then(r => r.json());
const pre = await fetch(`${run}/preprocess.json`).then(r => r.json());
const sess = await ort.InferenceSession.create(`${run}/onnx/model.onnx`);
function softmax(a) {
const m = Math.max(...a);
const e = a.map(x => Math.exp(x - m));
const s = e.reduce((x, y) => x + y, 0);
return e.map(x => x / s);
}
async function imageToTensor(file) {
const img = new Image();
img.src = URL.createObjectURL(file);
await img.decode();
const size = pre.img_size;
const canvas = document.createElement("canvas");
canvas.width = size;
canvas.height = size;
const ctx = canvas.getContext("2d");
ctx.drawImage(img, 0, 0, size, size);
const data = ctx.getImageData(0, 0, size, size).data;
const x = new Float32Array(1 * 3 * size * size);
for (let i = 0, p = 0; i < data.length; i += 4, p++) {
x[p] = (data[i] / 255 - pre.mean[0]) / pre.std[0];
x[size * size + p] = (data[i + 1] / 255 - pre.mean[1]) / pre.std[1];
x[2 * size * size + p] = (data[i + 2] / 255 - pre.mean[2]) / pre.std[2];
}
return new ort.Tensor("float32", x, [1, 3, size, size]);
}
document.querySelector("#file").onchange = async (e) => {
const tensor = await imageToTensor(e.target.files[0]);
const res = await sess.run({ [cfg.input_name]: tensor });
const logits = Array.from(res[cfg.output_name].data);
const probs = softmax(logits);
const pIllu = probs[cfg.positive_index];
const label = pIllu >= cfg.threshold ? cfg.positive_label : "non_illumination";
document.querySelector("#out").textContent = JSON.stringify({
label,
p_illumination: pIllu,
probs
}, null, 2);
};
</script>
```
# Training tools
All models are fine-tuned with img-clf-framework, a training framework for binary image classification pipelines. Check the [training repository here]()
# Citation
If you use these models in your research, please cite:
```bibtex
@software{terriel_bsicle_2026,
AUTHOR = {Terriel, Lucas and Jolivet, Vincent},
TITLE = {{BSICLE}: Binary System for Illuminated Folio Classification with Lightweight Engines},
YEAR = {2026},
PUBLISHER = {Hugging Face},
INSTITUTION = {{École nationale des chartes -- PSL}},
URL = {https://huggingface.co/ENC-PSL/medieval-illumination-bin-classifier},
NOTE = {Family of lightweight binary image classification models for detecting illuminated folios in medieval manuscripts, developed in the context of the O.D.I.L. project},
LICENSE = {apache-2.0},
VERSION = {0.0.1}
}
```
# Funding
<div style="display: flex; align-items: center; justify-content: center; gap: 20px; max-width: 800px; margin: 0 auto;">
<img src="./assets/odil-logo.png" width="180" alt="Logo ODIL" style="flex: 0 0 auto;">
<p style="text-align: justify; margin: 0;">
These models were developed at
<a href="https://www.chartes.psl.eu/" target="_blank" rel="noopener">
École nationale des chartes – PSL
</a>
in the context of the
<a href="https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations" target="_blank" rel="noopener">
O.D.I.L. project
</a>.
</p>
</div>