---
library_name: onnxruntime
pipeline_tag: image-classification
tags:
- onnx
- image-classification
- medieval-manuscripts
- illumination-detection
- mobilenet
- mobilevit
- glam
- iiif
- cultural-heritage
- digital-humanities
- medieval-folio
- medieval
- medieval-illuminations
- MobileNet
- MobileVit
license: apache-2.0
datasets:
- ENC-PSL/medieval-folio-illumination-bin-dataset
base_model:
- timm/mobilenetv3_small_100.lamb_in1k
- timm/mobilenetv3_large_100.ra_in1k
- timm/mobilenetv2_100.ra_in1k
- apple/mobilevitv2-1.0-imagenet1k-256
---

# BSICLE — Binary System for Illuminated Folio Classification with Lightweight Engines

BSICLE (pronounced bé-si-cle; /be.zikl/) is a family of lightweight binary models for classifying **illuminated folios in medieval manuscripts**. These models are developed at the [École nationale des chartes – PSL](https://www.chartes.psl.eu/) in the context of the [O.D.I.L. project](https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations).

These models classify manuscript pages as:

- **illuminated** (miniatures, historiated initials, decorated pages etc.)
- **non-illuminated** (plain text folio, printer marks, tables, cover, blank folios etc.)

# Use cases 

The models are optimized to run **locally (CPU) or in the browser**, using **edge-compatible architectures** (MobileNet, MobileViT) and **ONNX inference**, for example to build **IIIF filter pipelines** or **specialized corpora**.

Try the demo web application on [Hugging Face Spaces](https://huggingface.co/spaces/ENC-PSL/Medieval-Illumination-Detector).
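
As a rough sketch of the first step of an IIIF filter pipeline, the helper below collects one downscaled image URL per page from a manifest, so that each page can then be downloaded and classified. It assumes a IIIF Presentation API 2.x manifest (`sequences` / `canvases` / `images`) whose images expose an Image API service. The function name `iiif_image_urls` and the default size of 256 pixels are illustrative choices, not part of this repository.

```python
def iiif_image_urls(manifest: dict, size: int = 256) -> list[str]:
    """Return one Image API URL per page image of a IIIF Presentation 2.x manifest.

    Assumption: each image resource carries a `service` entry whose `@id`
    is the base URL of a IIIF Image API endpoint.
    """
    urls = []
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                service = image.get("resource", {}).get("service", {})
                # Some manifests list several services; keep the first one.
                if isinstance(service, list):
                    service = service[0] if service else {}
                base = service.get("@id")
                if base:
                    # "!w,h" asks the server for the largest image fitting in w x h.
                    urls.append(f"{base.rstrip('/')}/full/!{size},{size}/0/default.jpg")
    return urls
```

Requesting small images keeps downloads light; the classifier resizes every page to its own input size anyway.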

# Models & Results 

The fine-tuned models available in this repository are based on the following architectures:

- [MobileNetV2](https://huggingface.co/timm/mobilenetv2_100.ra_in1k)
- [MobileNetV3](https://huggingface.co/timm/mobilenetv3_small_100.lamb_in1k) (small and large versions)
- [MobileViT v2](https://huggingface.co/apple/mobilevitv2-1.0-imagenet1k-256)

| Architecture | Validation Accuracy | Test Accuracy |
|--------------|--------------------:|--------------:|
| MobileNetV2 | 0.995 | 0.982 |
| MobileNetV3 Small | 0.991 | 0.968 |
| MobileNetV3 Large | 1.0 | 0.986 |
| MobileViT v2 | 0.995 | 0.977 |

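> These results should be interpreted with care. Although the models reach very high scores on the current splits, those splits are small (223 images each for validation and test), so performance may be partially dataset-dependent and may not carry over unchanged to other collections.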
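
To give a feel for the uncertainty behind these numbers, the sketch below converts each reported test accuracy into a count of correct pages and a 95% Wilson score interval. It assumes that test accuracy was computed over all 223 test images listed under "Distribution of data" (111 illuminated + 112 non-illuminated); the Wilson interval is a standard choice for a binomial proportion, not something the authors report.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for about 95%)."""
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Assumption: accuracy was measured on the full test split.
N_TEST = 111 + 112

reported_test_accuracy = {
    "MobileNetV2": 0.982,
    "MobileNetV3 Small": 0.968,
    "MobileNetV3 Large": 0.986,
    "MobileViT v2": 0.977,
}

for name, acc in reported_test_accuracy.items():
    correct = round(acc * N_TEST)  # recover the number of correctly classified pages
    low, high = wilson_interval(correct, N_TEST)
    print(f"{name}: {correct}/{N_TEST} correct, 95% interval about [{low:.3f}, {high:.3f}]")
```

Under that assumption, the gap between the best and the weakest model corresponds to only a handful of pages, so the ranking between architectures should not be over-read.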

# Labels

| Label ID | Label |
|---------|------|
| 0 | non_illuminated |
| 1 | illuminated |

# Dataset

Training data comes from: [ENC-PSL/odil-medieval-folio-illumination-bin-dataset](https://huggingface.co/datasets/ENC-PSL/odil-medieval-folio-illumination-bin-dataset)

## Distribution of data

- illuminated
  - train: 519
  - dev  : 112
  - test : 111
- non_illuminated
  - train: 519
  - dev  : 111
  - test : 112

> **Data augmentation.** Augmentation was applied to the training split only, to improve robustness and reduce overfitting.
> The pipeline included random horizontal flips, small random rotations of up to 5°, and light color jittering (brightness `0.12`, contrast `0.12`, saturation `0.08`, hue `0.02`).
> Validation and test images were evaluated without augmentation.

# What counts as "illuminated"?

## Positive (illuminated)

Examples include:

- miniatures
- historiated initials
- decorative initials
- scientific diagrams
- maps
- decorated manuscript pages

## Negative (non-illuminated)

Examples include:

- plain text folios
- marginal decorations without images
- printer marks
- tables
- cover
- blank folios
- rubricated text without illumination

## Examples

| Illuminated | Not Illuminated |
|:-----------:|:---------------:|
| <img src="assets/illuminated/1.png" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/1.jpeg" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/2.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/2.jpg" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/3.png" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/3.jpg" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/4.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/4.jpg" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/5.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/5.png" width="150" height="150" style="object-fit:cover;"> |
| <img src="assets/illuminated/6.jpg" width="150" height="150" style="object-fit:cover;"> | <img src="assets/not_illuminated/6.jpg" width="150" height="150" style="object-fit:cover;"> |

# Usage

## Python — ONNX local

```bash
pip install onnxruntime pillow numpy
```

```python
import json
import numpy as np
import onnxruntime as ort
from PIL import Image
from pathlib import Path

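# Directory holding one exported model (ONNX file plus its JSON configs)
run = Path("./mobilenet_v3_large")

cfg = json.loads((run / "inference_config.json").read_text())
pre = json.loads((run / "preprocess.json").read_text())

# Resize to the size given in preprocess.json, scale to [0, 1], normalize per channel
img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"]))
x = np.asarray(img).astype("float32") / 255.0
x = (x - np.array(pre["mean"])) / np.array(pre["std"])
# HWC -> CHW, add a batch dimension, cast back to float32 for ONNX Runtime
x = x.transpose(2, 0, 1)[None].astype("float32")

sess = ort.InferenceSession(str(run / "onnx/model.onnx"))
logits = sess.run(None, {cfg["input_name"]: x})[0][0]

# Numerically stable softmax over the class logits
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()

# Compare the positive-class probability with the threshold from the config
p_illu = float(probs[cfg["positive_index"]])
label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination"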

print(label, p_illu)
```
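
The final step of the snippet above (softmax, then comparison with a threshold) can be isolated in a small helper, which is convenient when classifying many pages. The version below is a dependency-free sketch: the function name `decide_label` is hypothetical, and in practice the arguments would come from `inference_config.json`.

```python
import math


def decide_label(logits, positive_index, threshold, positive_label, negative_label):
    """Return (label, positive-class probability) from raw logits."""
    # Numerically stable softmax: subtract the max before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    p_pos = exps[positive_index] / sum(exps)
    # The page is positive only if the probability reaches the threshold.
    label = positive_label if p_pos >= threshold else negative_label
    return label, p_pos


# Illustrative values only; label names are taken from the Labels table.
label, p = decide_label(
    logits=[2.0, -1.0],
    positive_index=1,
    threshold=0.5,
    positive_label="illuminated",
    negative_label="non_illuminated",
)
print(label, round(p, 3))
```

Raising the threshold makes the classifier more conservative (fewer false positives, more missed illuminations); lowering it does the opposite.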

## Python — ONNX from Hugging Face

```bash
pip install huggingface_hub onnxruntime pillow numpy
```

```python
from huggingface_hub import snapshot_download
from pathlib import Path

repo = "lterriel/medieval-illumination-bin-classifier"
run_name = "final_mobilenetv3_large"
local_dir = Path(snapshot_download(
    repo_id=repo,
    allow_patterns=[
        f"{run_name}/onnx/model.onnx",
        f"{run_name}/preprocess.json",
        f"{run_name}/inference_config.json",
    ],
)) / run_name
```

Then use the same ONNX code as above, replacing:

```python
run = Path("./mobilenet_v3_large")
```

with:

```python
run = local_dir
```

## Python — PyTorch / non-ONNX local

```bash
pip install torch torchvision pillow numpy
```

```python
import json
import torch
import numpy as np
from PIL import Image
from pathlib import Path
from torchvision import models

# Directory holding one trained model (checkpoint plus its JSON configs)
run = Path("./mobilenet_v3_large")

cfg = json.loads((run / "inference_config.json").read_text())
pre = json.loads((run / "preprocess.json").read_text())

# Rebuild the architecture with a 2-class head, then load the fine-tuned weights
model = models.mobilenet_v3_large(weights=None)
model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
model.load_state_dict(torch.load(run / "checkpoints/best.pt", map_location="cpu"))
model.eval()

# Same preprocessing as the ONNX example: resize, scale to [0, 1], normalize
img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"]))
x = np.asarray(img).astype("float32") / 255.0
x = (x - np.array(pre["mean"])) / np.array(pre["std"])
# HWC -> CHW, add a batch dimension, convert to a float32 tensor
x = torch.tensor(x.transpose(2, 0, 1)[None]).float()

with torch.no_grad():
    logits = model(x)
    probs = torch.softmax(logits, dim=1)[0]

# Compare the positive-class probability with the threshold from the config
p_illu = float(probs[cfg["positive_index"]])
label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination"

print(label, p_illu)
```

For another torchvision architecture, replace the model constructor:

- MobileNetV2

```python
model = models.mobilenet_v2(weights=None)
model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
```

- MobileNetV3 (small)

```python
model = models.mobilenet_v3_small(weights=None)
model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
```

## Python — PyTorch / non-ONNX from Hugging Face

```bash
pip install huggingface_hub torch torchvision pillow numpy
```

```python
from huggingface_hub import snapshot_download
from pathlib import Path

repo = "lterriel/medieval-illumination-bin-classifier"
run_name = "final_mobilenetv3_large"

run = Path(snapshot_download(
    repo_id=repo,
    allow_patterns=[
        f"{run_name}/checkpoints/best.pt",
        f"{run_name}/preprocess.json",
        f"{run_name}/inference_config.json",
    ],
)) / run_name
```

Then use the same PyTorch code as above.

## JS (HF - ONNX)

```html
<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
<input type="file" id="file" accept="image/*">
<pre id="out"></pre>

<script type="module">
const run = "https://huggingface.co/lterriel/medieval-illumination-bin-classifier/resolve/main/final_mobilenetv3_large";

const cfg = await fetch(`${run}/inference_config.json`).then(r => r.json());
const pre = await fetch(`${run}/preprocess.json`).then(r => r.json());
const sess = await ort.InferenceSession.create(`${run}/onnx/model.onnx`);

function softmax(a) {
  const m = Math.max(...a);
  const e = a.map(x => Math.exp(x - m));
  const s = e.reduce((x, y) => x + y, 0);
  return e.map(x => x / s);
}

async function imageToTensor(file) {
  const img = new Image();
  img.src = URL.createObjectURL(file);
  await img.decode();

  const size = pre.img_size;
  const canvas = document.createElement("canvas");
  canvas.width = size;
  canvas.height = size;

  const ctx = canvas.getContext("2d");
  ctx.drawImage(img, 0, 0, size, size);

  const data = ctx.getImageData(0, 0, size, size).data;
  const x = new Float32Array(1 * 3 * size * size);

  for (let i = 0, p = 0; i < data.length; i += 4, p++) {
    x[p] = (data[i] / 255 - pre.mean[0]) / pre.std[0];
    x[size * size + p] = (data[i + 1] / 255 - pre.mean[1]) / pre.std[1];
    x[2 * size * size + p] = (data[i + 2] / 255 - pre.mean[2]) / pre.std[2];
  }

  return new ort.Tensor("float32", x, [1, 3, size, size]);
}

document.querySelector("#file").onchange = async (e) => {
  const tensor = await imageToTensor(e.target.files[0]);
  const res = await sess.run({ [cfg.input_name]: tensor });

  const logits = Array.from(res[cfg.output_name].data);
  const probs = softmax(logits);

  const pIllu = probs[cfg.positive_index];
  const label = pIllu >= cfg.threshold ? cfg.positive_label : "non_illumination";

  document.querySelector("#out").textContent = JSON.stringify({
    label,
    p_illumination: pIllu,
    probs
  }, null, 2);
};
</script>
```

# Training tools 

All models are fine-tuned with img-clf-framework, a training framework for binary image classification pipelines. Check the [training repository here]()

# Citation

If you use these models in your research, please cite:

```bibtex
@software{terriel_bsicle_2026,
  AUTHOR       = {Terriel, Lucas and Jolivet, Vincent},
  TITLE        = {{BSICLE}: Binary System for Illuminated Folio Classification with Lightweight Engines},
  YEAR         = {2026},
  PUBLISHER    = {Hugging Face},
  INSTITUTION  = {{École nationale des chartes -- PSL}},
  URL          = {https://huggingface.co/ENC-PSL/medieval-illumination-bin-classifier},
  NOTE         = {Family of lightweight binary image classification models for detecting illuminated folios in medieval manuscripts, developed in the context of the O.D.I.L. project},
  LICENSE      = {apache-2.0},
  VERSION      = {0.0.1}
}
```

# Funding

<div style="display: flex; align-items: center; justify-content: center; gap: 20px; max-width: 800px; margin: 0 auto;">
  <img src="./assets/odil-logo.png" width="180" alt="Logo ODIL" style="flex: 0 0 auto;">

  <p style="text-align: justify; margin: 0;">
    These models were developed at
    <a href="https://www.chartes.psl.eu/" target="_blank" rel="noopener">
      École nationale des chartes – PSL
    </a>
    in the context of the
    <a href="https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations" target="_blank" rel="noopener">
      O.D.I.L. project
    </a>.
  </p>
</div>