Update README.md

README.md +363 -0

@@ -1,3 +1,366 @@
 ---
 license: apache-2.0
 ---

 ---
+library_name: onnxruntime
+pipeline_tag: image-classification
+tags:
+- onnx
+- image-classification
+- medieval-manuscripts
+- illumination-detection
+- mobilenet
+- mobilevit
+- glam
+- iiif
+- cultural-heritage
+- digital-humanities
+- medieval-folio
+- medieval
+- medieval-illuminations
+- MobileNet
+- MobileVit
 license: apache-2.0
+datasets:
+- ENC-PSL/medieval-folio-illumination-bin-dataset
+base_model:
+- timm/mobilenetv3_small_100.lamb_in1k
+- timm/mobilenetv3_large_100.ra_in1k
+- timm/mobilenetv2_100.ra_in1k
+- apple/mobilevitv2-1.0-imagenet1k-256
 ---
+# BSICLE — Binary System for Illuminated Folio Classification with Lightweight Engines
+BSICLE (pronounced bé-si-cle; /be.zikl/) is a family of lightweight binary models for classifying **illuminated folios in medieval manuscripts**. These models are developed at the [École nationale des chartes – PSL](https://www.chartes.psl.eu/) in the context of the [O.D.I.L. project](https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations).
+These models classify manuscript pages as:
+- **illuminated** (miniatures, historiated initials, decorated pages, etc.)
+- **non-illuminated** (plain text folios, printer marks, tables, covers, blank folios, etc.)
+# Use cases
+The models are optimized to run **locally (CPU) or in the browser**, using **edge-compatible architectures** (MobileNet, MobileViT) and **ONNX inference**, for example to build **IIIF filter pipelines** or **specialized corpora**.
+:octocat: For an example of use, check the demo web application [on GitHub]() or [on HF Spaces]()
+# Models & Results
+The fine-tuned models available in this repository are based on the following architectures:
+- [MobileNetV2](timm/mobilenetv2_100.ra_in1k)
+- [MobileNetV3](timm/mobilenetv3_small_100.lamb_in1k) (small and large versions)
+- [MobileViT v2](apple/mobilevitv2-1.0-imagenet1k-256)
+| Architecture | Validation Accuracy | Test Accuracy | F1 | Precision | Recall | AUC |
+|--------------|--------------------:|--------------:|---:|----------:|-------:|----:|
+| MobileNetV2 | 1.0000 | 0.9776 | 0.9770 | 1.0000 | 0.9550 | 0.9993 |
+| MobileNetV3 Small | 1.0000 | 0.9731 | 0.9727 | 0.9817 | 0.9640 | 0.9984 |
+| MobileNetV3 Large | 1.0000 | 0.9865 | 0.9864 | 0.9909 | 0.9820 | 0.9992 |
+| MobileViT v2 | 0.9955 | 0.9776 | 0.9770 | 1.0000 | 0.9550 | 0.9992 |
+> This repository contains two model variants: models prefixed with "final_" are fine-tuned with no test set held out.
+> These results should be interpreted with care. Although the models reach very high scores on the current splits, the task may be partially dataset-dependent.
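The test-set figures in the table can be cross-checked from first principles. The sketch below recomputes accuracy, precision, recall, and F1 from confusion-matrix counts. The counts themselves are an assumption: they are not published in this card and were inferred from the reported metrics together with the test split sizes listed in the Dataset section (111 illuminated, 112 non-illuminated).

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute standard binary classification metrics from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# ASSUMED counts: inferred from the table and the test split sizes,
# not taken from any published confusion matrix.
ASSUMED_COUNTS = {
    "MobileNetV2": dict(tp=106, fp=0, fn=5, tn=112),
    "MobileNetV3 Small": dict(tp=107, fp=2, fn=4, tn=110),
    "MobileNetV3 Large": dict(tp=109, fp=1, fn=2, tn=111),
}

for name, counts in ASSUMED_COUNTS.items():
    metrics = binary_metrics(**counts)
    print(name, {k: round(v, 4) for k, v in metrics.items()})
```

Under these assumed counts, the gap between the best and weakest model on the test set comes down to a handful of folios, which is one concrete reason to read the scores with the caution stated above.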
+# Labels
+| Label ID | Label |
+|---------|------|
+| 0 | non_illuminated |
+| 1 | illuminated |
+# Dataset
+Training data comes from: [ENC-PSL/odil-medieval-folio-illumination-bin-dataset](https://huggingface.co/datasets/ENC-PSL/odil-medieval-folio-illumination-bin-dataset)
+## Distribution of data
+- illuminated
+  - train: 519
+  - dev  : 112
+  - test : 111
+- non_illuminated
+  - train: 519
+  - dev  : 111
+  - test : 112
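The split sizes above correspond to roughly a 70/15/15 partition with near-even class balance in every split. A quick arithmetic check, using only the counts listed in this section (the helper name `split_summary` is illustrative):

```python
# Per-class image counts copied from the distribution listed above.
SPLITS = {
    "illuminated": {"train": 519, "dev": 112, "test": 111},
    "non_illuminated": {"train": 519, "dev": 111, "test": 112},
}


def split_summary(splits: dict) -> dict:
    """Per split: total size, share of the whole dataset, and positive-class ratio."""
    total = sum(sum(per_split.values()) for per_split in splits.values())
    summary = {}
    for split in ("train", "dev", "test"):
        n_pos = splits["illuminated"][split]
        n = n_pos + splits["non_illuminated"][split]
        summary[split] = {"size": n, "share": n / total, "positive_ratio": n_pos / n}
    return summary


for split, stats in split_summary(SPLITS).items():
    print(f"{split}: {stats['size']} images, {stats['share']:.1%} of data, "
          f"{stats['positive_ratio']:.1%} illuminated")
```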
+# What counts as "illuminated"?
+### Positive (illuminated)
+Examples include:
+- miniatures
+- historiated initials
+- decorative initials
+- scientific diagrams
+- maps
+- decorated manuscript pages
+### Negative (non-illuminated)
+Examples include:
+- plain text folios
+- marginal decorations without images
+- printer marks
+- tables
+- cover
+- blank folios
+- rubricated text without illumination
+### Examples
+| Illuminated | Not Illuminated |
+|:-----------:|:---------------:|
+| ![illuminated](illuminated_0.jpg) | ![not illuminated](not_illuminated_0.jpg) |
+|  |  |
+# Usage
+## Python — ONNX local
+```bash
+pip install onnxruntime pillow numpy
+```
+```python
+import json
+from pathlib import Path
+
+import numpy as np
+import onnxruntime as ort
+from PIL import Image
+
+# Directory holding the exported model and its two JSON config files
+run = Path("./mobilenet_v3_large")
+cfg = json.loads((run / "inference_config.json").read_text())
+pre = json.loads((run / "preprocess.json").read_text())
+
+# Load the page, resize to the configured input size, scale to [0, 1], then normalize
+img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"]))
+x = np.asarray(img).astype("float32") / 255.0
+x = (x - np.array(pre["mean"])) / np.array(pre["std"])
+# HWC -> NCHW with a batch dimension of 1
+x = x.transpose(2, 0, 1)[None].astype("float32")
+
+sess = ort.InferenceSession(str(run / "onnx/model.onnx"))
+logits = sess.run(None, {cfg["input_name"]: x})[0][0]
+
+# Numerically stable softmax over the two class logits
+probs = np.exp(logits - logits.max())
+probs = probs / probs.sum()
+
+# Apply the decision threshold stored in the inference config
+p_illu = float(probs[cfg["positive_index"]])
+label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination"
+print(label, p_illu)
+```
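The final lines of the snippet above turn two raw logits into a label. The dependency-free sketch below isolates that step so the effect of the threshold is easy to see. The values used here for `positive_index`, `threshold`, and both label strings are placeholders chosen for illustration; the real ones come from `inference_config.json`.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide(logits, positive_index=1, threshold=0.5,
           positive_label="illuminated", negative_label="non_illuminated"):
    """Return (label, p_positive): positive only if its probability reaches the threshold."""
    p_pos = softmax(logits)[positive_index]
    label = positive_label if p_pos >= threshold else negative_label
    return label, p_pos


# Hypothetical logits: the positive class passes at 0.5 but not at a stricter 0.9.
print(decide([0.2, 1.4]))
print(decide([0.2, 1.4], threshold=0.9))
```

Raising the threshold trades recall for precision, which may suit corpus building where false positives are costly; lowering it does the opposite.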
+## Python — ONNX from Hugging Face
+```bash
+pip install huggingface_hub onnxruntime pillow numpy
+```
+```python
+from huggingface_hub import snapshot_download
+from pathlib import Path
+repo = "lterriel/medieval-illumination-bin-classifier"
+run_name = "final_mobilenetv3_large"
+local_dir = Path(snapshot_download(
+    repo_id=repo,
+    allow_patterns=[
+        f"{run_name}/onnx/model.onnx",
+        f"{run_name}/preprocess.json",
+        f"{run_name}/inference_config.json",
+    ],
+)) / run_name
+```
+Then use the same ONNX code as above, replacing:
+```python
+run = Path("./mobilenet_v3_large")
+```
+with:
+```python
+run = local_dir
+```
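A mistyped `run_name` can leave `local_dir` without the files the inference code expects, so it may be worth checking the directory before creating a session. A minimal sketch, assuming the three-file layout used in the download snippet above; `missing_files` is a hypothetical helper, not part of `huggingface_hub`.

```python
from pathlib import Path

# Relative paths taken from the allow_patterns list in the download snippet.
REQUIRED_FILES = ("onnx/model.onnx", "preprocess.json", "inference_config.json")


def missing_files(run_dir, required=REQUIRED_FILES):
    """Return the required relative paths that are absent from run_dir."""
    run_dir = Path(run_dir)
    return [rel for rel in required if not (run_dir / rel).is_file()]


# Example with a directory that does not exist: every file is reported missing.
print(missing_files("./no_such_run_directory"))
```

In practice one would call `missing_files(local_dir)` right after the download and raise an error if the returned list is not empty.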
+## Python — PyTorch / non-ONNX local
+```bash
+pip install torch torchvision pillow numpy
+```
+```python
+import json
+from pathlib import Path
+
+import numpy as np
+import torch
+from PIL import Image
+from torchvision import models
+
+run = Path("./mobilenet_v3_large")
+cfg = json.loads((run / "inference_config.json").read_text())
+pre = json.loads((run / "preprocess.json").read_text())
+
+# Rebuild the architecture with a 2-class head, then load the fine-tuned weights
+model = models.mobilenet_v3_large(weights=None)
+model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
+model.load_state_dict(torch.load(run / "checkpoints/best.pt", map_location="cpu"))
+model.eval()
+
+# Same preprocessing as the ONNX example: resize, scale to [0, 1], normalize, HWC -> NCHW
+img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"]))
+x = np.asarray(img).astype("float32") / 255.0
+x = (x - np.array(pre["mean"])) / np.array(pre["std"])
+x = torch.tensor(x.transpose(2, 0, 1)[None]).float()
+
+# Inference without gradient tracking
+with torch.no_grad():
+    logits = model(x)
+    probs = torch.softmax(logits, dim=1)[0]
+
+p_illu = float(probs[cfg["positive_index"]])
+label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination"
+print(label, p_illu)
+```
+For another torchvision architecture, replace the model constructor:
+- MobileNetV2
+```python
+model = models.mobilenet_v2(weights=None)
+model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
+```
+- MobileNetV3 (small)
+```python
+model = models.mobilenet_v3_small(weights=None)
+model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
+```
+## Python — PyTorch / non-ONNX from Hugging Face
+```bash
+pip install huggingface_hub torch torchvision pillow numpy
+```
+```python
+from huggingface_hub import snapshot_download
+from pathlib import Path
+repo = "lterriel/medieval-illumination-bin-classifier"
+run_name = "final_mobilenetv3_large"
+run = Path(snapshot_download(
+    repo_id=repo,
+    allow_patterns=[
+        f"{run_name}/checkpoints/best.pt",
+        f"{run_name}/preprocess.json",
+        f"{run_name}/inference_config.json",
+    ],
+)) / run_name
+```
+Then use the same PyTorch code as above.
+## JS (HF - ONNX)
+```html
+<script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
+<input type="file" id="file" accept="image/*">
+<pre id="out"></pre>
+<script type="module">
+const run = "https://huggingface.co/lterriel/medieval-illumination-bin-classifier/resolve/main/final_mobilenetv3_large";
+const cfg = await fetch(`${run}/inference_config.json`).then(r => r.json());
+const pre = await fetch(`${run}/preprocess.json`).then(r => r.json());
+const sess = await ort.InferenceSession.create(`${run}/onnx/model.onnx`);
+function softmax(a) {
+  const m = Math.max(...a);
+  const e = a.map(x => Math.exp(x - m));
+  const s = e.reduce((x, y) => x + y, 0);
+  return e.map(x => x / s);
+}
+async function imageToTensor(file) {
+  const img = new Image();
+  img.src = URL.createObjectURL(file);
+  await img.decode();
+  const size = pre.img_size;
+  const canvas = document.createElement("canvas");
+  canvas.width = size;
+  canvas.height = size;
+  const ctx = canvas.getContext("2d");
+  ctx.drawImage(img, 0, 0, size, size);
+  const data = ctx.getImageData(0, 0, size, size).data;
+  const x = new Float32Array(1 * 3 * size * size);
+  for (let i = 0, p = 0; i < data.length; i += 4, p++) {
+    x[p] = (data[i] / 255 - pre.mean[0]) / pre.std[0];
+    x[size * size + p] = (data[i + 1] / 255 - pre.mean[1]) / pre.std[1];
+    x[2 * size * size + p] = (data[i + 2] / 255 - pre.mean[2]) / pre.std[2];
+  }
+  return new ort.Tensor("float32", x, [1, 3, size, size]);
+}
+document.querySelector("#file").onchange = async (e) => {
+  const tensor = await imageToTensor(e.target.files[0]);
+  const res = await sess.run({ [cfg.input_name]: tensor });
+  const logits = Array.from(res[cfg.output_name].data);
+  const probs = softmax(logits);
+  const pIllu = probs[cfg.positive_index];
+  const label = pIllu >= cfg.threshold ? cfg.positive_label : "non_illumination";
+  document.querySelector("#out").textContent = JSON.stringify({
+    label,
+    p_illumination: pIllu,
+    probs
+  }, null, 2);
+};
+</script>
+```
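The loop in `imageToTensor` converts the canvas's interleaved RGBA bytes into a planar, normalized `[1, 3, H, W]` buffer. For readers more at home in Python, the same index arithmetic can be sketched without any dependencies. The mean and std values in the example call are placeholders; the real ones come from `preprocess.json`.

```python
def rgba_to_chw(data, size, mean, std):
    """Convert interleaved RGBA bytes (row-major, size x size) into a flat,
    normalized channel-first list of length 3 * size * size. Alpha is dropped."""
    plane = size * size
    out = [0.0] * (3 * plane)
    for p in range(plane):
        i = 4 * p  # offset of pixel p in the RGBA buffer
        for c in range(3):
            # Channel c of pixel p lands in plane c, at position p
            out[c * plane + p] = (data[i + c] / 255 - mean[c]) / std[c]
    return out


# A 2x2 image: red, green, blue, and white pixels, all fully opaque.
pixels = [255, 0, 0, 255,   0, 255, 0, 255,   0, 0, 255, 255,   255, 255, 255, 255]
# Placeholder normalization (identity) so the layout is easy to read.
print(rgba_to_chw(pixels, size=2, mean=[0, 0, 0], std=[1, 1, 1]))
```

The printed list contains the full red plane first, then green, then blue, which is the layout that the `[1, 3, size, size]` tensor shape expects.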
+# Training tools
+All models are fine-tuned with img-clf-framework, a training framework for binary image classification pipelines. Check the [training repository here]().
+# Citation
+If you use these models in your research, please cite:
+```bibtex
+@software{terriel_bsicle_2026,
+  AUTHOR       = {Terriel, Lucas and Jolivet, Vincent},
+  TITLE        = {{BSICLE}: Binary System for Illuminated Folio Classification with Lightweight Engines},
+  YEAR         = {2026},
+  PUBLISHER    = {Hugging Face},
+  INSTITUTION  = {{École nationale des chartes -- PSL}},
+  URL          = {https://huggingface.co/ENC-PSL/medieval-illumination-bin-classifier},
+  NOTE         = {Family of lightweight binary image classification models for detecting illuminated folios in medieval manuscripts, developed in the context of the O.D.I.L. project},
+  LICENSE      = {apache-2.0},
+  VERSION      = {0.0.1}
+}
+```
+# Funding
+<div style="display: flex; align-items: center; justify-content: center; text-align: justify; gap: 20px; max-width: 800px; margin: auto;">
+    <img src="assets/odil-logo.png" width="200" alt="Logo ODIL" align="left">
+  <p style="text-align: justify; margin-top:-20px;">
+  <br>
+These models are developed at the [École nationale des chartes – PSL](https://www.chartes.psl.eu/) in the context of the [O.D.I.L. project](https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations).
+    </p>
+    <br>
+</div>