lterriel committed on
Commit
8bab642
·
verified ·
1 Parent(s): 7983f89

Update README.md

Files changed (1)
  1. README.md +363 -0
README.md CHANGED
@@ -1,3 +1,366 @@
1
  ---
2
+ library_name: onnxruntime
3
+ pipeline_tag: image-classification
4
+ tags:
5
+ - onnx
6
+ - image-classification
7
+ - medieval-manuscripts
8
+ - illumination-detection
9
+ - mobilenet
10
+ - mobilevit
11
+ - glam
12
+ - iiif
13
+ - cultural-heritage
14
+ - digital-humanities
15
+ - medieval-folio
16
+ - medieval
17
+ - medieval-illuminations
18
+ - MobileNet
19
+ - MobileVit
20
  license: apache-2.0
21
+ datasets:
22
+ - ENC-PSL/medieval-folio-illumination-bin-dataset
23
+ base_model:
24
+ - timm/mobilenetv3_small_100.lamb_in1k
25
+ - timm/mobilenetv3_large_100.ra_in1k
26
+ - timm/mobilenetv2_100.ra_in1k
27
+ - apple/mobilevitv2-1.0-imagenet1k-256
28
  ---
29
+
30
+ # BSICLE — Binary System for Illuminated Folio Classification with Lightweight Engines
31
+
32
+ BSICLE (pronounced bé-si-cle; /be.zikl/) is a family of lightweight binary models for classifying **illuminated folios in medieval manuscripts**. These models are developed at the [École nationale des chartes – PSL](https://www.chartes.psl.eu/) in the context of the [O.D.I.L. project](https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations).
33
+
34
+ These models classify manuscript pages as:
35
+
36
+ - **illuminated** (miniatures, historiated initials, decorated pages, etc.)
37
+ - **non-illuminated** (plain-text folios, printer marks, tables, covers, blank folios, etc.)
38
+
39
+ # Use cases
40
+
41
+ Models are optimized to run **locally (CPU) or in the browser** using **edge-compatible architectures** (MobileNet, MobileViT) and **ONNX inference**, for example to build **IIIF filter pipelines** or **specialized corpora**.
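
As a rough illustration of the IIIF filter idea, the sketch below walks a IIIF Presentation API 2.x manifest that has already been parsed into a dict, and keeps the canvases whose image a classifier flags as illuminated. The helper names `canvas_image_urls` and `filter_illuminated`, the `is_illuminated` callable, and the `example.org` URLs are hypothetical and not part of this repository. In practice the callable would download the image and run one of the ONNX models.

```python
from __future__ import annotations

from typing import Callable, Iterator


def canvas_image_urls(manifest: dict) -> Iterator[tuple[str, str]]:
    """Yield (canvas label, image URL) pairs from a IIIF Presentation 2.x manifest."""
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                url = image.get("resource", {}).get("@id")
                if url:
                    yield canvas.get("label", ""), url


def filter_illuminated(manifest: dict, is_illuminated: Callable[[str], bool]) -> list[str]:
    """Return the labels of canvases whose image is classified as illuminated."""
    return [label for label, url in canvas_image_urls(manifest) if is_illuminated(url)]


# Toy manifest and a stand-in classifier, for illustration only.
toy_manifest = {
    "sequences": [{
        "canvases": [
            {"label": "f. 1r", "images": [{"resource": {"@id": "https://example.org/iiif/f1r/full/full/0/default.jpg"}}]},
            {"label": "f. 1v", "images": [{"resource": {"@id": "https://example.org/iiif/f1v/full/full/0/default.jpg"}}]},
        ]
    }]
}
kept = filter_illuminated(toy_manifest, lambda url: "f1r" in url)
print(kept)
```

Passing the classifier as a callable keeps the manifest traversal independent of the inference backend, so the same filter could sit in front of ONNX Runtime or PyTorch.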
42
+
43
+ :octocat: For an example of use, check the demo web application [on GitHub]() or [on HF Spaces]()
44
+
45
+ # Models & Results
46
+
47
+ The finetuned models available in this repository are based on the following architectures:
48
+
49
+ - [MobileNetV2](https://huggingface.co/timm/mobilenetv2_100.ra_in1k)
50
+ - [MobileNetV3](https://huggingface.co/timm/mobilenetv3_small_100.lamb_in1k) (small and large versions)
51
+ - [MobileViT v2](https://huggingface.co/apple/mobilevitv2-1.0-imagenet1k-256)
52
+
53
+ | Architecture | Validation Accuracy | Test Accuracy | F1 | Precision | Recall | AUC |
54
+ |--------------|--------------------:|--------------:|---:|----------:|-------:|----:|
55
+ | MobileNetV2 | 1.0000 | 0.9776 | 0.9770 | 1.0000 | 0.9550 | 0.9993 |
56
+ | MobileNetV3 Small | 1.0000 | 0.9731 | 0.9727 | 0.9817 | 0.9640 | 0.9984 |
57
+ | MobileNetV3 Large | 1.0000 | 0.9865 | 0.9864 | 0.9909 | 0.9820 | 0.9992 |
58
+ | MobileViT v2 | 0.9955 | 0.9776 | 0.9770 | 1.0000 | 0.9550 | 0.9992 |
59
+
60
+ > This repository contains two model variants: models whose names are prefixed with "final_" were finetuned without a held-out test set.
61
+
62
+ > These results should be interpreted with care. Although the models reach very high scores on the current splits, the task may be partially dataset-dependent.
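
The test columns in the table can be cross-checked against the test split sizes listed in the Dataset section (111 illuminated, 112 non-illuminated). The sketch below recomputes the metrics from confusion-matrix counts. These counts are not reported in this README: they are an assumption, inferred as the integer counts that reproduce the table to four decimals. MobileViT v2 has the same test scores as MobileNetV2, so it would share its counts.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, F1, precision and recall, rounded to 4 decimals."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    values = {"accuracy": accuracy, "f1": f1, "precision": precision, "recall": recall}
    return {name: round(value, 4) for name, value in values.items()}


# Inferred (not reported) test-set counts; "illuminated" is the positive class.
inferred_counts = {
    "MobileNetV2": dict(tp=106, fp=0, fn=5, tn=112),
    "MobileNetV3 Small": dict(tp=107, fp=2, fn=4, tn=110),
    "MobileNetV3 Large": dict(tp=109, fp=1, fn=2, tn=111),
}

for name, counts in inferred_counts.items():
    print(name, binary_metrics(**counts))
```

Under this reconstruction the models differ by only a handful of test images, which is one more reason to read the ranking between architectures with care.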
63
+
64
+ # Labels
65
+
66
+ | Label ID | Label |
67
+ |---------|------|
68
+ | 0 | non_illuminated |
69
+ | 1 | illuminated |
70
+
71
+ # Dataset
72
+
73
+ Training data comes from: [ENC-PSL/odil-medieval-folio-illumination-bin-dataset](https://huggingface.co/datasets/ENC-PSL/odil-medieval-folio-illumination-bin-dataset)
74
+
75
+ ## Distribution of data
76
+
77
+ - illuminated
78
+   - train: 519
79
+   - dev: 112
80
+   - test: 111
81
+ - non_illuminated
82
+   - train: 519
83
+   - dev: 111
84
+   - test: 112
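
The counts above describe a class-balanced dataset with a split of roughly 70/15/15. A short sketch that derives the totals and split shares from the listed numbers (the variable names are illustrative only):

```python
# Image counts per class and split, copied from the list above.
splits = {
    "illuminated": {"train": 519, "dev": 112, "test": 111},
    "non_illuminated": {"train": 519, "dev": 111, "test": 112},
}

# Total per class, total overall, and the share of each split.
per_class = {label: sum(counts.values()) for label, counts in splits.items()}
total = sum(per_class.values())
per_split = {
    name: sum(counts[name] for counts in splits.values())
    for name in ("train", "dev", "test")
}
shares = {name: round(n / total, 3) for name, n in per_split.items()}

print(per_class)   # both classes have the same number of images
print(total, per_split, shares)
```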
85
+
86
+ # What counts as "illuminated"?
87
+
88
+ ### Positive (illuminated)
89
+
90
+ Examples include:
91
+
92
+ - miniatures
93
+ - historiated initials
94
+ - decorative initials
95
+ - scientific diagrams
96
+ - maps
97
+ - decorated manuscript pages
98
+
99
+
100
+ ### Negative (non-illuminated)
101
+
102
+ Examples include:
103
+
104
+ - plain text folios
105
+ - marginal decorations without images
106
+ - printer marks
107
+ - tables
108
+ - covers
109
+ - blank folios
110
+ - rubricated text without illumination
111
+
112
+ ### Examples
113
+
114
+ | Illuminated | Not Illuminated |
115
+ |:-----------:|:---------------:|
116
+ | ![illuminated](illuminated_0.jpg) | ![not illuminated](not_illuminated_0.jpg) |
117
+ | | |
118
+
119
+ # Usage
120
+
121
+ ## Python — ONNX local
122
+
123
+ ```bash
124
+ pip install onnxruntime pillow numpy
125
+ ```
126
+
127
+ ```python
128
+ import json
129
+ import numpy as np
130
+ import onnxruntime as ort
131
+ from PIL import Image
132
+ from pathlib import Path
133
+
134
+ run = Path("./mobilenet_v3_large")
135
+
136
+ cfg = json.loads((run / "inference_config.json").read_text())
137
+ pre = json.loads((run / "preprocess.json").read_text())
138
+
139
+ img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"]))
140
+ x = np.asarray(img).astype("float32") / 255.0
141
+ x = (x - np.array(pre["mean"])) / np.array(pre["std"])
142
+ x = x.transpose(2, 0, 1)[None].astype("float32")
143
+
144
+ sess = ort.InferenceSession(str(run / "onnx/model.onnx"))
145
+ logits = sess.run(None, {cfg["input_name"]: x})[0][0]
146
+
147
+ probs = np.exp(logits - logits.max())
148
+ probs = probs / probs.sum()
149
+
150
+ p_illu = float(probs[cfg["positive_index"]])
151
+ label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination"
152
+
153
+ print(label, p_illu)
154
+ ```
155
+
156
+ ## Python — ONNX from Hugging Face
157
+
158
+ ```bash
159
+ pip install huggingface_hub onnxruntime pillow numpy
160
+ ```
161
+
162
+ ```python
163
+ from huggingface_hub import snapshot_download
164
+ from pathlib import Path
165
+
166
+ repo = "lterriel/medieval-illumination-bin-classifier"
167
+ run_name = "final_mobilenetv3_large"
168
+ local_dir = Path(snapshot_download(
169
+ repo_id=repo,
170
+ allow_patterns=[
171
+ f"{run_name}/onnx/model.onnx",
172
+ f"{run_name}/preprocess.json",
173
+ f"{run_name}/inference_config.json",
174
+ ],
175
+ )) / run_name
176
+ ```
177
+
178
+ Then use the same ONNX code as above, replacing:
179
+
180
+ ```python
181
+ run = Path("./mobilenet_v3_large")
182
+ ```
183
+
184
+ with:
185
+
186
+ ```python
187
+ run = local_dir
188
+ ```
189
+
190
+ ## Python — PyTorch / non-ONNX local
191
+
192
+ ```bash
193
+ pip install torch torchvision pillow numpy
194
+ ```
195
+
196
+ ```python
197
+ import json
198
+ import torch
199
+ import numpy as np
200
+ from PIL import Image
201
+ from pathlib import Path
202
+ from torchvision import models
203
+
204
+ run = Path("./mobilenet_v3_large")
205
+
206
+ cfg = json.loads((run / "inference_config.json").read_text())
207
+ pre = json.loads((run / "preprocess.json").read_text())
208
+
209
+ model = models.mobilenet_v3_large(weights=None)
210
+ model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
211
+ model.load_state_dict(torch.load(run / "checkpoints/best.pt", map_location="cpu"))
212
+ model.eval()
213
+
214
+ img = Image.open("page.jpg").convert("RGB").resize((pre["img_size"], pre["img_size"]))
215
+ x = np.asarray(img).astype("float32") / 255.0
216
+ x = (x - np.array(pre["mean"])) / np.array(pre["std"])
217
+ x = torch.tensor(x.transpose(2, 0, 1)[None]).float()
218
+
219
+ with torch.no_grad():
220
+     logits = model(x)
221
+ probs = torch.softmax(logits, dim=1)[0]
222
+
223
+ p_illu = float(probs[cfg["positive_index"]])
224
+ label = cfg["positive_label"] if p_illu >= cfg["threshold"] else "non_illumination"
225
+
226
+ print(label, p_illu)
227
+ ```
228
+
229
+ For another torchvision architecture, replace the model constructor:
230
+
231
+ - MobileNetV2
232
+
233
+ ```python
234
+ model = models.mobilenet_v2(weights=None)
235
+ model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
236
+ ```
237
+
238
+ - MobileNetV3 (small)
239
+
240
+ ```python
241
+ model = models.mobilenet_v3_small(weights=None)
242
+ model.classifier[-1] = torch.nn.Linear(model.classifier[-1].in_features, 2)
243
+ ```
244
+
245
+ ## Python — PyTorch / non-ONNX from Hugging Face
246
+
247
+ ```bash
248
+ pip install huggingface_hub torch torchvision pillow numpy
249
+ ```
250
+
251
+ ```python
252
+ from huggingface_hub import snapshot_download
253
+ from pathlib import Path
254
+
255
+ repo = "lterriel/medieval-illumination-bin-classifier"
256
+ run_name = "final_mobilenetv3_large"
257
+
258
+ run = Path(snapshot_download(
259
+ repo_id=repo,
260
+ allow_patterns=[
261
+ f"{run_name}/checkpoints/best.pt",
262
+ f"{run_name}/preprocess.json",
263
+ f"{run_name}/inference_config.json",
264
+ ],
265
+ )) / run_name
266
+ ```
267
+
268
+ Then use the same PyTorch code as above.
269
+
270
+ ## JS (HF - ONNX)
271
+
272
+ ```html
273
+ <script src="https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js"></script>
274
+ <input type="file" id="file" accept="image/*">
275
+ <pre id="out"></pre>
276
+
277
+ <script type="module">
278
+ const run = "https://huggingface.co/lterriel/medieval-illumination-bin-classifier/resolve/main/final_mobilenetv3_large";
279
+
280
+ const cfg = await fetch(`${run}/inference_config.json`).then(r => r.json());
281
+ const pre = await fetch(`${run}/preprocess.json`).then(r => r.json());
282
+ const sess = await ort.InferenceSession.create(`${run}/onnx/model.onnx`);
283
+
284
+ function softmax(a) {
285
+ const m = Math.max(...a);
286
+ const e = a.map(x => Math.exp(x - m));
287
+ const s = e.reduce((x, y) => x + y, 0);
288
+ return e.map(x => x / s);
289
+ }
290
+
291
+ async function imageToTensor(file) {
292
+ const img = new Image();
293
+ img.src = URL.createObjectURL(file);
294
+ await img.decode();
295
+
296
+ const size = pre.img_size;
297
+ const canvas = document.createElement("canvas");
298
+ canvas.width = size;
299
+ canvas.height = size;
300
+
301
+ const ctx = canvas.getContext("2d");
302
+ ctx.drawImage(img, 0, 0, size, size);
303
+
304
+ const data = ctx.getImageData(0, 0, size, size).data;
305
+ const x = new Float32Array(1 * 3 * size * size);
306
+
307
+ for (let i = 0, p = 0; i < data.length; i += 4, p++) {
308
+ x[p] = (data[i] / 255 - pre.mean[0]) / pre.std[0];
309
+ x[size * size + p] = (data[i + 1] / 255 - pre.mean[1]) / pre.std[1];
310
+ x[2 * size * size + p] = (data[i + 2] / 255 - pre.mean[2]) / pre.std[2];
311
+ }
312
+
313
+ return new ort.Tensor("float32", x, [1, 3, size, size]);
314
+ }
315
+
316
+ document.querySelector("#file").onchange = async (e) => {
317
+ const tensor = await imageToTensor(e.target.files[0]);
318
+ const res = await sess.run({ [cfg.input_name]: tensor });
319
+
320
+ const logits = Array.from(res[cfg.output_name].data);
321
+ const probs = softmax(logits);
322
+
323
+ const pIllu = probs[cfg.positive_index];
324
+ const label = pIllu >= cfg.threshold ? cfg.positive_label : "non_illumination";
325
+
326
+ document.querySelector("#out").textContent = JSON.stringify({
327
+ label,
328
+ p_illumination: pIllu,
329
+ probs
330
+ }, null, 2);
331
+ };
332
+ </script>
333
+ ```
334
+
335
+ # Training tools
336
+
337
+ All models are finetuned with img-clf-framework, a training framework for binary image classification pipelines. Check the [training repository here]()
338
+
339
+ # Citation
340
+
341
+ If you use these models in your research, please cite:
342
+
343
+ ```bibtex
344
+ @software{terriel_bsicle_2026,
345
+ AUTHOR = {Terriel, Lucas and Jolivet, Vincent},
346
+ TITLE = {{BSICLE}: Binary System for Illuminated Folio Classification with Lightweight Engines},
347
+ YEAR = {2026},
348
+ PUBLISHER = {Hugging Face},
349
+ INSTITUTION = {{École nationale des chartes -- PSL}},
350
+ URL = {https://huggingface.co/ENC-PSL/medieval-illumination-bin-classifier},
351
+ NOTE = {Family of lightweight binary image classification models for detecting illuminated folios in medieval manuscripts, developed in the context of the O.D.I.L. project},
352
+ LICENSE = {apache-2.0},
353
+ VERSION = {0.0.1}
354
+ }
355
+ ```
356
+
357
+ # Funding
358
+
359
+ <div style="display: flex; align-items: center; justify-content: center; text-align: justify; gap: 20px; max-width: 800px; margin: auto;">
360
+ <img src="assets/odil-logo.png" width="200" alt="Logo ODIL" align="left">
361
+ <p style="text-align: justify; margin-top:-20px;">
362
+ <br>
363
+ These models are developed at the [École nationale des chartes – PSL](https://www.chartes.psl.eu/) in the context of the [O.D.I.L. project](https://projet.biblissima.fr/fr/appels-projets/projets-retenus/odil-objet-detection-illuminations).
364
+ </p>
365
+ <br>
366
+ </div>