welcomyou committed on
Commit 7af3196 · verified · 1 Parent(s): 4da48c1

docs: model card

Files changed (1)
  1. README.md +52 -0
README.md ADDED
@@ -0,0 +1,52 @@
+ ---
+ library_name: onnxruntime
+ pipeline_tag: object-detection
+ license: mit
+ base_model: ds4sd/docling-models
+ tags:
+ - table-structure-recognition
+ - tableformer
+ - docling
+ - onnx
+ - stepcache
+ - kv-cache
+ ---
+
+ # Docling TableFormer v1: ONNX stepcache export
+
+ ONNX export of [Docling](https://github.com/DS4SD/docling)'s TableFormer v1 table-structure recognizer, **split into three sub-graphs: encoder, step-cached decoder, and bbox head**. The split lets Python code run the autoregressive decoder one step at a time with a KV cache, without pulling in the Docling runtime.
+
+ ## Why stepcache?
+
+ Docling's stock decoder re-runs the full token sequence on every call. On desktop CPUs it is cheaper to cache the attention keys and values (K/V) across decoder steps, so that each step only processes the newest token. This export builds that pattern into the ONNX graphs, so onnxruntime (or any other ONNX runtime) can run it without custom Docling code.
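
To make the amortization argument concrete, here is a back-of-the-envelope counting sketch. It is not taken from the export: the function names are hypothetical, and it assumes the only cost that matters is one key/value projection per token per step, ignoring constants and cross-attention.

```python
def kv_projections_without_cache(num_steps: int) -> int:
    """Full-sequence decoding: step t re-projects K/V for all t tokens so far."""
    return sum(range(1, num_steps + 1))


def kv_projections_with_cache(num_steps: int) -> int:
    """Step-cached decoding: each step projects K/V for the newest token only."""
    return num_steps


if __name__ == "__main__":
    # Compare the two counts for a few illustrative sequence lengths.
    for n in (16, 128, 512):
        full = kv_projections_without_cache(n)
        cached = kv_projections_with_cache(n)
        print(f"steps={n}: without cache={full}, with cache={cached}")
```

Under this simplified model the uncached count grows quadratically with the number of decoder steps while the cached count grows linearly, which is why the gap widens for tables with many cells.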
+
+ ## Files (`docling_tableformer_v1_stepcache_onnx/`)
+
+ | File | Role |
+ |---|---|
+ | `docling_v1_encoder.onnx` | Encodes the cropped table image once |
+ | `docling_v1_decoder_step.onnx` | One decoder step; consumes encoder features + previous KV |
+ | `docling_v1_bbox_head.onnx` | Maps decoder hidden states to per-cell bboxes |
+ | `vocab.json`, `tableformer_config.json` | Tokenizer + model config |
+
+ ## Loading
+
+ ```python
+ import onnxruntime as ort
+ from huggingface_hub import snapshot_download
+
+ # Download the repository snapshot into ./models and get its local path.
+ local = snapshot_download("welcomyou/docling-tableformer-v1-onnx-stepcache", local_dir="models")
+ sub = f"{local}/docling_tableformer_v1_stepcache_onnx"
+
+ # One session per sub-graph: encoder, single decoder step, bbox head.
+ encoder = ort.InferenceSession(f"{sub}/docling_v1_encoder.onnx")
+ decoder = ort.InferenceSession(f"{sub}/docling_v1_decoder_step.onnx")
+ bbox = ort.InferenceSession(f"{sub}/docling_v1_bbox_head.onnx")
+ ```
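
This card does not list the tensor names or shapes of the three graphs, so it can help to inspect them before wiring anything together. The helper below is a small sketch: `describe_io` is a hypothetical name, it relies only on the `get_inputs()` / `get_outputs()` methods that `onnxruntime.InferenceSession` provides, and the stand-in object in the demo uses made-up tensor names so the snippet runs without the model files.

```python
from types import SimpleNamespace
from typing import Any, Dict, List, Tuple


def describe_io(session: Any) -> Dict[str, List[Tuple[str, Any, str]]]:
    """Return (name, shape, dtype) for every input and output of a session."""
    return {
        "inputs": [(a.name, a.shape, a.type) for a in session.get_inputs()],
        "outputs": [(a.name, a.shape, a.type) for a in session.get_outputs()],
    }


if __name__ == "__main__":
    # Stand-in with invented values; pass a real ort.InferenceSession instead.
    fake_arg = SimpleNamespace(name="example_input", shape=[1, 3, "H", "W"], type="tensor(float)")
    fake_session = SimpleNamespace(get_inputs=lambda: [fake_arg], get_outputs=lambda: [])
    print(describe_io(fake_session))
```

With the sessions loaded as above, `describe_io(encoder)`, `describe_io(decoder)` and `describe_io(bbox)` would show which outputs of one graph feed which inputs of the next.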
+
+ A reference Python loop that ties these three sessions into a stepcache decoder lives at [train-convert/docling-tableformer-v1/convert/onnx_stepcache_runner_reference.py](https://github.com/welcomyou/scanindex/blob/main/train-convert/docling-tableformer-v1/convert/onnx_stepcache_runner_reference.py).
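
For orientation only, the overall shape of such a loop might look like the sketch below. It is not the reference runner: `stepcache_decode`, `encode`, and `decode_step` are hypothetical names, the callables stand in for wrappers around the ONNX sessions, and the real tensor names, cache layout, and stop condition should be taken from the linked script.

```python
from typing import Any, Callable, List, Tuple


def stepcache_decode(
    encode: Callable[[Any], Any],
    decode_step: Callable[[Any, int, Any], Tuple[int, Any, Any]],
    image: Any,
    start_id: int,
    end_id: int,
    max_steps: int,
) -> Tuple[List[int], List[Any]]:
    """Greedy decode: encode once, then feed one token per step plus the cache."""
    memory = encode(image)  # encoder features, computed a single time
    cache = None            # no K/V exists before the first step
    token = start_id
    tokens: List[int] = []
    hiddens: List[Any] = []
    for _ in range(max_steps):
        # Each step sees only the newest token and returns the updated cache.
        next_token, hidden, cache = decode_step(memory, token, cache)
        if next_token == end_id:
            break
        tokens.append(next_token)
        hiddens.append(hidden)
        token = next_token
    return tokens, hiddens


if __name__ == "__main__":
    # Toy step function: emits token + 1 and appends the input token to the cache.
    def toy_step(memory: Any, token: int, cache: Any) -> Tuple[int, Any, Any]:
        return token + 1, f"h{token + 1}", (cache or []) + [token]

    print(stepcache_decode(lambda img: img, toy_step, None, 0, 3, 10))
```

The collected hidden states are the kind of input the bbox head is described as consuming; which of them to pass on, and in what shape, is defined by the reference runner rather than by this sketch.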
+
+ ## Re-export reproduction
+
+ To reproduce the export, see [train-convert/docling-tableformer-v1/convert/export_docling_v1_tableformer_stepcache_onnx.py](https://github.com/welcomyou/scanindex/blob/main/train-convert/docling-tableformer-v1/convert/export_docling_v1_tableformer_stepcache_onnx.py).
+
+ ## License
+
+ MIT, inherited from Docling.