welcomyou/docling-tableformer-v1-onnx-stepcache


welcomyou committed 2 days ago

Commit 7af3196 · verified · 1 Parent(s): 4da48c1

docs: model card

Files changed (1)

README.md +52 -0

README.md ADDED

	@@ -0,0 +1,52 @@

+---
+library_name: onnxruntime
+pipeline_tag: object-detection
+license: mit
+base_model: ds4sd/docling-models
+tags:
+  - table-structure-recognition
+  - tableformer
+  - docling
+  - onnx
+  - stepcache
+  - kv-cache
+---
+# Docling TableFormer v1 — ONNX stepcache export
+ONNX export of [Docling](https://github.com/DS4SD/docling)'s TableFormer v1 structure recognizer, **split into encoder + step-cached decoder + bbox-head sub-graphs**, so the autoregressive decoder can run one step at a time with a KV cache from plain Python, without pulling in the Docling runtime.
+## Why stepcache?
+Docling's stock decoder re-runs the full token sequence on every call. For desktop CPU inference it is cheaper to cache the attention keys and values (K/V) across decoder steps, so each step only processes the newest token instead of repeating earlier work. This export builds that pattern into the ONNX graphs themselves, so onnxruntime (or any other ONNX runtime) can run it without custom Docling code.
+## Files (`docling_tableformer_v1_stepcache_onnx/`)
+| File | Role |
+|---|---|
+| `docling_v1_encoder.onnx` | Encodes the cropped table image once |
+| `docling_v1_decoder_step.onnx` | One decoder step; consumes encoder features + previous KV |
+| `docling_v1_bbox_head.onnx` | Maps decoder hidden states to per-cell bboxes |
+| `vocab.json`, `tableformer_config.json` | Tokenizer + model config |
+## Loading
+```python
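+import onnxruntime as ort
+from huggingface_hub import snapshot_download
+# Download the repository snapshot into ./models; the call returns the local path.
+local = snapshot_download("welcomyou/docling-tableformer-v1-onnx-stepcache", local_dir="models")
+sub = f"{local}/docling_tableformer_v1_stepcache_onnx"
+# One session per sub-graph: the encoder runs once per table image,
+# the decoder step runs once per generated token, and the bbox head maps hidden states to boxes.
+encoder = ort.InferenceSession(f"{sub}/docling_v1_encoder.onnx")
+decoder = ort.InferenceSession(f"{sub}/docling_v1_decoder_step.onnx")
+bbox    = ort.InferenceSession(f"{sub}/docling_v1_bbox_head.onnx")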
+```
+A reference Python loop that ties these three sessions into a stepcache decoder lives at [train-convert/docling-tableformer-v1/convert/onnx_stepcache_runner_reference.py](https://github.com/welcomyou/scanindex/blob/main/train-convert/docling-tableformer-v1/convert/onnx_stepcache_runner_reference.py).
+## Reproducing the export
+The export script is [train-convert/docling-tableformer-v1/convert/export_docling_v1_tableformer_stepcache_onnx.py](https://github.com/welcomyou/scanindex/blob/main/train-convert/docling-tableformer-v1/convert/export_docling_v1_tableformer_stepcache_onnx.py).
+## License
+MIT, inherited from Docling.
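
The K/V caching described in the README's "Why stepcache?" section can be illustrated without the model files. The sketch below is a toy single-head attention in pure Python, not TableFormer's decoder: the weights `W_Q`, `W_K`, `W_V`, the input vectors, and every function name are made up for illustration. It compares recomputing keys and values for the whole prefix at every step against appending one key/value pair per step, and checks that both give the same outputs.

```python
import math


def project(x, matrix):
    """Multiply row vector x by a matrix given as a list of rows."""
    return [sum(xi * row[j] for xi, row in zip(x, matrix))
            for j in range(len(matrix[0]))]


def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(len(values[0]))]


# Hypothetical 2x2 projection weights, chosen only for the demonstration.
W_Q = [[0.3, 0.1], [0.0, 0.6]]
W_K = [[0.5, -0.2], [0.1, 0.3]]
W_V = [[0.2, 0.7], [-0.4, 0.1]]


def decode_full(tokens):
    """No cache: rebuild K and V for the whole prefix at every step."""
    outputs = []
    for t in range(1, len(tokens) + 1):
        prefix = tokens[:t]
        keys = [project(x, W_K) for x in prefix]
        values = [project(x, W_V) for x in prefix]
        outputs.append(attend(project(prefix[-1], W_Q), keys, values))
    return outputs


def decode_cached(tokens):
    """With cache: project only the newest token and append to stored K/V."""
    keys, values, outputs = [], [], []
    for x in tokens:
        keys.append(project(x, W_K))
        values.append(project(x, W_V))
        outputs.append(attend(project(x, W_Q), keys, values))
    return outputs


TOKENS = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [-1.0, 2.0]]
full = decode_full(TOKENS)
cached = decode_cached(TOKENS)
max_gap = max(abs(a - b)
              for f_out, c_out in zip(full, cached)
              for a, b in zip(f_out, c_out))
print(f"steps: {len(cached)}, max difference: {max_gap:.2e}")
```

In this toy, the uncached loop performs n(n+1)/2 key and value projections for n steps, while the cached loop performs n. The per-step decoder graph in this export follows the same idea, since it takes the previous KV as an input alongside the encoder features.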