| ---
|
| library_name: onnxruntime
|
| pipeline_tag: object-detection
|
| license: mit
|
| base_model: ds4sd/docling-models
|
| tags:
|
| - table-structure-recognition
|
| - tableformer
|
| - docling
|
| - onnx
|
| - stepcache
|
| - kv-cache
|
| ---
|
|
|
| # Docling TableFormer v1 — ONNX stepcache export
|
|
|
| ONNX export of [Docling](https://github.com/DS4SD/docling)'s TableFormer v1 structure recognizer, **split into encoder, step-cached decoder, and bbox-head sub-graphs**, so the autoregressive decoder can run one step at a time with a KV cache from Python, without pulling in the Docling runtime.
|
|
|
| ## Why stepcache?
|
|
|
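| Docling's stock decoder runs the full sequence per call, so the work for earlier tokens is repeated at every step. For desktop CPU inference you want to cache K/V across decoder steps, so that each step only computes keys and values for the newest token and the per-step cost is amortized. This export materializes that pattern at the ONNX level, so ONNX Runtime (or any other ONNX-compatible runtime) handles it without custom Docling code.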
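
The caching idea can be illustrated with a toy example. The block below is a minimal sketch, not TableFormer's decoder: it assumes a single attention head with random weights in pure Python, and every name in it (`project`, `attend`, `Wq`, `Wk`, `Wv`) is made up for illustration. It only shows that appending the newest token's key and value to a cache gives the same attention output as re-projecting the whole prefix, with far fewer projections.

```python
import math
import random


def project(x, W):
    """Multiply row vector x by matrix W (W is a list of rows)."""
    return [sum(xi * W[i][j] for i, xi in enumerate(x)) for j in range(len(W[0]))]


def attend(q, keys, values):
    """Single-query scaled dot-product attention over the given keys/values."""
    scale = math.sqrt(len(q))
    scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
    top = max(scores)  # subtract the max for a numerically stable softmax
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))]


rng = random.Random(0)
d, steps = 4, 6
# Random stand-ins for learned projection matrices (illustrative only).
Wq, Wk, Wv = ([[rng.gauss(0, 1) for _ in range(d)] for _ in range(d)] for _ in range(3))
# Random stand-ins for the embedded decoder inputs, one row per step.
tokens = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(steps)]

# Without a cache: every step re-projects the whole prefix into K and V.
full, full_projections = [], 0
for t in range(1, steps + 1):
    prefix = tokens[:t]
    keys = [project(x, Wk) for x in prefix]
    values = [project(x, Wv) for x in prefix]
    full_projections += 2 * t
    full.append(attend(project(prefix[-1], Wq), keys, values))

# With a cache: project only the newest token and append it to K and V.
key_cache, value_cache = [], []
cached, cached_projections = [], 0
for x in tokens:
    key_cache.append(project(x, Wk))
    value_cache.append(project(x, Wv))
    cached_projections += 2
    cached.append(attend(project(x, Wq), key_cache, value_cache))

outputs_match = all(
    math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12)
    for row_full, row_cached in zip(full, cached)
    for a, b in zip(row_full, row_cached)
)
print(outputs_match, full_projections, cached_projections)
```

Without the cache, the number of key/value projections grows quadratically with sequence length. With the cache it grows linearly, which is the saving the step-cached decoder graph is meant to provide.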
|
|
|
| ## Files (`docling_tableformer_v1_stepcache_onnx/`)
|
|
|
| | File | Role |
|
| |---|---|
|
| | `docling_v1_encoder.onnx` | Encodes the cropped table image once |
|
| | `docling_v1_decoder_step.onnx` | One decoder step; consumes encoder features + previous KV |
|
| | `docling_v1_bbox_head.onnx` | Maps decoder hidden states to per-cell bboxes |
|
| | `vocab.json`, `tableformer_config.json` | Tokenizer + model config |
|
|
|
| ## Loading
|
|
|
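| ```python
|
| import onnxruntime as ort
|
| from huggingface_hub import snapshot_download
|
| # Download the repository snapshot into ./models and get its local path.
|
| local = snapshot_download("welcomyou/docling-tableformer-v1-onnx-stepcache", local_dir="models")
|
| sub = f"{local}/docling_tableformer_v1_stepcache_onnx"
|
| # One session per sub-graph; CPU execution is requested explicitly.
|
| providers = ["CPUExecutionProvider"]
|
| encoder = ort.InferenceSession(f"{sub}/docling_v1_encoder.onnx", providers=providers)
|
| decoder = ort.InferenceSession(f"{sub}/docling_v1_decoder_step.onnx", providers=providers)
|
| bbox = ort.InferenceSession(f"{sub}/docling_v1_bbox_head.onnx", providers=providers)
|
| ```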
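
This card does not list the tensor names and shapes of the three graphs, so it is safer to read them from the loaded sessions than to guess them. The helper below is a small sketch: `describe_io` is a hypothetical name, and it relies only on the `get_inputs()` / `get_outputs()` methods of an onnxruntime session and the `name`, `type`, and `shape` attributes of the entries they return.

```python
def describe_io(session):
    """Return (kind, name, type, shape) tuples for a session's inputs and outputs."""
    rows = []
    for kind, nodes in (("input", session.get_inputs()),
                        ("output", session.get_outputs())):
        for node in nodes:
            # Dynamic dimensions appear as strings or None inside the shape list.
            rows.append((kind, node.name, node.type, list(node.shape)))
    return rows
```

Calling `describe_io(decoder)` on the session created above should show which inputs carry the encoder features and the previous KV cache, and which outputs return the updated cache.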
|
|
|
| A reference Python loop that ties these three sessions into a stepcache decoder lives at [train-convert/docling-tableformer-v1/convert/onnx_stepcache_runner_reference.py](https://github.com/welcomyou/scanindex/blob/main/train-convert/docling-tableformer-v1/convert/onnx_stepcache_runner_reference.py).
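
For orientation, the general shape of such a loop can be sketched independently of the actual graphs. This is not the linked reference runner: `greedy_decode`, `step_fn`, and `toy_step` are hypothetical names. In a real runner, `step_fn` would presumably wrap `decoder.run(...)`, feed it the encoder features and the cache tensors, and pick the next token from the logits. The tensor names for that call have to be read from the model itself.

```python
def greedy_decode(step_fn, start_token, end_token, max_steps):
    """Greedy autoregressive loop.

    step_fn(token, cache) must return (next_token, new_cache). The cache is
    opaque to this loop; it is passed back into the next step unchanged.
    """
    tokens, cache, token = [], None, start_token
    for _ in range(max_steps):
        token, cache = step_fn(token, cache)
        if token == end_token:
            break
        tokens.append(token)
    return tokens


def toy_step(token, cache):
    """Stand-in for one decoder step: records its input and emits token + 1."""
    cache = (cache or []) + [token]
    return token + 1, cache


result = greedy_decode(toy_step, start_token=0, end_token=4, max_steps=10)
print(result)  # [1, 2, 3]
```

Keeping the cache opaque to the loop means the same driver works whether the step function returns a single tensor or one K/V pair per layer.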
|
|
|
| ## Re-export reproduction
|
|
|
| To reproduce the export, see [train-convert/docling-tableformer-v1/convert/export_docling_v1_tableformer_stepcache_onnx.py](https://github.com/welcomyou/scanindex/blob/main/train-convert/docling-tableformer-v1/convert/export_docling_v1_tableformer_stepcache_onnx.py).
|
|
|
| ## License
|
|
|
| MIT, inherited from Docling.
|
|
|