Docling TableFormer v1: ONNX stepcache export
ONNX export of Docling's TableFormer v1 structure recognizer, split into encoder, step-cached decoder, and bbox-head sub-graphs so that the autoregressive decoder can be run one step at a time with a KV cache from Python, without pulling in the Docling runtime.
Why stepcache?
Docling's stock decoder runs the full sequence on every call. For desktop CPU inference, caching the attention keys and values (K/V) across decoder steps amortizes that cost, since each step then only has to process the newest token. This export materializes that pattern at the ONNX level, so onnxruntime (or any other ONNX runtime) can drive it without custom Docling code.
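To make the amortization argument concrete, here is a toy cost model. It is only a sketch: it counts per-token key/value projections and ignores the attention dot products and the cross-attention over encoder features. The function name `projection_counts` is made up for this illustration and is not part of the export.

```python
def projection_counts(num_steps):
    """Count per-token K/V projections for a decode of num_steps tokens.

    Toy cost model only; real decoder cost also includes the attention
    dot products and cross-attention, which are not counted here.
    """
    # Without a cache, step t re-projects all t tokens seen so far.
    without_cache = sum(range(1, num_steps + 1))
    # With a cache, each step projects only the newest token.
    with_cache = num_steps
    return without_cache, with_cache


full, cached = projection_counts(100)
print(full, cached)  # 5050 100
```

Under this model the uncached cost grows quadratically with the sequence length, while the cached cost grows linearly.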
Files (docling_tableformer_v1_stepcache_onnx/)
| File | Role |
|---|---|
| docling_v1_encoder.onnx | Encodes the cropped table image once |
| docling_v1_decoder_step.onnx | One decoder step; consumes encoder features + previous KV |
| docling_v1_bbox_head.onnx | Maps decoder hidden states to per-cell bboxes |
| vocab.json, tableformer_config.json | Tokenizer + model config |
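A quick way to confirm that a download is complete is to check for the files listed above. The helper below is a minimal sketch; `missing_files` and `EXPECTED_FILES` are illustrative names, not something shipped with the export.

```python
from pathlib import Path

# File names taken from the file listing in this README.
EXPECTED_FILES = (
    "docling_v1_encoder.onnx",
    "docling_v1_decoder_step.onnx",
    "docling_v1_bbox_head.onnx",
    "vocab.json",
    "tableformer_config.json",
)


def missing_files(model_dir):
    """Return the expected file names that are absent from model_dir."""
    root = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

For example, `missing_files("models/docling_tableformer_v1_stepcache_onnx")` should return an empty list once the snapshot has been fetched into `models`.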
Loading
import onnxruntime as ort
from huggingface_hub import snapshot_download
local = snapshot_download("welcomyou/docling-tableformer-v1-onnx-stepcache", local_dir="models")
sub = f"{local}/docling_tableformer_v1_stepcache_onnx"
encoder = ort.InferenceSession(f"{sub}/docling_v1_encoder.onnx")
decoder = ort.InferenceSession(f"{sub}/docling_v1_decoder_step.onnx")
bbox = ort.InferenceSession(f"{sub}/docling_v1_bbox_head.onnx")
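This README does not list the tensor names or shapes of the three graphs, so it can help to read them from the sessions before wiring anything together. The sketch below relies only on `get_inputs()` and `get_outputs()`, which onnxruntime sessions provide; the helper name `describe_io` is an illustrative choice.

```python
def describe_io(session):
    """Summarize a session's inputs and outputs as (name, shape, type) tuples.

    Works with any object exposing get_inputs() / get_outputs(), such as an
    onnxruntime.InferenceSession.
    """
    def fmt(nodes):
        return [(node.name, node.shape, node.type) for node in nodes]

    return {
        "inputs": fmt(session.get_inputs()),
        "outputs": fmt(session.get_outputs()),
    }
```

Calling `describe_io(decoder)` on the decoder-step session loaded above should show which inputs carry the encoder features and the previous KV tensors.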
A reference Python loop that ties these three sessions into a stepcache decoder lives at train-convert/docling-tableformer-v1/convert/onnx_stepcache_runner_reference.py.
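The overall shape of such a loop can be sketched independently of the actual tensor names. The code below is an assumption-laden outline, not the reference runner: `step_fn` is a hypothetical wrapper around one call to the decoder-step session, and the cache is treated as an opaque value passed from step to step. Collecting hidden states for the bbox head is omitted.

```python
def greedy_decode(step_fn, start_id, end_id, max_steps):
    """Greedy autoregressive loop that threads a cache through each step.

    step_fn(token_id, cache) -> (scores, new_cache) is a hypothetical
    wrapper around one decoder-step call; scores is an indexable sequence
    of per-token scores over the vocabulary.
    """
    tokens = [start_id]
    cache = None  # no cached K/V before the first step
    for _ in range(max_steps):
        scores, cache = step_fn(tokens[-1], cache)
        # Greedy choice: index of the highest score.
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == end_id:
            break
    return tokens


def toy_step(token_id, cache):
    """Stand-in for the real decoder: always favors token_id + 1."""
    cache = (cache or []) + [token_id]  # pretend to grow a KV cache
    scores = [0.0] * 5
    scores[(token_id + 1) % 5] = 1.0
    return scores, cache


print(greedy_decode(toy_step, start_id=0, end_id=3, max_steps=10))  # [0, 1, 2, 3]
```

In a real run, `step_fn` would feed the encoder features, the last token, and the previous KV tensors to the decoder session and return its outputs; the exact inputs and cache layout are defined by the reference runner, not by this sketch.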
Re-export reproduction
See train-convert/docling-tableformer-v1/convert/export_docling_v1_tableformer_stepcache_onnx.py.
License
MIT, inherited from Docling.
Base model
docling-project/docling-models