---

library_name: onnxruntime
pipeline_tag: object-detection
license: mit
base_model: ds4sd/docling-models
tags:
  - table-structure-recognition
  - tableformer
  - docling
  - onnx
  - stepcache
  - kv-cache
---


# Docling TableFormer v1 — ONNX stepcache export

ONNX export of [Docling](https://github.com/DS4SD/docling)'s TableFormer v1 structure recognizer, **split into encoder, step-cached decoder, and bbox-head sub-graphs**, so the autoregressive decoder can run one step at a time with a KV cache from Python, without pulling in the Docling runtime.

## Why stepcache?

Docling's stock decoder runs the full sequence on every call. For desktop CPU inference, caching K/V across decoder steps amortizes that cost. This export materializes the pattern at the ONNX level, so onnxruntime (or any other ONNX-compatible runtime) can run it without custom Docling code.
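To make the saving concrete, here is a back-of-the-envelope count. It assumes a decoder without a cache reprocesses the whole generated prefix at every step, while a cached decoder only processes the newest token. The function names and the step count are illustrative and are not taken from this export or its config.

```python
def token_passes_without_cache(num_steps: int) -> int:
    """Tokens processed when step t re-runs the whole prefix of length t."""
    return sum(range(1, num_steps + 1))  # 1 + 2 + ... + num_steps


def token_passes_with_cache(num_steps: int) -> int:
    """Tokens processed when each step only runs the newest token."""
    return num_steps


steps = 512  # hypothetical output length, not a value from the model config
print(token_passes_without_cache(steps), token_passes_with_cache(steps))
```

This only counts how many token positions pass through the decoder layers. Attention over the cached keys and values still grows with sequence length, so the real speedup depends on the model and hardware.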

## Files (`docling_tableformer_v1_stepcache_onnx/`)

| File | Role |
|---|---|
| `docling_v1_encoder.onnx` | Encodes the cropped table image once |
| `docling_v1_decoder_step.onnx` | One decoder step; consumes encoder features + previous KV |
| `docling_v1_bbox_head.onnx` | Maps decoder hidden states to per-cell bboxes |
| `vocab.json`, `tableformer_config.json` | Tokenizer + model config |

## Loading

```python
import onnxruntime as ort
from huggingface_hub import snapshot_download

local = snapshot_download("welcomyou/docling-tableformer-v1-onnx-stepcache", local_dir="models")
sub = f"{local}/docling_tableformer_v1_stepcache_onnx"

encoder = ort.InferenceSession(f"{sub}/docling_v1_encoder.onnx")
decoder = ort.InferenceSession(f"{sub}/docling_v1_decoder_step.onnx")
bbox    = ort.InferenceSession(f"{sub}/docling_v1_bbox_head.onnx")
```
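This card does not list the tensor names or shapes of the three graphs, so it may help to read them from the loaded sessions before wiring anything together. The helper below is a small sketch; `describe_io` is a hypothetical name, and it relies only on `get_inputs()` and `get_outputs()`, which onnxruntime sessions provide.

```python
def describe_io(session) -> dict:
    """Collect (name, shape, type) for every input and output of a session."""
    return {
        "inputs": [(a.name, a.shape, a.type) for a in session.get_inputs()],
        "outputs": [(a.name, a.shape, a.type) for a in session.get_outputs()],
    }


# Usage with the sessions loaded above:
#   for label, sess in [("encoder", encoder), ("decoder", decoder), ("bbox", bbox)]:
#       print(label, describe_io(sess))
```

Dynamic dimensions typically show up as strings or `None` in the reported shapes rather than as integers.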

A reference Python loop that ties these three sessions into a stepcache decoder lives at [train-convert/docling-tableformer-v1/convert/onnx_stepcache_runner_reference.py](https://github.com/welcomyou/scanindex/blob/main/train-convert/docling-tableformer-v1/convert/onnx_stepcache_runner_reference.py).
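For orientation, the general shape of such a loop can be sketched without committing to this export's tensor names. Everything here is an assumption for illustration: `greedy_stepcache_decode`, `encode`, and `step` are hypothetical names, greedy decoding is assumed, and the bbox head is left out. The reference runner remains the authoritative version.

```python
from typing import Any, Callable, List, Tuple


def greedy_stepcache_decode(
    encode: Callable[[Any], Any],
    step: Callable[[Any, int, Any], Tuple[int, Any]],
    image: Any,
    start_id: int,
    end_id: int,
    max_steps: int,
) -> List[int]:
    """Run the encoder once, then feed one token per step while carrying the cache."""
    features = encode(image)  # the encoder runs a single time per table crop
    cache = None              # no cached K/V exists before the first step
    token = start_id
    tokens: List[int] = []
    for _ in range(max_steps):
        # Each step sees only the newest token plus the cache from earlier steps.
        token, cache = step(features, token, cache)
        if token == end_id:
            break
        tokens.append(token)
    return tokens
```

In a real runner, `encode` and `step` would wrap `encoder.run(None, {...})` and `decoder.run(None, {...})`, with feed names read from the sessions themselves, and `step` would pick the next token from the decoder's output.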

## Re-export reproduction

See [train-convert/docling-tableformer-v1/convert/export_docling_v1_tableformer_stepcache_onnx.py](https://github.com/welcomyou/scanindex/blob/main/train-convert/docling-tableformer-v1/convert/export_docling_v1_tableformer_stepcache_onnx.py).

## License

MIT, inherited from Docling.