Tara Models
ONNX models for Tara, a browser-native visual document translation app.
Models
| Model | File | Size | Source | Changes |
|---|---|---|---|---|
| RT-DETR-v2 (detector) | detector.onnx | 161 MB | ogkalu/comic-text-and-bubble-detector | AveragePool nodes patched from ceil_mode=1 to ceil_mode=0 for WebGPU compatibility |
| LaMa (inpainter) | lama-manga-dynamic.onnx | 197 MB | ogkalu/lama-manga-onnx-dynamic | None (original model) |
Why this repo?
The original detector.onnx uses AveragePool nodes with ceil_mode=1, which the WebGPU backend of onnxruntime-web does not support. This repo hosts a patched version in which ceil_mode is set to 0. Because every AveragePool node uses a 2x2 kernel with stride 2 on even-dimension inputs (640x640), the pooling windows tile the input exactly. Floor and ceil rounding therefore produce the same output shape, and the patch has no effect on the model's output.
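The rounding argument can be checked with a little arithmetic. The sketch below is illustrative only: `pooledSize` is a hypothetical helper (not part of Tara or onnxruntime-web) that applies the ONNX pooling output-shape formula along one axis, assuming a dilation of 1 and taking the total padding as an explicit parameter.

```typescript
// Hypothetical helper: output length of a pooling op along one axis,
// following the ONNX shape formula with dilation = 1.
// `padTotal` is the sum of the padding at both ends of the axis.
function pooledSize(
  input: number,
  kernel: number,
  stride: number,
  padTotal: number,
  ceilMode: boolean,
): number {
  const steps = (input + padTotal - kernel) / stride;
  return (ceilMode ? Math.ceil(steps) : Math.floor(steps)) + 1;
}

// For a 2x2 kernel with stride 2 and no padding, (n - 2) / 2 is an integer
// whenever n is even, so both rounding modes agree.
for (const n of [640, 320, 160, 80]) {
  const floorSize = pooledSize(n, 2, 2, 0, false);
  const ceilSize = pooledSize(n, 2, 2, 0, true);
  console.log(n, floorSize, ceilSize, floorSize === ceilSize);
}

// An odd size is where the two modes would diverge.
console.log(pooledSize(7, 2, 2, 0, false), pooledSize(7, 2, 2, 0, true));
```

The sizes in the loop are example values, not a dump of the actual feature-map shapes in detector.onnx. The last line shows the odd-size case, where ceil_mode=1 adds one extra partial window that ceil_mode=0 drops.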
The LaMa inpainter works as-is and is included here for convenience (single download source).
Usage
Tara loads these models through the browser Cache API. On the first visit, the app downloads both models (~360 MB total) and caches them locally. Subsequent visits read the models from the cache without downloading them again.
Direct download URLs:
- https://huggingface.co/QuatZo/tara-models/resolve/main/detector.onnx
- https://huggingface.co/QuatZo/tara-models/resolve/main/lama-manga-dynamic.onnx
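The download-then-cache flow described above might look roughly like the following. This is a minimal sketch, not Tara's actual loader: `loadModel`, the `CacheLike` interface, and the cache name `tara-models` used in the usage note are all assumptions made for illustration. The cache and fetch function are passed in as parameters so the logic can also be exercised outside a browser.

```typescript
// Minimal subset of the browser Cache interface that this sketch relies on.
// The real Cache object returned by caches.open() satisfies it.
interface CacheLike {
  match(url: string): Promise<Response | undefined>;
  put(url: string, response: Response): Promise<void>;
}

// Hypothetical loader: return the model bytes from the cache when present,
// otherwise download them once and store a copy for later visits.
async function loadModel(
  url: string,
  cache: CacheLike,
  fetchFn: (url: string) => Promise<Response> = fetch,
): Promise<ArrayBuffer> {
  const cached = await cache.match(url);
  if (cached) {
    return cached.arrayBuffer();
  }

  const response = await fetchFn(url);
  if (!response.ok) {
    throw new Error(`Download failed (${response.status}) for ${url}`);
  }

  // A Response body can be read only once, so cache a clone
  // and read the original.
  await cache.put(url, response.clone());
  return response.arrayBuffer();
}
```

In a browser, this could be called as `loadModel(url, await caches.open("tara-models"))` with one of the URLs listed above, and the resulting bytes could then be wrapped in a `Uint8Array` and handed to onnxruntime-web's `InferenceSession.create`. Browsers may evict cached data under storage pressure, so a loader along these lines should be prepared to download again.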
License
Original models by ogkalu. See the source repositories for license information.