---
license: apache-2.0
library_name: onnx
tags:
- image-restoration
- super-resolution
- image-denoising
- swinir
- transformer
- onnx
base_model: JingyunLiang/SwinIR
pipeline_tag: image-to-image
language:
- en
---

# SwinIR: Image Restoration (ONNX)

ONNX exports of [SwinIR](https://github.com/JingyunLiang/SwinIR), the Swin Transformer for Image Restoration. Two variants cover the two most common SwinIR use cases: real-world super-resolution and color denoising.

Re-exported from upstream PyTorch weights. Provenance trail: Liang et al. → JingyunLiang/SwinIR (cloned source) + pinned `.pth` checkpoints from the v0.0 GitHub release → `torch.onnx.export` (one pass per variant) → these files.

Toolchain: `torch 2.4.x` (CUDA 12.4), `timm` latest, `onnx` latest, `onnxruntime>=1.17`, opset 17, `do_constant_folding=True`, `dynamo=False` (forces the legacy TorchScript-based exporter; SwinIR's `.type_as()` buffer coercions trip the dynamo path's name-lineage tracking on torch >=2.5). Full conversion script: [`scripts/export-swinir.ps1`](https://github.com/HeliosophLLC/DatumIngest/blob/main/scripts/export-swinir.ps1) in the DatumIngest repo.

Credit: Jingyun Liang, Jiezhang Cao, Guolei Sun, Kai Zhang, Luc Van Gool, Radu Timofte (ETH Zurich and collaborators). Paper: *"SwinIR: Image Restoration Using Swin Transformer"*, ICCV 2021.

## What this repo contains

| File | Variant | Input → Output | Use |
|---|---|---|---|
| `swinir_realsr_x4.onnx` | SwinIR-L real-SR (4×) | 64×64 RGB → 256×256 RGB | Real-world image super-resolution (handles compression artifacts, sensor noise, mild blur as a side effect). ~110 MB. |
| `swinir_denoising_color_25.onnx` | SwinIR-M color DN | 128×128 RGB → 128×128 RGB | Color denoising at Gaussian noise σ=25, the standard denoising-benchmark reference. ~45 MB. |

Both files share the same general I/O signature (NCHW float32 RGB in `[0, 1]`); only the spatial dims differ.

## Input / output

| | `swinir_realsr_x4.onnx` | `swinir_denoising_color_25.onnx` |
|---|---|---|
| Input name | `image` | `image` |
| Input shape | `[batch, 3, 64, 64]` | `[batch, 3, 128, 128]` |
| Input dtype | float32 | float32 |
| Input range | `[0, 1]` RGB | `[0, 1]` RGB |
| Output name | `upscaled` | `denoised` |
| Output shape | `[batch, 3, 256, 256]` | `[batch, 3, 128, 128]` |
| Dynamic axes | batch only | batch only |

Spatial dims are **fixed** by design: SwinIR's windowed attention is brittle under dynamic H/W in ONNX Runtime's window-shift op. To process larger images, tile the input into 64×64 (SR) or 128×128 (DN) patches with some overlap, run inference per tile, and stitch the outputs.
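The tile-and-stitch approach described above can be sketched as follows. This is a minimal sketch, not the upstream implementation: `run_tiled` and `tile_starts` are hypothetical helper names, the 8-pixel overlap is an arbitrary default, and overlapping regions are blended by plain averaging. The model is passed in as a callable so the same code serves both variants (`tile=64, scale=4` for SR; `tile=128, scale=1` for DN). It assumes the image is at least one tile wide and tall.

```python
import numpy as np

def tile_starts(size, tile, stride):
    """Start offsets whose tiles cover [0, size); the last tile sits flush with the edge."""
    if size <= tile:
        return [0]
    starts = list(range(0, size - tile, stride))
    starts.append(size - tile)
    return starts

def run_tiled(run_tile, image, tile, overlap=8, scale=1):
    """Run a fixed-size model over a larger image and average the overlaps.

    image:    float32 array of shape (3, H, W) in [0, 1], with H, W >= tile.
    run_tile: callable mapping (1, 3, tile, tile) -> (1, 3, tile*scale, tile*scale).
    """
    _, h, w = image.shape
    if h < tile or w < tile:
        raise ValueError("image is smaller than one tile; pad it first")
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    stride = tile - overlap
    out = np.zeros((3, h * scale, w * scale), dtype=np.float32)
    weight = np.zeros((1, h * scale, w * scale), dtype=np.float32)
    for y in tile_starts(h, tile, stride):
        for x in tile_starts(w, tile, stride):
            patch = np.ascontiguousarray(
                image[None, :, y:y + tile, x:x + tile], dtype=np.float32
            )
            result = run_tile(patch)[0]          # (3, tile*scale, tile*scale)
            ys, xs, ts = y * scale, x * scale, tile * scale
            out[:, ys:ys + ts, xs:xs + ts] += result
            weight[:, ys:ys + ts, xs:xs + ts] += 1.0
    return out / weight                          # every pixel is covered at least once

# Hypothetical usage with the SR model (assumes `image` is a (3, H, W) float32 array):
# import onnxruntime as ort
# sess = ort.InferenceSession("swinir_realsr_x4.onnx")
# sr = run_tiled(lambda p: sess.run(None, {"image": p})[0], image, tile=64, scale=4)
```

Uniform averaging keeps the sketch short; a feathered (distance-weighted) blend may hide tile seams better at the cost of a little more code.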

## How to use

```python
import onnxruntime as ort
import numpy as np
from PIL import Image

# Pick the variant
sess = ort.InferenceSession("swinir_denoising_color_25.onnx")
# or:
# sess = ort.InferenceSession("swinir_realsr_x4.onnx")

img = Image.open("noisy.jpg").convert("RGB").resize((128, 128))
arr = np.asarray(img, dtype=np.float32) / 255.0          # HWC, [0,1]
arr = arr.transpose(2, 0, 1)[None, ...]                  # 1x3xHxW

result = sess.run(None, {"image": arr.astype(np.float32)})[0][0]
result = np.clip(result, 0.0, 1.0).transpose(1, 2, 0)    # back to HWC
result_img = Image.fromarray((result * 255).round().astype(np.uint8))  # round, don't truncate
```

For larger images, see the upstream [`main_test_swinir.py`](https://github.com/JingyunLiang/SwinIR/blob/main/main_test_swinir.py) for a reference tiling implementation.

## Which one should I use?

- **`swinir_denoising_color_25.onnx`**: when you specifically want the Gaussian σ=25 reference denoiser (research papers, benchmark reproduction, comparing against other denoisers).
- **`swinir_realsr_x4.onnx`**: when you want 4× super-resolution on real-world photos and don't mind that it also cleans up some noise and compression artifacts in the process.

For **blind real-world denoising** (unknown noise level), [SCUNet](https://huggingface.co/Heliosoph/scunet-onnx) is the better fit: SwinIR's `noise25` variant is trained for a specific noise level and degrades when the input noise pattern differs.
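For benchmark-style checks of the σ=25 variant, a matched test input can be synthesized from a clean image. The sketch below assumes the usual benchmark convention that σ is given on the 0-255 intensity scale, so it becomes 25/255 for data in `[0, 1]`. `add_gaussian_noise` and `psnr` are hypothetical helper names, and leaving the noisy image unclipped is an assumption made so the noise stays exactly Gaussian.

```python
import numpy as np

def add_gaussian_noise(clean, sigma=25.0, seed=0):
    """Add zero-mean Gaussian noise to a [0, 1] image; sigma is on the 0-255 scale."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma / 255.0, size=clean.shape)
    return (clean + noise).astype(np.float32)   # not clipped (assumption, see above)

def psnr(reference, estimate):
    """Peak signal-to-noise ratio in dB for images with a peak value of 1.0."""
    diff = np.asarray(reference, np.float64) - np.asarray(estimate, np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else float(10.0 * np.log10(1.0 / mse))

# Stand-in for a real clean image: (1, 3, 128, 128), float32, values in [0, 1].
clean = np.full((1, 3, 128, 128), 0.5, dtype=np.float32)
noisy = add_gaussian_noise(clean, sigma=25.0)
print(f"noisy input PSNR: {psnr(clean, noisy):.2f} dB")
```

Unclipped σ=25 noise corresponds to roughly 20.2 dB (20·log10(255/25)), which gives a baseline to compare the denoised output against.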

## License

**Apache-2.0**, same as the upstream [`JingyunLiang/SwinIR`](https://github.com/JingyunLiang/SwinIR) repo. `LICENSE` file included.