Update README.md
README.md
CHANGED
|
@@ -1,3 +1,91 @@
|
|
| 1 |
---
|
| 2 |
license: apache-2.0
|
| 3 |
---
|
| 1 |
---
|
| 2 |
license: apache-2.0
|
| 3 |
+
library_name: onnx
|
| 4 |
+
tags:
|
| 5 |
+
- image-segmentation
|
| 6 |
+
- salient-object-detection
|
| 7 |
+
- background-removal
|
| 8 |
+
- u2net
|
| 9 |
+
- onnx
|
| 10 |
+
base_model: xuebinqin/U-2-Net
|
| 11 |
+
pipeline_tag: image-segmentation
|
| 12 |
+
language:
|
| 13 |
+
- en
|
| 14 |
---
|
| 15 |
+
|
| 16 |
+
# U²-Net — Salient Object Segmentation (ONNX)
|
| 17 |
+
|
| 18 |
+
ONNX checkpoints of [xuebinqin/U-2-Net](https://github.com/xuebinqin/U-2-Net), a nested U-structure network for salient-object detection. The model is trained to separate the "main subject" of an image from the background. Pair the output mask with `image_cutout()` for background removal, or with `apply_colormap()` to visualize saliency.
|
| 19 |
+
|
| 20 |
+
Not converted locally — these are the official ONNX checkpoints, republished by [danielgatis/rembg](https://github.com/danielgatis/rembg) in a convenient release.
|
| 21 |
+
|
| 22 |
+
Credit: Xuebin Qin, Zichen Zhang, Chenyang Huang, Masood Dehghan, Osmar R. Zaiane, Martin Jagersand — University of Alberta (*"U²-Net: Going Deeper with Nested U-Structure for Salient Object Detection"*, Pattern Recognition 2020).
|
| 23 |
+
|
| 24 |
+
## What this repo contains
|
| 25 |
+
|
| 26 |
+
| File | Params | Size | Use |
|
| 27 |
+
|---|---|---|---|
|
| 28 |
+
| `u2netp.onnx` | ~1.1M | ~4.7 MB | **Recommended default.** Lightweight variant, suited to CPU, mobile, and edge devices |
|
| 29 |
+
| `u2net.onnx` | ~44M | ~170 MB | Full network: sharper edges on hair, fur, lace, thin structures |
|
| 30 |
+
|
| 31 |
+
Both files share the same input/output tensor signature, so inference code is identical — you can swap variants without rewriting anything.
|
| 32 |
+
|
| 33 |
+
## Input / output
|
| 34 |
+
|
| 35 |
+
| | Spec |
|
| 36 |
+
|---|---|
|
| 37 |
+
| Input name | `input.1` (verify in Netron) |
|
| 38 |
+
| Input shape | `[1, 3, 320, 320]` (NCHW) |
|
| 39 |
+
| Input dtype | float32 |
|
| 40 |
+
| Input color order | **RGB** |
|
| 41 |
+
| Preprocessing | Resize to 320×320, scale to `[0,1]`, normalize with ImageNet stats: `mean=[0.485, 0.456, 0.406]`, `std=[0.229, 0.224, 0.225]` |
|
| 42 |
+
| Outputs | 7 tensors: `d0`..`d6`. **`d0` is the final fused mask.** `d1`..`d6` are side outputs from progressively coarser decoder stages, each upsampled to the input resolution and used as intermediate supervision during training; ignore them at inference. |
|
| 43 |
+
| Output shape (per map) | `[1, 1, 320, 320]` |
|
| 44 |
+
| Output meaning | Per-pixel saliency in `[0, 1]` — higher = more likely to be the subject. Threshold (typically ~0.5) for a binary mask, or use raw values as a soft alpha. |
|
| 45 |
+
|
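The two ways of consuming the saliency map described in the table (a hard threshold or a soft alpha) can be sketched as follows. This is a minimal illustration, not part of the upstream code: the helper names `to_binary_mask` and `to_soft_alpha` are hypothetical, and the small array stands in for a real `d0` output.

```python
import numpy as np

def to_binary_mask(saliency, threshold=0.5):
    """Return a uint8 mask (0 or 255) from a float saliency map in [0, 1]."""
    return np.where(saliency >= threshold, 255, 0).astype(np.uint8)

def to_soft_alpha(saliency):
    """Scale a float saliency map in [0, 1] to a uint8 alpha channel."""
    return (np.clip(saliency, 0.0, 1.0) * 255).round().astype(np.uint8)

# Synthetic stand-in for a d0 saliency map (assumption: values already in [0, 1])
saliency = np.array([[0.2, 0.5], [0.8, 1.0]], dtype=np.float32)

binary = to_binary_mask(saliency)  # hard foreground/background decision
soft = to_soft_alpha(saliency)     # keeps partial transparency at edges
```

A binary mask suits tasks that need a clean foreground/background split, while the soft alpha tends to look better on hair and other semi-transparent edges.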
| 46 |
+
## How to use
|
| 47 |
+
|
| 48 |
+
```python
|
| 49 |
+
import onnxruntime as ort
|
| 50 |
+
import numpy as np
|
| 51 |
+
from PIL import Image
|
| 52 |
+
|
| 53 |
+
sess = ort.InferenceSession("u2netp.onnx") # or "u2net.onnx" — same signature
|
| 54 |
+
|
| 55 |
+
# Remember the original size so we can resize the mask back at the end
|
| 56 |
+
orig = Image.open("photo.jpg").convert("RGB")
|
| 57 |
+
W, H = orig.size
|
| 58 |
+
|
| 59 |
+
# Preprocess
|
| 60 |
+
img = orig.resize((320, 320), Image.BILINEAR)
|
| 61 |
+
arr = np.asarray(img, dtype=np.float32) / 255.0
|
| 62 |
+
arr = (arr - [0.485, 0.456, 0.406]) / [0.229, 0.224, 0.225]
|
| 63 |
+
arr = arr.transpose(2, 0, 1)[None, ...].astype(np.float32)
|
| 64 |
+
|
| 65 |
+
# Inference — outputs is a list of 7 tensors; d0 is index 0
|
| 66 |
+
outputs = sess.run(None, {sess.get_inputs()[0].name: arr})
|
| 67 |
+
d0 = outputs[0][0, 0] # 320x320 saliency
|
| 68 |
+
|
| 69 |
+
# Min-max normalize so the mask spans the full [0, 1] range
|
| 70 |
+
d0 = (d0 - d0.min()) / (d0.max() - d0.min() + 1e-8)
|
| 71 |
+
|
| 72 |
+
# Resize mask back to original image dimensions
|
| 73 |
+
mask = Image.fromarray((d0 * 255).astype(np.uint8)).resize((W, H), Image.BILINEAR)
|
| 74 |
+
```
|
| 75 |
+
|
| 76 |
+
For background removal: apply `mask` as the alpha channel to the original RGB image (RGBA cutout).
|
| 77 |
+
|
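One possible way to build the RGBA cutout with Pillow is sketched below. The helper name `cutout` is hypothetical, and the tiny synthetic image and mask stand in for `orig` and `mask` from the snippet above so the example runs without a model file.

```python
import numpy as np
from PIL import Image

def cutout(rgb_image, mask):
    """Return an RGBA copy of rgb_image that uses mask as its alpha channel."""
    if mask.size != rgb_image.size:
        # The alpha channel must match the image dimensions
        mask = mask.resize(rgb_image.size, Image.BILINEAR)
    rgba = rgb_image.convert("RGBA")
    rgba.putalpha(mask.convert("L"))
    return rgba

# Synthetic stand-ins: a 4x2 solid-color image whose left half is "subject"
orig = Image.new("RGB", (4, 2), (200, 100, 50))
mask_arr = np.zeros((2, 4), dtype=np.uint8)
mask_arr[:, :2] = 255
mask = Image.fromarray(mask_arr)

result = cutout(orig, mask)
# result.save("cutout.png")  # save as PNG; JPEG would discard the alpha channel
```

Pixels where the mask is 255 stay fully opaque, pixels where it is 0 become fully transparent, and intermediate values give partial transparency.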
| 78 |
+
## Which one should I use?
|
| 79 |
+
|
| 80 |
+
- **`u2netp`** is the right default: 4.7 MB on disk, ~30 ms per image on CPU (this varies with hardware), and mask quality good enough for >90% of background-removal and saliency-mapping use cases. It also loads almost instantly.
|
| 81 |
+
- **`u2net`** earns its disk + latency cost on **fine-edge** subjects: hair, fur, lace, complex foliage, transparent objects. If the lite variant's edges look "blocky" on your inputs, the full model is the upgrade.
|
| 82 |
+
|
| 83 |
+
For interactive segmentation (clicks / boxes / prompts), use [MobileSAM](https://huggingface.co/Heliosoph/sam-onnx) instead; U²-Net is automatic and does not accept prompts.
|
| 84 |
+
|
| 85 |
+
## Excluded variant
|
| 86 |
+
|
| 87 |
+
The original [xuebinqin/U-2-Net](https://github.com/xuebinqin/U-2-Net) repo also ships a third checkpoint called `u2net_portrait` (line-drawing portrait sketches). It's **deliberately not bundled here** — it was trained on the APDrawing dataset, which carries non-commercial restrictions that would taint the otherwise-clean Apache-2.0 status of this bundle. If you need it, grab it directly from the upstream repo and read the dataset terms first.
|
| 88 |
+
|
| 89 |
+
## License
|
| 90 |
+
|
| 91 |
+
**Apache-2.0** — same as the upstream [xuebinqin/U-2-Net](https://github.com/xuebinqin/U-2-Net) repo. `LICENSE` file included. The danielgatis/rembg release just bundles the original weights; no relicensing occurred.
|