Commit b5e53f5
Parent(s): e1b6f74

Add inference script, model, and project setup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- .gitattributes +1 -0
- .gitignore +9 -0
- README.md +112 -3
- inference.py +164 -0
- model.ts +3 -0
- requirements.txt +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+model.ts filter=lfs diff=lfs merge=lfs -text
.gitignore
ADDED
@@ -0,0 +1,9 @@
+# Test outputs
+tests/
+*.webp
+*.jpg
+*.png
+
+# Environment
+.env
+.env.local
README.md
CHANGED
@@ -1,3 +1,112 @@
-
-
-
+# Fast Watermark Removal
+
+A high-performance TorchScript model for removing watermarks from images. This model uses a dual-stage architecture optimized for speed and quality.
+
+## Features
+
+- **Fast inference**: ~500ms per image (RTX 4090)
+- **High quality**: Preserves image details while effectively removing watermarks
+- **Production-ready**: Compiled TorchScript model, no training code needed
+- **Memory efficient**: Requires 11.5GB VRAM
+
+## Limitations
+
+- **Output resolution**: Limited to 768px maximum dimension (aspect ratio preserved)
+
+## Commercial License
+
+A commercial license with **1536px maximum output resolution** is available for production use. The 1536px model maintains identical:
+
+- VRAM requirements (11.5GB)
+- Inference times (~500ms)
+- Image output quality
+
+**Contact**: jason@engageify.com for commercial licensing inquiries
+
+## Installation
+
+### Requirements
+
+- Python 3.10+
+- CUDA-capable GPU with 11.5GB+ VRAM
+- PyTorch 2.0+
+
+### Setup
+
+```bash
+# Clone the repository
+git clone https://huggingface.co/[your-username]/remove-watermarks-fast
+cd remove-watermarks-fast
+
+# Install dependencies
+pip install -r requirements.txt
+```
+
+## Usage
+
+### Single Image
+
+```bash
+python inference.py -i /path/to/watermarked/image.jpg -m model.ts -o output_folder
+```
+
+### Batch Processing
+
+```bash
+python inference.py -f /path/to/images/folder -m model.ts -o output_folder
+```
+
+### Arguments
+
+- `-i, --image`: Path to a single input watermarked image
+- `-f, --folder`: Path to a folder of watermarked images (processes all `.jpg` and `.webp` files)
+- `-m, --model_path`: Path to the TorchScript model file (default: `model.ts`)
+- `-o, --output_folder`: Output folder for results (default: `tests`)
+
+### Output
+
+The script saves two files per input:
+
+1. **Original image**: Copied to the output folder with its original filename
+2. **Clean image**: Saved as WebP with a `-clean.webp` suffix
+
+Images are automatically resized to maintain aspect ratio while respecting the 768px maximum dimension.
+
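The resize rule above fits in a few lines of pure Python. This is an illustrative restatement of the cap logic (the name `fit_within` is hypothetical; the script implements the same rule in `calculate_output_dimensions`):

```python
def fit_within(width, height, max_size=768):
    """Scale (width, height) down to fit within max_size, preserving
    aspect ratio. Images already inside the cap pass through unchanged
    (the pipeline never upscales beyond its processing size)."""
    if width <= max_size and height <= max_size:
        return (width, height)
    if width >= height:
        return (max_size, int(height * (max_size / width)))
    return (int(width * (max_size / height)), max_size)

print(fit_within(1536, 1024))  # → (768, 512): landscape, halved to the cap
print(fit_within(640, 480))    # → (640, 480): already within the cap
```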
+## How It Works
+
+The model uses a two-stage pipeline:
+
+1. **Stage 1**: Removes 90-95% of watermarks
+2. **Stage 2**: Removes the remaining watermarks
+3. **Post-processing**: Automatic resizing back to the original aspect ratio (capped at 768px)
+
+All processing (including resizing and normalization) is performed within the compiled TorchScript model for optimal performance.
+
+## Performance
+
+- **GPU**: NVIDIA RTX 3090 / A6000 or equivalent
+- **VRAM**: 11.5GB required
+- **Speed**: ~500ms per image (768px output)
+- **Batch size**: 1 (optimized for low latency)
+
+## Future Improvements
+
+I'm actively exploring ways to enhance this model's capabilities. If you have suggestions, encounter issues, or are interested in collaborating on improvements, please reach out!
+
+## Technical Details
+
+- **Architecture**: Dual-stage with Swin2 Transformers
+- **Format**: TorchScript (.ts) compiled model
+- **Input**: RGB images (any resolution)
+- **Output**: RGB images (max 768px, aspect ratio preserved)
+- **Precision**: FP32 with TensorFloat32 matmul on Ampere+ GPUs
+
+## License
+
+This model is provided for **non-commercial research and personal use only**. For commercial applications, please email jason@engageify.com for licensing options.
+
+## Support
+
+- **Issues**: Open an issue on the HuggingFace repository
+- **Questions**: jason@engageify.com
+- **Commercial licensing**: jason@engageify.com
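The "everything in the graph" packaging described under How It Works can be sketched as a single scriptable module. This is a toy illustration with `nn.Identity` stand-ins for the two stages (the shipped model uses Swin2 Transformers, per Technical Details), not the actual architecture:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStagePipeline(nn.Module):
    """Sketch of a TorchScript-compilable pipeline: resize and
    normalization live inside the graph, followed by two cleanup
    stages and denormalization. Stages here are placeholders."""
    def __init__(self, stage1: nn.Module, stage2: nn.Module, size: int = 768):
        super().__init__()
        self.stage1 = stage1
        self.stage2 = stage2
        self.size = size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # In-graph resize to the square processing resolution
        x = F.interpolate(x, size=[self.size, self.size],
                          mode='bilinear', align_corners=False)
        x = (x - 0.5) / 0.5                      # normalize to [-1, 1]
        x = self.stage2(self.stage1(x))          # stage 1 → stage 2
        return (x * 0.5 + 0.5).clamp(0.0, 1.0)   # back to [0, 1]

# Compile the whole pipeline, as the README describes for model.ts
pipeline = torch.jit.script(TwoStagePipeline(nn.Identity(), nn.Identity(), size=8))
out = pipeline(torch.rand(1, 3, 20, 30))
print(out.shape)  # torch.Size([1, 3, 8, 8])
```

With identity stages the output equals the resized input, but the structure shows why a single `model(img_tensor)` call suffices in `inference.py` below.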
inference.py
ADDED
@@ -0,0 +1,163 @@
+import argparse
+import os
+import glob
+import time
+from PIL import Image
+import torch
+import torchvision.transforms as T
+
+# Output resolution is capped at 768px
+
+
+def parse_args():
+    parser = argparse.ArgumentParser(description="TorchScript Pipeline Inference for Watermark Removal")
+    group = parser.add_mutually_exclusive_group(required=True)
+    group.add_argument('-i', '--image', type=str, help="Path to single input watermarked image")
+    group.add_argument('-f', '--folder', type=str, help="Path to folder containing watermarked images")
+    parser.add_argument('-o', '--output_folder', type=str, default='tests', help="Output folder to save original and clean images")
+    parser.add_argument('-m', '--model_path', type=str, default='model.ts', help="Path to TorchScript pipeline model (.ts file)")
+    return parser.parse_args()
+
+
+def calculate_output_dimensions(orig_width, orig_height, max_size):
+    """
+    Calculate output dimensions maintaining the original aspect ratio.
+    Caps at max_size (never upscales beyond the processing size).
+    """
+    # If the image fits within max_size, keep original dimensions
+    if orig_width <= max_size and orig_height <= max_size:
+        return (orig_width, orig_height)
+
+    # Scale down to fit within max_size, maintaining aspect ratio
+    if orig_width >= orig_height:
+        output_width = max_size
+        output_height = int(orig_height * (max_size / orig_width))
+    else:
+        output_height = max_size
+        output_width = int(orig_width * (max_size / orig_height))
+
+    return (output_width, output_height)
+
+
+def load_torchscript_model(model_path):
+    """Load the TorchScript pipeline model."""
+    device = torch.device('cuda')
+
+    print(f"Loading TorchScript pipeline from: {model_path}")
+    model = torch.jit.load(model_path, map_location=device)
+    model.eval()
+
+    return model, device
+
+
+def process_image(img_path, model, device, output_folder=None):
+    # Load the image and get its original size
+    img = Image.open(img_path).convert('RGB')
+    orig_width, orig_height = img.size
+
+    base_name = os.path.basename(img_path)
+    print(f" [{base_name}] Original: {orig_width}x{orig_height}", end="")
+
+    # Convert to tensor [1, 3, H, W] in [0, 1] range
+    img_tensor = T.ToTensor()(img).unsqueeze(0).to(device)
+
+    # Inference with the TorchScript pipeline
+    # Pipeline handles: resize → normalize → model1 → model2 → denormalize → final resize
+    with torch.no_grad():
+        pred_t = model(img_tensor)  # Output: [1, 3, final_size, final_size] in [0, 1]
+
+    # Get the output size from the pipeline
+    _, _, pipeline_size, _ = pred_t.shape
+    print(f" → Pipeline output: {pipeline_size}x{pipeline_size}", end="")
+
+    # Convert tensor to PIL (square output at pipeline_size)
+    pred_img = T.ToPILImage()(pred_t.squeeze(0).cpu())
+
+    # Resize back toward original dimensions using PIL LANCZOS (capped at pipeline_size)
+    output_width, output_height = calculate_output_dimensions(orig_width, orig_height, pipeline_size)
+    pred_img = pred_img.resize((output_width, output_height), resample=Image.LANCZOS)
+    print(f" → Resized: {output_width}x{output_height}", end="")
+
+    output_width, output_height = pred_img.size
+    print(f" → Output: {output_width}x{output_height}")
+
+    # Determine save paths
+    base_name = os.path.splitext(os.path.basename(img_path))[0]
+    clean_name = f"{base_name}-clean.webp"
+
+    # Create the output folder and save both original and clean versions
+    os.makedirs(output_folder, exist_ok=True)
+
+    # Save the original in the output folder (keeps the original extension)
+    orig_save_path = os.path.join(output_folder, os.path.basename(img_path))
+    img.save(orig_save_path)
+
+    # Save the clean version (WebP format with -clean suffix)
+    clean_path = os.path.join(output_folder, clean_name)
+    pred_img.save(clean_path, 'WEBP', quality=95)
+
+
+def main():
+    # Enable TensorFloat32 for faster matmul on Ampere+ GPUs
+    torch.set_float32_matmul_precision('high')
+
+    args = parse_args()
+
+    # Verify the TorchScript model exists
+    if not os.path.exists(args.model_path):
+        print(f"Error: TorchScript model not found: {args.model_path}")
+        return
+
+    print("TorchScript Pipeline Inference")
+    print(f"Model: {args.model_path}")
+    print()
+
+    # Load the TorchScript pipeline once
+    model, device = load_torchscript_model(args.model_path)
+    print(f"Pipeline loaded on {device}")
+    print()
+
+    num_images = 0
+
+    # Determine the output folder based on processing mode
+    if args.image:
+        # Single image: save directly in output_folder
+        output_path = args.output_folder
+
+        # Start timing AFTER model loading
+        start_time = time.time()
+
+        process_image(args.image, model, device, output_path)
+        num_images = 1
+    elif args.folder:
+        # Folder processing: create subfolder {model_name}_{folder_name}_ts
+        model_name = os.path.splitext(os.path.basename(args.model_path))[0]
+        folder_name = os.path.basename(os.path.normpath(args.folder))
+        subfolder_name = f"{model_name}_{folder_name}_ts"
+        output_path = os.path.join(args.output_folder, subfolder_name)
+
+        print(f"Saving outputs to: {output_path}")
+        print()
+
+        # Process all JPG/WebP files in the folder
+        patterns = ['*.jpg', '*.webp']
+        images = []
+        for pattern in patterns:
+            images.extend(glob.glob(os.path.join(args.folder, pattern)))
+
+        num_images = len(images)
+
+        # Start timing AFTER model loading
+        start_time = time.time()
+
+        for img_path in sorted(images):
+            process_image(img_path, model, device, output_path)
+
+    # Print total processing time (skip if the folder had no images)
+    if num_images > 0:
+        elapsed_time = time.time() - start_time
+        print(f"\nProcessed {num_images} image{'s' if num_images != 1 else ''} in {elapsed_time:.2f} seconds ({elapsed_time/num_images:.2f}s per image)")
+
+
+if __name__ == '__main__':
+    main()
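The output layout encoded in `main()` and `process_image()` (a `{model_name}_{folder_name}_ts` subfolder in batch mode; a copied original plus a `-clean.webp` per input) can be previewed without a GPU. `plan_outputs` is a hypothetical helper for illustration, not part of the script:

```python
import os

def plan_outputs(model_path, folder, output_folder, image_names):
    """Mirror inference.py's naming scheme: batch outputs land in
    {output_folder}/{model_name}_{folder_name}_ts, and each input
    produces its original filename plus '<stem>-clean.webp'."""
    model_name = os.path.splitext(os.path.basename(model_path))[0]
    folder_name = os.path.basename(os.path.normpath(folder))
    out_dir = os.path.join(output_folder, f"{model_name}_{folder_name}_ts")
    plans = []
    for name in image_names:
        stem = os.path.splitext(name)[0]
        plans.append((os.path.join(out_dir, name),
                      os.path.join(out_dir, f"{stem}-clean.webp")))
    return plans

# e.g. model.ts + photos/ + default output folder 'tests'
print(plan_outputs("model.ts", "photos/", "tests", ["cat.jpg"]))
```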
model.ts
ADDED
@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:fd2383e89c2e074035e9b87dd3abc9735a694fe8391d3423715e639200960d41
+size 2144360989
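`model.ts` is stored via Git LFS, so the committed file is just the three-line pointer above; the ~2.1 GB weights are fetched by `git lfs pull`. The pointer format is plain key/value lines and can be parsed directly (a small illustrative sketch):

```python
def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into a dict of its key/value lines."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    fields["size"] = int(fields["size"])  # size is the byte count of the real blob
    return fields

pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:fd2383e89c2e074035e9b87dd3abc9735a694fe8391d3423715e639200960d41
size 2144360989
"""
info = parse_lfs_pointer(pointer)
print(info["size"] / 1e9)  # size in GB ≈ 2.14
```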
requirements.txt
ADDED
|
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
+
torch>=2.0.0
|
| 2 |
+
torchvision>=0.15.0
|
| 3 |
+
Pillow>=9.0.0
|