jasonengage Claude Opus 4.6 (1M context) committed on
Commit b5e53f5 · 1 parent: e1b6f74

Add inference script, model, and project setup


Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Files changed (6)
  1. .gitattributes +1 -0
  2. .gitignore +9 -0
  3. README.md +112 -3
  4. inference.py +164 -0
  5. model.ts +3 -0
  6. requirements.txt +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ model.ts filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,9 @@
+ # Test outputs
+ tests/
+ *.webp
+ *.jpg
+ *.png
+
+ # Environment
+ .env
+ .env.local
README.md CHANGED
@@ -1,3 +1,112 @@
- ---
- license: mit
- ---
+ # Fast Watermark Removal
+
+ A high-performance TorchScript model for removing watermarks from images. The model uses a dual-stage architecture optimized for speed and quality.
+
+ ## Features
+
+ - **Fast inference**: ~500ms per image (RTX 4090)
+ - **High quality**: Preserves image detail while effectively removing watermarks
+ - **Production-ready**: Compiled TorchScript model, no training code needed
+ - **Memory efficient**: Requires 11.5GB VRAM
+
+ ## Limitations
+
+ - **Output resolution**: Limited to a 768px maximum dimension (aspect ratio preserved)
+
+ ## Commercial License
+
+ A commercial license with a **1536px maximum output resolution** is available for production use. The 1536px model maintains identical:
+
+ - VRAM requirements (11.5GB)
+ - Inference times (~500ms)
+ - Image output
+
+ **Contact**: jason@engageify.com for commercial licensing inquiries
+
+ ## Installation
+
+ ### Requirements
+
+ - Python 3.10+
+ - CUDA-capable GPU with 11.5GB+ VRAM
+ - PyTorch 2.0+
+
+ ### Setup
+
+ ```bash
+ # Clone the repository
+ git clone https://huggingface.co/[your-username]/remove-watermarks-fast
+ cd remove-watermarks-fast
+
+ # Install dependencies
+ pip install -r requirements.txt
+ ```
+
+ ## Usage
+
+ ### Single Image
+
+ ```bash
+ python inference.py -i /path/to/watermarked/image.jpg -m model.ts -o output_folder
+ ```
+
+ ### Batch Processing
+
+ ```bash
+ python inference.py -f /path/to/images/folder -m model.ts -o output_folder
+ ```
+
+ ### Arguments
+
+ - `-i, --image`: Path to a single input watermarked image
+ - `-f, --folder`: Path to a folder of watermarked images (processes all .jpg and .webp files)
+ - `-m, --model_path`: Path to the TorchScript model file (default: `model.ts`)
+ - `-o, --output_folder`: Output folder for results (default: `tests`)
+
+ ### Output
+
+ The script saves two files per input:
+
+ 1. **Original image**: Copied to the output folder with its original filename
+ 2. **Clean image**: Saved as WebP with a `-clean.webp` suffix
+
+ Images are automatically resized to maintain aspect ratio while respecting the 768px maximum dimension.
+
+ ## How It Works
+
+ The model uses a two-stage pipeline:
+
+ 1. **Stage 1**: Removes 90-95% of watermarks
+ 2. **Stage 2**: Removes the remaining watermarks
+ 3. **Post-processing**: Automatic resizing to the original aspect ratio (capped at 768px)
+
+ All processing (including resizing and normalization) is performed within the compiled TorchScript model for optimal performance.
+
+ ## Performance
+
+ - **GPU**: NVIDIA RTX 3090 / A6000 or equivalent
+ - **VRAM**: 11.5GB required
+ - **Speed**: ~500ms per image (768px output)
+ - **Batch size**: 1 (optimized for low latency)
+
+ ## Future Improvements
+
+ I'm actively exploring ways to enhance this model's capabilities. If you have suggestions, encounter issues, or are interested in collaborating on improvements, please reach out!
+
+ ## Technical Details
+
+ - **Architecture**: Dual-stage with Swin2 Transformers
+ - **Format**: TorchScript (.ts) compiled model
+ - **Input**: RGB images (any resolution)
+ - **Output**: RGB images (max 768px, aspect ratio preserved)
+ - **Precision**: FP32 with TensorFloat32 matmul on Ampere+ GPUs
+
+ ## License
+
+ This model is provided for **non-commercial research and personal use only**. For commercial applications, please contact jason@engageify.com for licensing options.
+
+ ## Support
+
+ - **Issues**: Open an issue on the Hugging Face repository
+ - **Questions**: jason@engageify.com
+ - **Commercial licensing**: jason@engageify.com
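The 768px cap with preserved aspect ratio described in the README is implemented by `calculate_output_dimensions` in `inference.py`; the logic is pure arithmetic and can be exercised standalone. A sketch (the `max_size=768` default here is for illustration; the script derives the cap from the pipeline's output size at run time):

```python
def calculate_output_dimensions(orig_width, orig_height, max_size=768):
    """Cap the longest side at max_size, preserving aspect ratio; never upscale."""
    # If the image already fits within max_size, keep the original dimensions
    if orig_width <= max_size and orig_height <= max_size:
        return (orig_width, orig_height)

    # Scale down so the longest side equals max_size
    if orig_width >= orig_height:
        return (max_size, int(orig_height * (max_size / orig_width)))
    return (int(orig_width * (max_size / orig_height)), max_size)


print(calculate_output_dimensions(1024, 768))   # landscape, scaled down → (768, 576)
print(calculate_output_dimensions(500, 400))    # already fits, unchanged → (500, 400)
print(calculate_output_dimensions(600, 900))    # portrait, scaled down → (512, 768)
```

Note that `int()` truncates, so the short side can be off by a fraction of a pixel from the exact ratio; this matches the script's behavior.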
inference.py ADDED
@@ -0,0 +1,164 @@
+ import argparse
+ import glob
+ import os
+ import time
+
+ from PIL import Image
+ import torch
+ import torchvision.transforms as T
+
+ # Output resolution is capped at 768px
+
+
+ def parse_args():
+     parser = argparse.ArgumentParser(description="TorchScript pipeline inference for watermark removal")
+     group = parser.add_mutually_exclusive_group(required=True)
+     group.add_argument('-i', '--image', type=str, help="Path to a single input watermarked image")
+     group.add_argument('-f', '--folder', type=str, help="Path to a folder of watermarked images")
+     parser.add_argument('-o', '--output_folder', type=str, default='tests', help="Output folder for original and clean images")
+     parser.add_argument('-m', '--model_path', type=str, default='model.ts', help="Path to the TorchScript pipeline model (.ts file)")
+     return parser.parse_args()
+
+
+ def calculate_output_dimensions(orig_width, orig_height, max_size):
+     """
+     Calculate output dimensions that maintain the original aspect ratio.
+     Caps the longest side at max_size (never upscales beyond the processing size).
+     """
+     # If the image already fits within max_size, keep the original dimensions
+     if orig_width <= max_size and orig_height <= max_size:
+         return (orig_width, orig_height)
+
+     # Scale down to fit within max_size, maintaining aspect ratio
+     if orig_width >= orig_height:
+         output_width = max_size
+         output_height = int(orig_height * (max_size / orig_width))
+     else:
+         output_height = max_size
+         output_width = int(orig_width * (max_size / orig_height))
+
+     return (output_width, output_height)
+
+
+ def load_torchscript_model(model_path):
+     """Load the TorchScript pipeline model."""
+     device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+
+     print(f"Loading TorchScript pipeline from: {model_path}")
+     model = torch.jit.load(model_path, map_location=device)
+     model.eval()
+
+     return model, device
+
+
+ def process_image(img_path, model, device, output_folder):
+     # Load the image and record its original size
+     img = Image.open(img_path).convert('RGB')
+     orig_width, orig_height = img.size
+
+     base_name = os.path.basename(img_path)
+     print(f" [{base_name}] Original: {orig_width}x{orig_height}", end="")
+
+     # Convert to tensor [1, 3, H, W] in [0, 1] range
+     img_tensor = T.ToTensor()(img).unsqueeze(0).to(device)
+
+     # Inference with the TorchScript pipeline.
+     # Pipeline handles: resize → normalize → model1 → model2 → denormalize → final resize
+     with torch.no_grad():
+         pred_t = model(img_tensor)  # Output: [1, 3, final_size, final_size] in [0, 1]
+
+     # Get the (square) output size from the pipeline
+     _, _, pipeline_size, _ = pred_t.shape
+     print(f" → Pipeline output: {pipeline_size}x{pipeline_size}", end="")
+
+     # Convert tensor to PIL (square output at pipeline_size)
+     pred_img = T.ToPILImage()(pred_t.squeeze(0).cpu())
+
+     # Resize back to the original aspect ratio using LANCZOS (capped at pipeline_size)
+     output_width, output_height = calculate_output_dimensions(orig_width, orig_height, pipeline_size)
+     pred_img = pred_img.resize((output_width, output_height), resample=Image.LANCZOS)
+     print(f" → Output: {output_width}x{output_height}")
+
+     # Determine save paths
+     base_name = os.path.splitext(os.path.basename(img_path))[0]
+     clean_name = f"{base_name}-clean.webp"
+
+     # Create the output folder and save both the original and the clean version
+     os.makedirs(output_folder, exist_ok=True)
+
+     # Save the original in the output folder (keeps the original extension)
+     orig_save_path = os.path.join(output_folder, os.path.basename(img_path))
+     img.save(orig_save_path)
+
+     # Save the clean version (WebP format with -clean suffix)
+     clean_path = os.path.join(output_folder, clean_name)
+     pred_img.save(clean_path, 'WEBP', quality=95)
+
+
+ def main():
+     # Enable TensorFloat32 for faster matmul on Ampere+ GPUs
+     torch.set_float32_matmul_precision('high')
+
+     args = parse_args()
+
+     # Verify the TorchScript model exists
+     if not os.path.exists(args.model_path):
+         print(f"Error: TorchScript model not found: {args.model_path}")
+         return
+
+     print("TorchScript Pipeline Inference")
+     print(f"Model: {args.model_path}")
+     print()
+
+     # Load the TorchScript pipeline once
+     model, device = load_torchscript_model(args.model_path)
+     print(f"Pipeline loaded on {device}")
+     print()
+
+     if args.image:
+         # Single image: save directly in output_folder
+         output_path = args.output_folder
+
+         # Start timing AFTER model loading
+         start_time = time.time()
+
+         process_image(args.image, model, device, output_path)
+         num_images = 1
+     else:
+         # Folder processing: create subfolder {model_name}_{folder_name}_ts
+         model_name = os.path.splitext(os.path.basename(args.model_path))[0]
+         folder_name = os.path.basename(os.path.normpath(args.folder))
+         output_path = os.path.join(args.output_folder, f"{model_name}_{folder_name}_ts")
+
+         print(f"Saving outputs to: {output_path}")
+         print()
+
+         # Collect all JPG/WebP images in the folder
+         images = []
+         for pattern in ('*.jpg', '*.webp'):
+             images.extend(glob.glob(os.path.join(args.folder, pattern)))
+
+         num_images = len(images)
+
+         # Start timing AFTER model loading
+         start_time = time.time()
+
+         for img_path in sorted(images):
+             process_image(img_path, model, device, output_path)
+
+     # Print the total processing time (guard against an empty folder)
+     elapsed_time = time.time() - start_time
+     if num_images:
+         print(f"\nProcessed {num_images} image{'s' if num_images != 1 else ''} in {elapsed_time:.2f} seconds ({elapsed_time / num_images:.2f}s per image)")
+     else:
+         print("\nNo .jpg or .webp images found to process.")
+
+
+ if __name__ == '__main__':
+     main()
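The save-path convention used in `process_image` (the original copied under its own name, the clean result saved as `<stem>-clean.webp`, both in the output folder) can be sketched as a small pure helper; `output_paths` is a hypothetical name for illustration, not a function in the script:

```python
import os


def output_paths(img_path, output_folder):
    """Mirror inference.py's naming: the original keeps its filename;
    the clean result gets a '-clean.webp' suffix, both in output_folder."""
    stem = os.path.splitext(os.path.basename(img_path))[0]
    orig_save_path = os.path.join(output_folder, os.path.basename(img_path))
    clean_path = os.path.join(output_folder, f"{stem}-clean.webp")
    return orig_save_path, clean_path


print(output_paths("/data/photos/cat.jpg", "tests"))
# on POSIX → ('tests/cat.jpg', 'tests/cat-clean.webp')
```

Because the clean file always gets a `.webp` extension, a `cat.jpg` and a `cat.webp` in the same input folder would both map to `cat-clean.webp`; the later one processed wins.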
model.ts ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fd2383e89c2e074035e9b87dd3abc9735a694fe8391d3423715e639200960d41
+ size 2144360989
requirements.txt ADDED
@@ -0,0 +1,3 @@
+ torch>=2.0.0
+ torchvision>=0.15.0
+ Pillow>=9.0.0