jasonengage Claude Opus 4.6 (1M context) committed on
Commit b5e53f5 · 1 parent: e1b6f74

Add inference script, model, and project setup


Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Files changed (6)
  1. .gitattributes +1 -0
  2. .gitignore +9 -0
  3. README.md +112 -3
  4. inference.py +164 -0
  5. model.ts +3 -0
  6. requirements.txt +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ model.ts filter=lfs diff=lfs merge=lfs -text
.gitignore ADDED
@@ -0,0 +1,9 @@
+ # Test outputs
+ tests/
+ *.webp
+ *.jpg
+ *.png
+
+ # Environment
+ .env
+ .env.local
README.md CHANGED
@@ -1,3 +1,112 @@
- ---
- license: mit
- ---
+ # Fast Watermark Removal
+
+ A high-performance TorchScript model for removing watermarks from images. The model uses a dual-stage architecture optimized for speed and quality.
+
+ ## Features
+
+ - **Fast inference**: ~500ms per image (RTX 4090)
+ - **High quality**: Preserves image detail while effectively removing watermarks
+ - **Production-ready**: Compiled TorchScript model, no training code needed
+ - **Memory efficient**: Requires 11.5GB VRAM
+
+ ## Limitations
+
+ - **Output resolution**: Limited to a 768px maximum dimension (aspect ratio preserved)
+
+ ## Commercial License
+
+ A commercial license with a **1536px maximum output resolution** is available for production use. The 1536px model maintains identical:
+
+ - VRAM requirements (11.5GB)
+ - Inference times (~500ms)
+ - Image output
+
+ **Contact**: jason@engageify.com for commercial licensing inquiries
+
+ ## Installation
+
+ ### Requirements
+
+ - Python 3.10+
+ - CUDA-capable GPU with 11.5GB+ VRAM
+ - PyTorch 2.0+
+
+ ### Setup
+
+ ```bash
+ # Clone the repository
+ git clone https://huggingface.co/[your-username]/remove-watermarks-fast
+ cd remove-watermarks-fast
+
+ # Install dependencies
+ pip install -r requirements.txt
+ ```
+
+ ## Usage
+
+ ### Single Image
+
+ ```bash
+ python inference.py -i /path/to/watermarked/image.jpg -m model.ts -o output_folder
+ ```
+
+ ### Batch Processing
+
+ ```bash
+ python inference.py -f /path/to/images/folder -m model.ts -o output_folder
+ ```
+
+ ### Arguments
+
+ - `-i, --image`: Path to a single input watermarked image
+ - `-f, --folder`: Path to a folder of watermarked images (processes all .jpg and .webp files)
+ - `-m, --model_path`: Path to the TorchScript model file (default: `model.ts`)
+ - `-o, --output_folder`: Output folder for results (default: `tests`)
+
+ ### Output
+
+ The script saves two files per input:
+
+ 1. **Original image**: Copied to the output folder with its original filename
+ 2. **Clean image**: Saved as WebP with a `-clean.webp` suffix
+
+ Images are automatically resized to maintain aspect ratio while respecting the 768px maximum dimension.
+
+ ## How It Works
+
+ The model uses a two-stage pipeline:
+
+ 1. **Stage 1**: Removes 90-95% of watermarks
+ 2. **Stage 2**: Removes the remaining watermarks
+ 3. **Post-processing**: Automatic resizing to the original aspect ratio (capped at 768px)
+
+ All processing (including resizing and normalization) is performed within the compiled TorchScript model for optimal performance.
+
+ ## Performance
+
+ - **GPU**: NVIDIA RTX 3090 / A6000 or equivalent
+ - **VRAM**: 11.5GB required
+ - **Speed**: ~500ms per image (768px output)
+ - **Batch size**: 1 (optimized for low latency)
+
+ ## Future Improvements
+
+ I'm actively exploring ways to enhance this model's capabilities. If you have suggestions, encounter issues, or are interested in collaborating on improvements, please reach out!
+
+ ## Technical Details
+
+ - **Architecture**: Dual-stage with Swin2 Transformers
+ - **Format**: TorchScript (.ts) compiled model
+ - **Input**: RGB images (any resolution)
+ - **Output**: RGB images (max 768px, aspect ratio preserved)
+ - **Precision**: FP32 with TensorFloat32 matmul on Ampere+ GPUs
+
+ ## License
+
+ This model is provided for **non-commercial research and personal use only**. For commercial applications, please contact jason@engageify.com for licensing options.
+
+ ## Support
+
+ - **Issues**: Open an issue on the Hugging Face repository
+ - **Questions**: jason@engageify.com
+ - **Commercial licensing**: jason@engageify.com
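The 768px cap with preserved aspect ratio described in the README is implemented by `calculate_output_dimensions` in `inference.py`; the logic is pure arithmetic and can be exercised standalone. A sketch (the `max_size=768` default here is for illustration; the script derives the cap from the pipeline's output size at run time):

```python
def calculate_output_dimensions(orig_width, orig_height, max_size=768):
    """Cap the longest side at max_size, preserving aspect ratio; never upscale."""
    # If the image already fits within max_size, keep the original dimensions
    if orig_width <= max_size and orig_height <= max_size:
        return (orig_width, orig_height)

    # Scale down so the longest side equals max_size
    if orig_width >= orig_height:
        return (max_size, int(orig_height * (max_size / orig_width)))
    return (int(orig_width * (max_size / orig_height)), max_size)


print(calculate_output_dimensions(1024, 768))   # landscape, scaled down → (768, 576)
print(calculate_output_dimensions(500, 400))    # already fits, unchanged → (500, 400)
print(calculate_output_dimensions(600, 900))    # portrait, scaled down → (512, 768)
```

Note that `int()` truncates, so the short side can be off by a fraction of a pixel from the exact ratio; this matches the script's behavior.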
inference.py ADDED
@@ -0,0 +1,164 @@
+ import argparse
+ import glob
+ import os
+ import time
+
+ from PIL import Image
+ import torch
+ import torchvision.transforms as T
+
+ # Output resolution is capped at 768px
+
+
+ def parse_args():
+     parser = argparse.ArgumentParser(description="TorchScript pipeline inference for watermark removal")
+     group = parser.add_mutually_exclusive_group(required=True)
+     group.add_argument('-i', '--image', type=str, help="Path to a single input watermarked image")
+     group.add_argument('-f', '--folder', type=str, help="Path to a folder of watermarked images")
+     parser.add_argument('-o', '--output_folder', type=str, default='tests', help="Output folder for original and clean images")
+     parser.add_argument('-m', '--model_path', type=str, default='model.ts', help="Path to the TorchScript pipeline model (.ts file)")
+     return parser.parse_args()
+
+
+ def calculate_output_dimensions(orig_width, orig_height, max_size):
+     """
+     Calculate output dimensions that maintain the original aspect ratio.
+     Caps the longest side at max_size (never upscales beyond the processing size).
+     """
+     # If the image already fits within max_size, keep the original dimensions
+     if orig_width <= max_size and orig_height <= max_size:
+         return (orig_width, orig_height)
+
+     # Scale down to fit within max_size, maintaining aspect ratio
+     if orig_width >= orig_height:
+         output_width = max_size
+         output_height = int(orig_height * (max_size / orig_width))
+     else:
+         output_height = max_size
+         output_width = int(orig_width * (max_size / orig_height))
+
+     return (output_width, output_height)
+
+
+ def load_torchscript_model(model_path):
+     """Load the TorchScript pipeline model."""
+     device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
+
+     print(f"Loading TorchScript pipeline from: {model_path}")
+     model = torch.jit.load(model_path, map_location=device)
+     model.eval()
+
+     return model, device
+
+
+ def process_image(img_path, model, device, output_folder):
+     # Load the image and record its original size
+     img = Image.open(img_path).convert('RGB')
+     orig_width, orig_height = img.size
+
+     base_name = os.path.basename(img_path)
+     print(f" [{base_name}] Original: {orig_width}x{orig_height}", end="")
+
+     # Convert to tensor [1, 3, H, W] in [0, 1] range
+     img_tensor = T.ToTensor()(img).unsqueeze(0).to(device)
+
+     # Inference with the TorchScript pipeline.
+     # Pipeline handles: resize → normalize → model1 → model2 → denormalize → final resize
+     with torch.no_grad():
+         pred_t = model(img_tensor)  # Output: [1, 3, final_size, final_size] in [0, 1]
+
+     # Get the (square) output size from the pipeline
+     _, _, pipeline_size, _ = pred_t.shape
+     print(f" → Pipeline output: {pipeline_size}x{pipeline_size}", end="")
+
+     # Convert tensor to PIL (square output at pipeline_size)
+     pred_img = T.ToPILImage()(pred_t.squeeze(0).cpu())
+
+     # Resize back to the original aspect ratio using LANCZOS (capped at pipeline_size)
+     output_width, output_height = calculate_output_dimensions(orig_width, orig_height, pipeline_size)
+     pred_img = pred_img.resize((output_width, output_height), resample=Image.LANCZOS)
+     print(f" → Output: {output_width}x{output_height}")
+
+     # Determine save paths
+     base_name = os.path.splitext(os.path.basename(img_path))[0]
+     clean_name = f"{base_name}-clean.webp"
+
+     # Create the output folder and save both the original and the clean version
+     os.makedirs(output_folder, exist_ok=True)
+
+     # Save the original in the output folder (keeps the original extension)
+     orig_save_path = os.path.join(output_folder, os.path.basename(img_path))
+     img.save(orig_save_path)
+
+     # Save the clean version (WebP format with -clean suffix)
+     clean_path = os.path.join(output_folder, clean_name)
+     pred_img.save(clean_path, 'WEBP', quality=95)
+
+
+ def main():
+     # Enable TensorFloat32 for faster matmul on Ampere+ GPUs
+     torch.set_float32_matmul_precision('high')
+
+     args = parse_args()
+
+     # Verify the TorchScript model exists
+     if not os.path.exists(args.model_path):
+         print(f"Error: TorchScript model not found: {args.model_path}")
+         return
+
+     print("TorchScript Pipeline Inference")
+     print(f"Model: {args.model_path}")
+     print()
+
+     # Load the TorchScript pipeline once
+     model, device = load_torchscript_model(args.model_path)
+     print(f"Pipeline loaded on {device}")
+     print()
+
+     if args.image:
+         # Single image: save directly in output_folder
+         output_path = args.output_folder
+
+         # Start timing AFTER model loading
+         start_time = time.time()
+
+         process_image(args.image, model, device, output_path)
+         num_images = 1
+     else:
+         # Folder processing: create subfolder {model_name}_{folder_name}_ts
+         model_name = os.path.splitext(os.path.basename(args.model_path))[0]
+         folder_name = os.path.basename(os.path.normpath(args.folder))
+         output_path = os.path.join(args.output_folder, f"{model_name}_{folder_name}_ts")
+
+         print(f"Saving outputs to: {output_path}")
+         print()
+
+         # Collect all JPG/WebP images in the folder
+         images = []
+         for pattern in ('*.jpg', '*.webp'):
+             images.extend(glob.glob(os.path.join(args.folder, pattern)))
+
+         num_images = len(images)
+
+         # Start timing AFTER model loading
+         start_time = time.time()
+
+         for img_path in sorted(images):
+             process_image(img_path, model, device, output_path)
+
+     # Print the total processing time (guard against an empty folder)
+     elapsed_time = time.time() - start_time
+     if num_images:
+         print(f"\nProcessed {num_images} image{'s' if num_images != 1 else ''} in {elapsed_time:.2f} seconds ({elapsed_time / num_images:.2f}s per image)")
+     else:
+         print("\nNo .jpg or .webp images found to process.")
+
+
+ if __name__ == '__main__':
+     main()
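The save-path convention used in `process_image` (the original copied under its own name, the clean result saved as `<stem>-clean.webp`, both in the output folder) can be sketched as a small pure helper; `output_paths` is a hypothetical name for illustration, not a function in the script:

```python
import os


def output_paths(img_path, output_folder):
    """Mirror inference.py's naming: the original keeps its filename;
    the clean result gets a '-clean.webp' suffix, both in output_folder."""
    stem = os.path.splitext(os.path.basename(img_path))[0]
    orig_save_path = os.path.join(output_folder, os.path.basename(img_path))
    clean_path = os.path.join(output_folder, f"{stem}-clean.webp")
    return orig_save_path, clean_path


print(output_paths("/data/photos/cat.jpg", "tests"))
# on POSIX → ('tests/cat.jpg', 'tests/cat-clean.webp')
```

Because the clean file always gets a `.webp` extension, a `cat.jpg` and a `cat.webp` in the same input folder would both map to `cat-clean.webp`; the later one processed wins.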
model.ts ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fd2383e89c2e074035e9b87dd3abc9735a694fe8391d3423715e639200960d41
+ size 2144360989
requirements.txt ADDED
@@ -0,0 +1,3 @@
+ torch>=2.0.0
+ torchvision>=0.15.0
+ Pillow>=9.0.0