# HeavyAE (Heavy AutoEncoder)

HeavyAE is a high-resolution, symmetric convolutional autoencoder optimized for reconstructing mobile-interface screenshots at their native aspect ratio.
## Model Details
- Model Type: Convolutional Autoencoder (Non-Variational)
- Parameters: ~12.6 Million
- Input Resolution: 1600 x 720 (RGB)
- Latent Bottleneck: 8 Channels
- Activation: LeakyReLU (Encoder/Decoder) | Sigmoid (Output)
## Metadata
| Feature | Value |
|---|---|
| Layers | 10 (5 Encoder, 5 Decoder) |
| Target Size | 1600x720 |
| Latent Space | 8x45x100 |
| Format | PyTorch (model.pt) |
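The 8x45x100 latent shape in the table follows directly from the encoder strides used in the inference script below (four stride-2 convolutions with kernel 3 and padding 1, then a stride-1 projection to 8 channels). A quick check:

```python
# Each stride-2 conv (kernel 3, padding 1) computes out = (n + 2*1 - 3) // 2 + 1,
# i.e. it halves the spatial dimension for even n.
def conv_out(n, kernel=3, stride=2, pad=1):
    return (n + 2 * pad - kernel) // stride + 1

h, w = 720, 1600
for _ in range(4):          # four stride-2 encoder stages
    h, w = conv_out(h), conv_out(w)

print((8, h, w))            # → (8, 45, 100), matching the table
```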
## Benchmarks
The following results were obtained from internal testing samples (e.g., UI reconstruction tasks):
- Average Reconstruction Accuracy: 85.41%
- Average MSE: 0.0058
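For scale: MSE is a squared error measure, so its square root gives the typical per-pixel deviation (assuming pixel values scaled to [0, 1], as in the inference script below):

```python
import math

mse = 0.0058                 # reported average MSE, pixels in [0, 1]
rmse = math.sqrt(mse)        # root-mean-square per-pixel deviation
print(f"RMSE ≈ {rmse:.3f}")  # ≈ 0.076, i.e. ~7.6% of the pixel range
```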
### Model Performance Benchmark: AE (10M Parameters)

Architecture: 1600×720 Autoencoder | Bottleneck: 96:1 | Optimization: Secret
| Test Case | Input Resolution | Aspect Ratio | Original Accuracy | Low-Noise Acc | High-Noise Acc | Primary Challenge |
|---|---|---|---|---|---|---|
| IRL Faces (Forest) | ~4k×3k (HQ) | 4:3 | 94.45% | 94.44% | 94.07% | Complex gradients & textures |
| AI Generated Art | 128×128 (LQ) | 1:1 | 96.68% | 96.61% | 95.85% | Upscaling/Interpolation noise |
| Digital Doodle | 720×720 (MD) | 1:1 | 95.91% | 95.90% | 95.71% | Sharp high-contrast edges |
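The 96:1 bottleneck figure quoted above can be reproduced from the shapes in the Metadata table:

```python
input_elems = 3 * 720 * 1600        # RGB pixels at the native input resolution
latent_elems = 8 * 45 * 100         # 8-channel latent from the metadata table

print(input_elems // latent_elems)  # → 96, the 96:1 compression ratio
```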
## Inference & Stress Test (Google Colab)
```python
import torch
import torch.nn as nn
import numpy as np
import requests
from PIL import Image
from torchvision import transforms
import matplotlib.pyplot as plt

class HeavyAE(nn.Module):
    def __init__(self):
        super().__init__()
        # Four stride-2 convs halve each spatial dim; a final stride-1 conv maps to 8 latent channels.
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 128, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 256, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(256, 512, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(512, 1024, 3, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(1024, 8, 3, stride=1, padding=1),
        )
        # Mirror of the encoder: four stride-2 transposed convs restore 720x1600.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(8, 1024, 3, stride=2, padding=1, output_padding=1), nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(1024, 512, 3, stride=2, padding=1, output_padding=1), nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(512, 256, 3, stride=2, padding=1, output_padding=1), nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(256, 128, 3, stride=2, padding=1, output_padding=1), nn.LeakyReLU(0.2),
            nn.ConvTranspose2d(128, 3, 3, stride=1, padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Setup
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = HeavyAE().to(device)

# Download weights
model_url = "https://huggingface.co/Parallax-labs-1/parallax_VISION-ValidPhone/resolve/main/model.pt"
response = requests.get(model_url)
response.raise_for_status()
with open("model.pt", "wb") as f:
    f.write(response.content)

model.load_state_dict(torch.load("model.pt", map_location=device))
model.eval()

def test_model(img_path):
    orig = Image.open(img_path).convert("RGB")
    w, h = orig.size
    # Resize takes (height, width): (720, 1600) is the model's native input.
    preprocess = transforms.Compose([transforms.Resize((720, 1600)), transforms.ToTensor()])
    input_t = preprocess(orig).unsqueeze(0).to(device)

    with torch.no_grad():
        recon = model(input_t)
        # Stress tests: reconstruct under additive Gaussian noise.
        noise_l = model(input_t + torch.randn_like(input_t) * 0.05)
        noise_h = model(input_t + torch.randn_like(input_t) * 0.2)

    # Metric: 100 * (1 - mean absolute error), pixels in [0, 1].
    def acc(a, b):
        return (1 - torch.mean(torch.abs(a - b)).item()) * 100

    print(f"--- Log ---\nOriginal Accuracy: {acc(input_t, recon):.2f}%")
    print(f"Low-Noise Accuracy: {acc(input_t, noise_l):.2f}%")
    print(f"High-Noise Accuracy: {acc(input_t, noise_h):.2f}%")

    # Output images: reconstruction resized back to the source resolution, plus an error map.
    res = transforms.ToPILImage()(recon.squeeze().cpu()).resize((w, h))
    diff = np.abs(np.array(orig).astype(float) - np.array(res).astype(float)).astype(np.uint8)

    fig, ax = plt.subplots(1, 3, figsize=(18, 6))
    ax[0].imshow(orig); ax[0].set_title("Input")
    ax[1].imshow(res); ax[1].set_title("Reconstruction")
    ax[2].imshow(diff); ax[2].set_title("Error Map")
    for a in ax:
        a.axis("off")
    plt.show()
```
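The `acc` helper above scores reconstructions as 100·(1 − mean absolute error) with pixels in [0, 1], which gives a direct sense of what the benchmark percentages mean per pixel:

```python
def acc_from_mae(mae: float) -> float:
    # the script's metric: 100 * (1 - mean absolute error)
    return (1.0 - mae) * 100.0

# A reconstruction off by 0.05 on every pixel scores 95%, so the 94-96%
# benchmark rows imply a mean per-pixel error of roughly 0.04-0.06.
print(f"{acc_from_mae(0.05):.1f}")  # → 95.0
```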