Paper: Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset (arXiv:2403.09029)
This model is designed to take screenshots of web pages as input and generate the corresponding HTML code using Tailwind CSS utilities. It is intended for developers looking to automate the conversion of UI designs or existing web pages into clean, functional code.
The model was fine-tuned on the HuggingFaceM4/WebSight (v0.2) dataset.
| Parameter | Value |
|---|---|
| LoRA Rank (r) | 64 |
| LoRA Alpha | 64 |
| Optimizer | AdamW (8-bit) |
| Learning Rate | 2e-4 |
| Batch Size | 1 (Per device) |
| Gradient Accumulation | 8 |
| Max Steps | 100 |
| Precision | 4-bit Quantization (NormalFloat4) |
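
A few derived quantities follow from the table and are worth spelling out. The sketch below computes the effective batch size, an upper bound on the number of training examples seen, and the LoRA scaling factor. It assumes a single training device and the standard LoRA scaling of alpha / r (not the rank-stabilized variant); the card states neither, and the variable names are illustrative only.

```python
# Values taken from the hyperparameter table.
lora_rank = 64
lora_alpha = 64
per_device_batch_size = 1
gradient_accumulation_steps = 8
max_steps = 100

# Assumption: one training device (the card does not state the device count).
num_devices = 1

# Number of examples contributing to each optimizer step.
effective_batch_size = per_device_batch_size * gradient_accumulation_steps * num_devices

# Upper bound on the number of examples seen during the whole run.
examples_seen = effective_batch_size * max_steps

# Assumption: standard LoRA scaling, where the adapter update is multiplied by alpha / r.
lora_scaling = lora_alpha / lora_rank

print(effective_batch_size, examples_seen, lora_scaling)
```

Under these assumptions each optimizer step averages gradients over 8 examples, the run sees at most 800 examples, and the adapter update is applied with a scaling factor of 1.0.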
Frameworks Used:
text-custom-500).

max_seq_length limit.

To use this model for inference, use the following code snippet:
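```python
from unsloth import FastLanguageModel  # import unsloth before transformers
from transformers import AutoProcessor
from PIL import Image

# Load the fine-tuned model in 4-bit to reduce memory use.
model, tokenizer = FastLanguageModel.from_pretrained("saadxsalman/LFM-WebSight-Tailwind", load_in_4bit=True)
# The processor prepares the screenshot and prompt for the model.
processor = AutoProcessor.from_pretrained("saadxsalman/LFM-WebSight-Tailwind")

# Inference logic goes here
```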
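
Once inference logic is in place, the decoded output will likely need light post-processing before it can be saved or previewed. The following is a minimal sketch, assuming the decoded text is a plain Python string that may carry stray text around the HTML document. The names `extract_html`, `ClassCollector`, `collect_classes`, and `sample_output` are hypothetical helpers for illustration, not part of this model's or Unsloth's API, and the sample string is hand-written rather than real model output.

```python
import re
from html.parser import HTMLParser


def extract_html(generated_text):
    """Keep the span from the first <!DOCTYPE or <html to the last </html>.

    Falls back to the stripped input when no such markers are found.
    """
    start_match = re.search(r"<!doctype|<html", generated_text, re.IGNORECASE)
    start = start_match.start() if start_match else 0
    end_matches = list(re.finditer(r"</html\s*>", generated_text, re.IGNORECASE))
    end = end_matches[-1].end() if end_matches else len(generated_text)
    if end <= start:
        # A closing tag before the opening one is not a usable boundary.
        end = len(generated_text)
    return generated_text[start:end].strip()


class ClassCollector(HTMLParser):
    """Collect every token found in class attributes."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def collect_classes(html):
    """Return the sorted list of class names used in the markup."""
    collector = ClassCollector()
    collector.feed(html)
    collector.close()
    return sorted(collector.classes)


# Hand-written stand-in for decoded model output (an assumption, not real output).
sample_output = (
    "Here is the page:\n"
    "<!DOCTYPE html>\n<html><body>"
    '<h1 class="text-2xl font-bold">Title</h1>'
    '<p class="text-gray-700">Hello</p>'
    "</body></html>\nDone."
)

html = extract_html(sample_output)
print(collect_classes(html))  # ['font-bold', 'text-2xl', 'text-gray-700']
```

Listing the class names makes it easier to review which utilities the generated page relies on before rendering it.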
If you use this model, please cite the original WebSight technical report:
```bibtex
@misc{laurençon2024unlocking,
  title={Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset},
  author={Hugo Laurençon and Léo Tronchon and Victor Sanh},
  year={2024},
  eprint={2403.09029},
  archivePrefix={arXiv},
  primaryClass={cs.HC}
}
```