rajkr/sdxl-pokemon-lora

@@ -6,31 +6,82 @@ tags:
   - stable-diffusion-xl
   - lora
   - diffusers
 license: openrail++
 datasets:
   - reach-vb/pokemon-blip-captions
 pipeline_tag: text-to-image
 ---
-# SDXL Pokemon LoRA - Text-to-Image Generator
-Fine-tuned SDXL LoRA adapters for generating Pokemon-style images from text prompts.
-## 🚀 Quick Start
 ```python
 from diffusers import AutoPipelineForText2Image
 import torch
-pipeline = AutoPipelineForText2Image.from_pretrained(
     "stabilityai/stable-diffusion-xl-base-1.0",
     torch_dtype=torch.float16,
 ).to("cuda")
-pipeline.load_lora_weights("rajkr/sdxl-pokemon-lora")
-image = pipeline(
-    "a cute fire pokemon with blue flames, highly detailed, anime style",
     num_inference_steps=30,
     guidance_scale=7.5,
 ).images[0]
@@ -38,39 +89,65 @@ image = pipeline(
 image.save("pokemon.png")
 ```
-## Training Details
 | Parameter | Value |
-|-----------|-------|
 | **Base Model** | `stabilityai/stable-diffusion-xl-base-1.0` |
 | **VAE** | `madebyollin/sdxl-vae-fp16-fix` |
 | **Dataset** | `reach-vb/pokemon-blip-captions` |
-| **LoRA Rank** | 16 |
-| **Learning Rate** | 1e-4 |
-| **Epochs** | 3 |
 | **Batch Size** | 2 × 4 (gradient accumulation) |
 | **Resolution** | 1024×1024 |
 | **Mixed Precision** | fp16 |
-| **Min-SNR Gamma** | 5.0 |
-| **Optimizer** | AdamW |
-| **LR Scheduler** | Cosine |
-## Training Script
-The training script (`train_sdxl_lora.py`) is included in this repo. To train:
 ```bash
-pip install torch torchvision diffusers transformers accelerate peft datasets trackio xformers
-accelerate launch train_sdxl_lora.py
 ```
-## Architecture
-This model uses LoRA (Low-Rank Adaptation) to efficiently fine-tune SDXL's attention layers:
-- **Target modules:** `to_k`, `to_q`, `to_v`, `to_out.0`
-- **LoRA rank:** 16 (good balance of capacity vs. parameter efficiency)
-- **Trainable parameters:** ~4.7M (vs ~2.6B full UNet)
-## Demo
-Try the interactive demo: [rajkr/sdxl-image-generator](https://huggingface.co/spaces/rajkr/sdxl-image-generator)

   - stable-diffusion-xl
   - lora
   - diffusers
+  - inference
 license: openrail++
 datasets:
   - reach-vb/pokemon-blip-captions
 pipeline_tag: text-to-image
 ---
+# 🎨 SDXL Text-to-Image Generator
+Generate images from text prompts on a local machine, on Google Colab, or on Kaggle.
+---
+## 🚀 Quick Start — 3 Ways
+### 1️⃣ Local CLI (Like Ollama)
+```bash
+git clone https://huggingface.co/rajkr/sdxl-pokemon-lora
+cd sdxl-pokemon-lora
+pip install -r clients/requirements.txt
+# The first run downloads the model (~7GB); later runs reuse the cache in ~/.cache/huggingface
+python clients/generate.py "a majestic dragon flying over a crystal lake"
+python clients/generate.py "an astronaut riding a horse on Mars" --steps 50 --guidance 8.0 --seed 42
+python clients/generate.py --list-models
+```
+| GPU | Speed | Notes |
+|---|---|---|
+| RTX 4090 (24GB) | ~20s/image | Best |
+| RTX 3090 (24GB) | ~25s/image | Great |
+| Colab T4 (16GB) | ~60s/image | **Free** |
+| Apple M-series | ~5min/image | Slow but works |
+| CPU only | ~10min/image | Very slow |
+### 2️⃣ Google Colab (**FREE GPU**)
+Download this notebook, open it in Colab, and select a GPU runtime:
+[📁 Open Colab Notebook](https://huggingface.co/rajkr/sdxl-pokemon-lora/resolve/main/clients/colab_notebook.ipynb)
+Steps:
+1. Download the notebook from the link above
+2. Upload to [colab.research.google.com](https://colab.research.google.com)
+3. Set GPU: **Runtime → Change runtime type → T4 GPU**
+4. Run all cells. The first run downloads ~7GB; after that, generation is free within Colab's usage limits
+### 3️⃣ Kaggle (**FREE GPU**)
+[📁 Open Kaggle Notebook](https://huggingface.co/rajkr/sdxl-pokemon-lora/resolve/main/clients/kaggle_notebook.ipynb)
+Steps:
+1. Download the notebook
+2. Create a new Kaggle notebook → Upload
+3. Turn on GPU: **Settings → Accelerator → GPU T4**
+4. Run all cells
+---
+## 🛠️ Advanced: Python SDK
 ```python
 from diffusers import AutoPipelineForText2Image
 import torch
+# Downloads once (~7GB), then runs locally
+pipe = AutoPipelineForText2Image.from_pretrained(
     "stabilityai/stable-diffusion-xl-base-1.0",
     torch_dtype=torch.float16,
+    variant="fp16",
 ).to("cuda")
+# Generate
+image = pipe(
+    "a cute fire pokemon with blue flames, anime style",
     num_inference_steps=30,
     guidance_scale=7.5,
 ).images[0]
 image.save("pokemon.png")
 ```
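The snippet above runs the base SDXL checkpoint only. The previous revision of this README attached the adapters with `load_lora_weights("rajkr/sdxl-pokemon-lora")`, and the training table lists `madebyollin/sdxl-vae-fp16-fix` as the VAE. A minimal sketch combining the two follows. It assumes the adapter weights still sit at the root of the repo; the helper names `build_pipeline` and `generate` and the seed handling are illustrative and not part of the repo.

```python
BASE_MODEL = "stabilityai/stable-diffusion-xl-base-1.0"
VAE_MODEL = "madebyollin/sdxl-vae-fp16-fix"  # VAE listed in the training table
LORA_REPO = "rajkr/sdxl-pokemon-lora"        # assumption: adapter weights live at the repo root


def build_pipeline(device="cuda"):
    """Load SDXL base with the fp16-safe VAE, then attach the LoRA adapters."""
    # Imported lazily so the helpers can be defined without torch/diffusers loaded.
    import torch
    from diffusers import AutoencoderKL, AutoPipelineForText2Image

    vae = AutoencoderKL.from_pretrained(VAE_MODEL, torch_dtype=torch.float16)
    pipe = AutoPipelineForText2Image.from_pretrained(
        BASE_MODEL,
        vae=vae,
        torch_dtype=torch.float16,
        variant="fp16",
    ).to(device)
    # Same call the earlier README revision used to apply the adapters.
    pipe.load_lora_weights(LORA_REPO)
    return pipe


def generate(pipe, prompt, seed=42, steps=30, guidance=7.5):
    """Generate one image; a fixed seed makes repeated runs comparable."""
    import torch

    generator = torch.Generator(device="cpu").manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=steps,
        guidance_scale=guidance,
        generator=generator,
    )
    return result.images[0]
```

Calling `generate(build_pipeline(), "a cute fire pokemon with blue flames, anime style").save("pokemon_lora.png")` on a CUDA machine would then produce an image with the adapters applied. Nothing is downloaded until `build_pipeline()` is called.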
+---
+## 📦 Files in this Repo
+| File | Description |
+|---|---|
+| `train_sdxl_lora.py` | Full training script — fine-tune SDXL with LoRA on any image+caption dataset |
+| `clients/generate.py` | CLI tool with an Ollama-style interface: `python clients/generate.py "prompt"` |
+| `clients/requirements.txt` | Dependencies for local setup: `pip install -r clients/requirements.txt` |
+| `clients/colab_notebook.ipynb` | Google Colab notebook (free T4 GPU) |
+| `clients/kaggle_notebook.ipynb` | Kaggle notebook (free T4 GPU) |
+---
+## 🧠 Training Your Own Model
+Fine-tune SDXL with LoRA on your own dataset:
+```bash
+pip install torch torchvision diffusers transformers accelerate peft datasets xformers
+git clone https://huggingface.co/rajkr/sdxl-pokemon-lora
+cd sdxl-pokemon-lora
+accelerate launch train_sdxl_lora.py
+```
+The LoRA weights will be pushed to: `rajkr/sdxl-pokemon-lora`
+### Training Specs
 | Parameter | Value |
+|---|---|
 | **Base Model** | `stabilityai/stable-diffusion-xl-base-1.0` |
 | **VAE** | `madebyollin/sdxl-vae-fp16-fix` |
 | **Dataset** | `reach-vb/pokemon-blip-captions` |
+| **Method** | LoRA (rank=16) |
+| **Trainable Params** | ~4.7M (vs. ~2.6B full UNet) |
 | **Batch Size** | 2 × 4 (gradient accumulation) |
 | **Resolution** | 1024×1024 |
+| **Epochs** | 3 |
+| **LR** | 1e-4 (AdamW, cosine) |
 | **Mixed Precision** | fp16 |
+| **VRAM** | ~20-24GB |
+| **Hardware** | A100, A10G, RTX 4090 |
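The figures in the table can be related with a little arithmetic. The sketch below assumes the standard LoRA parameterization (one rank-r pair of matrices per adapted projection) and counts optimizer steps the way gradient accumulation usually works. The layer widths and the 1,000-example dataset size are placeholders for illustration, not measurements of this model or dataset.

```python
import math


def lora_params(d_in, d_out, rank):
    """Parameters added by one LoRA pair: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def effective_batch_size(per_step_batch, grad_accum_steps, num_devices=1):
    """Samples contributing to each optimizer update."""
    return per_step_batch * grad_accum_steps * num_devices


def optimizer_steps(num_examples, epochs, per_step_batch, grad_accum_steps):
    """Total optimizer updates when gradients are accumulated over micro-batches."""
    micro_batches = math.ceil(num_examples / per_step_batch)
    return epochs * math.ceil(micro_batches / grad_accum_steps)


# Illustrative projection shapes (placeholders, not read from the model):
self_attn = lora_params(640, 640, rank=16)    # square projection, width 640
cross_attn = lora_params(2048, 640, rank=16)  # projection from a 2048-dim text context

batch = effective_batch_size(2, 4)            # "2 x 4" from the table
steps = optimizer_steps(1000, epochs=3, per_step_batch=2, grad_accum_steps=4)

print(self_attn, cross_attn, batch, steps)    # prints: 20480 43008 8 375
```

Summing `lora_params` over every adapted projection gives the trainable-parameter total; the per-layer cost grows linearly with rank, which is why a low rank keeps the adapter far smaller than the full UNet.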
+---
+## 🖼️ Sample Prompts
 ```bash
+python clients/generate.py "a majestic dragon flying over a crystal lake at sunset, epic fantasy art"
+python clients/generate.py "an astronaut riding a horse on Mars, cinematic shot" --steps 50
+python clients/generate.py "a cozy coffee shop interior with rain outside" --model stabilityai/stable-diffusion-2-1
+python clients/generate.py "a futuristic city skyline at night with neon lights" --guidance 10
 ```
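`clients/generate.py` itself is not reproduced here. As a rough illustration of how the flags used above could be parsed with `argparse`, the following sketch is hypothetical: the defaults (30 steps, guidance 7.5, the SDXL base model) are borrowed from the Python snippet in this README and may not match the real script.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags shown in the README examples."""
    parser = argparse.ArgumentParser(
        description="Generate an image from a text prompt (illustrative sketch)."
    )
    # The prompt is optional so that --list-models can be used on its own.
    parser.add_argument("prompt", nargs="?", help="text prompt to render")
    parser.add_argument("--steps", type=int, default=30, help="inference steps")
    parser.add_argument("--guidance", type=float, default=7.5, help="guidance scale")
    parser.add_argument("--seed", type=int, default=None, help="seed for reproducible output")
    parser.add_argument(
        "--model",
        default="stabilityai/stable-diffusion-xl-base-1.0",
        help="model repo id to load",
    )
    parser.add_argument("--list-models", action="store_true", help="list models and exit")
    return parser


def parse_args(argv=None):
    """Parse argv and require a prompt unless --list-models was given."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.list_models and args.prompt is None:
        parser.error("a prompt is required unless --list-models is given")
    return args


args = parse_args(
    ["an astronaut riding a horse on Mars", "--steps", "50", "--guidance", "8.0", "--seed", "42"]
)
print(args.steps, args.guidance, args.seed)  # prints: 50 8.0 42
```

Passing an explicit list to `parse_args` keeps the example independent of the real command line; a script would call `parse_args()` with no arguments to read `sys.argv`.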
+---
+## 🔗 Links
+- **Model Repo:** https://huggingface.co/rajkr/sdxl-pokemon-lora
+- **Training Script:** `train_sdxl_lora.py`
+- **Community Forum:** https://huggingface.co/rajkr/sdxl-pokemon-lora/discussions