Text-to-Image
Diffusers
stable-diffusion-xl
lora
inference
rajkr committed on
Commit 1659bfb · verified · 1 Parent(s): 72cb658

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +105 -28
README.md CHANGED
@@ -6,31 +6,82 @@ tags:
  - stable-diffusion-xl
  - lora
  - diffusers
  license: openrail++
  datasets:
  - reach-vb/pokemon-blip-captions
  pipeline_tag: text-to-image
  ---

- # SDXL Pokemon LoRA - Text-to-Image Generator

- Fine-tuned SDXL LoRA adapters for generating Pokemon-style images from text prompts.

- ## 🚀 Quick Start

  ```python
  from diffusers import AutoPipelineForText2Image
  import torch

- pipeline = AutoPipelineForText2Image.from_pretrained(
      "stabilityai/stable-diffusion-xl-base-1.0",
      torch_dtype=torch.float16,
  ).to("cuda")

- pipeline.load_lora_weights("rajkr/sdxl-pokemon-lora")
-
- image = pipeline(
-     "a cute fire pokemon with blue flames, highly detailed, anime style",
      num_inference_steps=30,
      guidance_scale=7.5,
  ).images[0]
@@ -38,39 +89,65 @@ image = pipeline(
  image.save("pokemon.png")
  ```

- ## Training Details

  | Parameter | Value |
- |-----------|-------|
  | **Base Model** | `stabilityai/stable-diffusion-xl-base-1.0` |
  | **VAE** | `madebyollin/sdxl-vae-fp16-fix` |
  | **Dataset** | `reach-vb/pokemon-blip-captions` |
- | **LoRA Rank** | 16 |
- | **Learning Rate** | 1e-4 |
- | **Epochs** | 3 |
  | **Batch Size** | 2 × 4 (gradient accumulation) |
  | **Resolution** | 1024×1024 |
  | **Mixed Precision** | fp16 |
- | **Min-SNR Gamma** | 5.0 |
- | **Optimizer** | AdamW |
- | **LR Scheduler** | Cosine |

- ## Training Script

- The training script (`train_sdxl_lora.py`) is included in this repo. To train:

  ```bash
- pip install torch torchvision diffusers transformers accelerate peft datasets trackio xformers
- accelerate launch train_sdxl_lora.py
  ```

- ## Architecture
-
- This model uses LoRA (Low-Rank Adaptation) to efficiently fine-tune SDXL's attention layers:
- - **Target modules:** `to_k`, `to_q`, `to_v`, `to_out.0`
- - **LoRA rank:** 16 (good balance of capacity vs. parameter efficiency)
- - **Trainable parameters:** ~4.7M (vs ~2.6B full UNet)

- ## Demo

- Try the interactive demo: [rajkr/sdxl-image-generator](https://huggingface.co/spaces/rajkr/sdxl-image-generator)
  - stable-diffusion-xl
  - lora
  - diffusers
+ - inference
  license: openrail++
  datasets:
  - reach-vb/pokemon-blip-captions
  pipeline_tag: text-to-image
  ---

+ # 🎨 SDXL Text-to-Image Generator

+ Generate stunning images from text prompts. Runs in three environments: **local, Colab, Kaggle**.

+ ---
+
+ ## 🚀 Quick Start: 3 Ways
+
+ ### 1️⃣ Local CLI (Like Ollama)
+
+ ```bash
+ git clone https://huggingface.co/rajkr/sdxl-pokemon-lora
+ cd sdxl-pokemon-lora
+ pip install -r clients/requirements.txt
+
+ # The first run downloads ~7GB; later runs reuse the cache in ~/.cache/huggingface
+ python clients/generate.py "a majestic dragon flying over a crystal lake"
+ python clients/generate.py "an astronaut riding a horse on Mars" --steps 50 --guidance 8.0 --seed 42
+ python clients/generate.py --list-models
+ ```
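
The `~/.cache/huggingface` path in the comment above is the default location. The Hugging Face Hub client also honors the `HF_HOME` and `HF_HUB_CACHE` environment variables. The sketch below, which assumes that documented layout, reports where the ~7GB download will land and how much is already cached. The helper names `hf_cache_dir` and `dir_size_gb` are hypothetical and not part of this repo.

```python
import os
from pathlib import Path


def hf_cache_dir(env=None) -> Path:
    """Resolve the Hub cache directory from environment variables."""
    env = os.environ if env is None else env
    # An explicit hub cache path takes priority over everything else.
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    # HF_HOME relocates the whole Hugging Face directory; the hub cache sits under it.
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    # Default: ~/.cache/huggingface/hub (or under XDG_CACHE_HOME when set).
    base = Path(env.get("XDG_CACHE_HOME") or "~/.cache").expanduser()
    return base / "huggingface" / "hub"


def dir_size_gb(path: Path) -> float:
    """Sum regular file sizes under path, skipping symlinks to avoid double counting."""
    if not path.exists():
        return 0.0
    total = sum(
        f.stat().st_size
        for f in path.rglob("*")
        if f.is_file() and not f.is_symlink()
    )
    return total / 1e9


if __name__ == "__main__":
    cache = hf_cache_dir()
    print(f"{cache}: {dir_size_gb(cache):.1f} GB cached")
```

Setting `HF_HOME` to a larger disk before the first run is one way to keep the download off a small home partition.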
+
+ | GPU | Speed | Notes |
+ |---|---|---|
+ | RTX 4090 (24GB) | ~20s/image | Best |
+ | RTX 3090 (24GB) | ~25s/image | Great |
+ | Colab T4 (16GB) | ~60s/image | **Free** |
+ | Apple M-series | ~5min/image | Slow but works |
+ | CPU only | ~10min/image | Very slow |
45
+ ### 2️⃣ Google Colab (**FREE GPU**)
46
+
47
+ Download and open this notebook in Colab, then set GPU Runtime:
48
+
49
+ [πŸ“ Open Colab Notebook](https://huggingface.co/rajkr/sdxl-pokemon-lora/resolve/main/clients/colab_notebook.ipynb)
50
+
51
+ Steps:
52
+ 1. Download the notebook from above link
53
+ 2. Upload to [colab.research.google.com](https://colab.research.google.com)
54
+ 3. Set GPU: **Runtime β†’ Change runtime type β†’ T4 GPU**
55
+ 4. Run all cells β€” first run downloads ~7GB, then unlimited free generation
56
+
57
+ ### 3️⃣ Kaggle (**FREE GPU**)
58
+
59
+ [πŸ“ Open Kaggle Notebook](https://huggingface.co/rajkr/sdxl-pokemon-lora/resolve/main/clients/kaggle_notebook.ipynb)
60
+
61
+ Steps:
62
+ 1. Download the notebook
63
+ 2. Create new Kaggle notebook β†’ Upload
64
+ 3. Turn on GPU: **Settings β†’ Accelerator β†’ GPU T4**
65
+ 4. Run all cells
66
+
67
+ ---
68
+
+ ## 🛠️ Advanced: Python SDK

  ```python
  from diffusers import AutoPipelineForText2Image
  import torch

+ # Downloads once (~7GB), then runs locally
+ pipe = AutoPipelineForText2Image.from_pretrained(
      "stabilityai/stable-diffusion-xl-base-1.0",
      torch_dtype=torch.float16,
+     variant="fp16",
  ).to("cuda")

+ # Generate
+ image = pipe(
+     "a cute fire pokemon with blue flames, anime style",
      num_inference_steps=30,
      guidance_scale=7.5,
  ).images[0]

  image.save("pokemon.png")
  ```
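
The added snippet loads only the base SDXL checkpoint. The removed Quick Start in this commit also called `load_lora_weights("rajkr/sdxl-pokemon-lora")`. The sketch below combines the two and adds a fixed seed for repeatable output. It assumes the adapter weights are published in this repo in a format `load_lora_weights` can read and that a CUDA GPU is available. The function name `generate_with_lora` and its defaults are illustrative.

```python
BASE_MODEL = "stabilityai/stable-diffusion-xl-base-1.0"
LORA_REPO = "rajkr/sdxl-pokemon-lora"  # assumption: LoRA weights live in this repo


def generate_with_lora(prompt: str, seed: int = 42, steps: int = 30, guidance: float = 7.5):
    """Generate one image with base SDXL plus the LoRA adapters (needs a CUDA GPU)."""
    # Imported lazily so this module can be imported without the GPU stack installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        BASE_MODEL,
        torch_dtype=torch.float16,
        variant="fp16",
    ).to("cuda")

    # Attach the fine-tuned adapters on top of the base weights.
    pipe.load_lora_weights(LORA_REPO)

    # A seeded generator makes the same prompt produce the same image.
    generator = torch.Generator(device="cuda").manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=steps,
        guidance_scale=guidance,
        generator=generator,
    )
    return result.images[0]
```

A call such as `generate_with_lora("a cute fire pokemon with blue flames, anime style").save("pokemon.png")` would then write the image to disk.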

+ ---
+
+ ## 📦 Files in this Repo
+
+ | File | Description |
+ |---|---|
+ | `train_sdxl_lora.py` | Full training script: fine-tune SDXL with LoRA on any image+caption dataset |
+ | `clients/generate.py` | CLI tool, run as `python clients/generate.py "prompt"`; works like Ollama |
+ | `clients/requirements.txt` | `pip install -r` this for local setup |
+ | `clients/colab_notebook.ipynb` | Google Colab notebook (free T4 GPU) |
+ | `clients/kaggle_notebook.ipynb` | Kaggle notebook (free T4 GPU) |
+
+ ---
+
+ ## 🧠 Training Your Own Model
+
+ Fine-tune SDXL with LoRA on your own dataset:
+
+ ```bash
+ pip install torch torchvision diffusers transformers accelerate peft datasets xformers
+ git clone https://huggingface.co/rajkr/sdxl-pokemon-lora
+ cd sdxl-pokemon-lora
+ accelerate launch train_sdxl_lora.py
+ ```
+
+ The LoRA weights will be pushed to: `rajkr/sdxl-pokemon-lora`
+
+ ### Training Specs

  | Parameter | Value |
+ |---|---|
  | **Base Model** | `stabilityai/stable-diffusion-xl-base-1.0` |
  | **VAE** | `madebyollin/sdxl-vae-fp16-fix` |
  | **Dataset** | `reach-vb/pokemon-blip-captions` |
+ | **Method** | LoRA (rank=16) |
+ | **Trainable Params** | ~4.7M (vs. ~2.6B full UNet) |
  | **Batch Size** | 2 × 4 (gradient accumulation) |
  | **Resolution** | 1024×1024 |
+ | **Epochs** | 3 |
+ | **LR** | 1e-4 (AdamW, cosine) |
  | **Mixed Precision** | fp16 |
+ | **VRAM** | ~20-24GB |
+ | **Hardware** | A100, A10G, RTX 4090 |
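
A few of the table's figures can be checked with simple arithmetic. The sketch below reads "2 × 4" as a per-device batch of 2 with 4 gradient-accumulation steps on a single device, which is an interpretation of the table and not something the training script confirms here. The 640-wide layer is an illustrative size, not a value taken from the script.

```python
def effective_batch_size(per_device: int, grad_accum_steps: int, num_devices: int = 1) -> int:
    """Samples contributing to each optimizer step."""
    return per_device * grad_accum_steps * num_devices


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters LoRA adds to one linear layer: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def trainable_percent(trainable: float, total: float) -> float:
    """Share of parameters that receive gradients, in percent."""
    return 100.0 * trainable / total


print(effective_batch_size(2, 4))                 # 8
print(lora_params(640, 640, 16))                  # 20480
print(round(trainable_percent(4.7e6, 2.6e9), 2))  # 0.18
```

With the README's own numbers, roughly 0.18% of the UNet's parameters are trained, which is why the adapter file is small compared with the ~7GB base model.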
 
+ ---

+ ## 🖼️ Sample Prompts

  ```bash
+ python clients/generate.py "a majestic dragon flying over a crystal lake at sunset, epic fantasy art"
+ python clients/generate.py "an astronaut riding a horse on Mars, cinematic shot" --steps 50
+ python clients/generate.py "a cozy coffee shop interior with rain outside" --model stabilityai/stable-diffusion-2-1
+ python clients/generate.py "a futuristic city skyline at night with neon lights" --guidance 10
  ```
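
The commands above use the flags `--steps`, `--guidance`, `--seed`, `--model` and `--list-models`. The source of `clients/generate.py` is not shown in this diff, so the following is only a hypothetical sketch of how such an interface could be declared with `argparse`. The defaults (30 steps, guidance 7.5, the SDXL base model) are assumptions borrowed from the Python SDK snippet.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare a CLI shaped like the sample commands (hypothetical, not the repo's code)."""
    parser = argparse.ArgumentParser(description="Generate an image from a text prompt")
    # The prompt is optional so that `--list-models` can be used on its own.
    parser.add_argument("prompt", nargs="?", help="text prompt to render")
    parser.add_argument("--steps", type=int, default=30, help="number of inference steps")
    parser.add_argument("--guidance", type=float, default=7.5, help="guidance scale")
    parser.add_argument("--seed", type=int, default=None, help="seed for repeatable output")
    parser.add_argument("--model", default="stabilityai/stable-diffusion-xl-base-1.0")
    parser.add_argument("--list-models", action="store_true", help="print known models and exit")
    return parser


def parse_args(argv=None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    if not args.list_models and args.prompt is None:
        parser.error("a prompt is required unless --list-models is given")
    return args


# Parse one of the sample commands from an explicit argument list.
args = parse_args(["an astronaut riding a horse on Mars, cinematic shot", "--steps", "50"])
print(args.prompt, args.steps, args.guidance, args.seed)
```

Making the prompt optional is the design choice that lets `--list-models` work without a dummy argument; the explicit check restores the requirement for normal runs.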

+ ---

+ ## 🔗 Links

+ - **Model Repo:** https://huggingface.co/rajkr/sdxl-pokemon-lora
+ - **Training Script:** `train_sdxl_lora.py`
+ - **Community Forum:** https://huggingface.co/rajkr/sdxl-pokemon-lora/discussions