# Light Fantasy – FLUX.2 Klein Base 4B LoRA

A LoRA fine-tune of FLUX.2-klein-base-4B trained on 232 fantasy paintings. Produces luminous, painterly images with vibrant colors: castles, knights, dragons, enchanted landscapes, and magical atmospheres.
Trained on the undistilled base model for best fine-tuning quality with QLoRA (NF4 4-bit quantization). Runs on consumer GPUs with 16GB VRAM.
## Quick Start
```python
import torch
from diffusers import (
    Flux2KleinPipeline,
    Flux2Transformer2DModel,
    BitsAndBytesConfig,
)

# Load transformer with 4-bit quantization (fits 16GB VRAM)
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
transformer = Flux2Transformer2DModel.from_pretrained(
    "black-forest-labs/FLUX.2-klein-base-4B",
    subfolder="transformer",
    quantization_config=nf4_config,
    torch_dtype=torch.bfloat16,
)

# Build pipeline
pipe = Flux2KleinPipeline.from_pretrained(
    "black-forest-labs/FLUX.2-klein-base-4B",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.load_lora_weights("giannisan/light-fantasy-flux2-klein-base-lora")
pipe.enable_model_cpu_offload()

# Fix VAE dtype mismatch (needed with CPU offload + bf16)
_orig_decode = pipe.vae._decode

def _patched_decode(z, *args, **kwargs):
    return _orig_decode(z.to(pipe.vae.dtype), *args, **kwargs)

pipe.vae._decode = _patched_decode

# Generate!
image = pipe(
    prompt="light_fantasy, a detailed fantasy painting of a dragon guarding a crystal cavern filled with gold",
    height=512,
    width=768,
    num_inference_steps=50,
    guidance_scale=3.5,
    generator=torch.Generator("cpu").manual_seed(42),
).images[0]
image.save("output.png")
```
## Usage Notes

### Trigger Word

Use `light_fantasy` at the start of your prompt to activate the style:

```
light_fantasy, a detailed fantasy painting of [your scene description]
```
Without the trigger word, the base model's default style is used.
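A small helper can keep prompts in the expected shape. This is an illustrative sketch, not part of the model card; the `build_prompt` name and the exact phrasing of the template are assumptions based on the example prompts below.

```python
# Prefix a scene description with the style trigger word.
# Helper name and template phrasing are illustrative only.
TRIGGER = "light_fantasy"

def build_prompt(scene: str) -> str:
    """Return a prompt in the format used throughout this card."""
    return f"{TRIGGER}, a detailed fantasy painting of {scene}"

print(build_prompt("a castle on a cliff at sunset"))
# light_fantasy, a detailed fantasy painting of a castle on a cliff at sunset
```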
### Inference Settings

| Parameter | Recommended | Notes |
|---|---|---|
| `num_inference_steps` | 50 | The undistilled base model needs more steps; 30 is usable, 50 gives the best quality. |
| `guidance_scale` | 3.5 | Standard for FLUX.2 base models. |
| Resolution | 512×512 or 512×768 | Trained at 512×512. Landscape (512×768) works well for scenes; 1024×1024 works but takes ~3.7 min. |
### Generation Speed (RTX 4060 Ti 16GB)

| Resolution | Time |
|---|---|
| 512×512 | ~55 sec |
| 512×768 | ~1.5 min |
| 1024×1024 | ~3.7 min |
### VRAM Requirements
With NF4 quantization + CPU offload, the model runs on 16GB VRAM GPUs. Without quantization, you'll need 24GB+.
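A back-of-envelope calculation shows why quantization makes the difference. This sketch counts transformer weights only (4B parameters, per the model name); it ignores the text encoder, VAE, activations, and CUDA overhead, which is why the real requirements quoted above are higher.

```python
# Rough weight memory for the 4B-parameter transformer.
# Covers weights only, not text encoder, VAE, activations, or overhead.
params = 4e9

bf16_gib = params * 2 / 2**30   # bf16: 2 bytes per parameter
nf4_gib = params * 0.5 / 2**30  # NF4: 4 bits per parameter

print(f"bf16 transformer weights: ~{bf16_gib:.1f} GiB")
print(f"NF4 transformer weights:  ~{nf4_gib:.1f} GiB")
```

NF4 shrinks the transformer's weight footprint roughly 4×, which is what leaves headroom on a 16GB card once CPU offload handles the rest of the pipeline.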
## Example Prompts

```
light_fantasy, a detailed fantasy painting of a massive dragon breathing fire on a castle while knights on horseback charge with lances and banners flying
light_fantasy, a detailed fantasy painting of a cozy wizard's library with floating books and a fireplace
light_fantasy, a detailed fantasy painting of an underwater kingdom with coral towers and merfolk
light_fantasy, a detailed fantasy painting of a knight in golden armor riding a silver dragon through the clouds above a medieval kingdom
```
## Training Details
| Parameter | Value |
|---|---|
| Base model | FLUX.2-klein-base-4B (undistilled) |
| Method | DreamBooth LoRA with QLoRA (NF4 4-bit quantization) |
| Dataset | giannisan/light-fantasy-dataset – 232 images with per-image BLIP captions |
| Resolution | 512×512 |
| LoRA rank | 32 |
| LoRA alpha | 32 |
| Learning rate | 1e-4 (constant scheduler, 100 warmup steps) |
| Training steps | 2500 (~43 passes per image) |
| Batch size | 1 (gradient accumulation 4, effective batch 4) |
| Optimizer | AdamW 8-bit |
| Mixed precision | bf16 |
| Data augmentation | Random horizontal flip |
| Final loss | 0.939 |
| Hardware | NVIDIA RTX 4060 Ti 16GB |
| Training time | ~4 hours |
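The "~43 passes per image" figure in the table follows directly from the step count and effective batch size; a quick sanity check, using only numbers from the table above:

```python
# Derive passes-per-image from the training hyperparameters in the table.
steps = 2500
effective_batch = 4   # batch size 1 x gradient accumulation 4
dataset_size = 232

images_seen = steps * effective_batch       # 10,000 images processed
passes = images_seen / dataset_size         # ~43.1 passes per image
print(f"~{passes:.1f} passes per image")
```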
## Why the base model?

We chose the undistilled base model over the step-distilled variant because LoRA fine-tuning disrupts step-distillation: a LoRA trained on the distilled model produces blurry images at 4 steps and requires 50 steps anyway. Training on the base model gives cleaner results since it's designed for multi-step inference.
## Dataset

Trained on giannisan/light-fantasy-dataset – 232 fantasy paintings auto-captioned with BLIP (Salesforce/blip-image-captioning-large). Each caption starts with the `light_fantasy` trigger word followed by a scene description.
## License
This LoRA inherits the license from the base model. See FLUX.2-klein-base-4B License.