# Experimental Training on NoobAI Flux2VAE Rectified Flow v0.3 U-REPA to Use Different CLIP Models

Trained on top of NoobAI-Flux2VAE-RectifiedFlow-0.3 U-REPA.
## Why?
- Part of the reason I trained NoobAI Flux2VAE RF with U-REPA in the first place was to see if using U-REPA as an auxiliary loss objective would speed up convergence when making architecture changes (like changing the text encoder).
- I remembered a Reddit post by Anzhc explaining that the text encoders in NoobAI are flawed, and that they finetuned CLIP-L and Bluvoll finetuned CLIP-G to fix them. I don't remember hearing about anyone doing the training required to actually use those models, so I decided to.
- To learn more.
## Update
- Did a bit more training; the files are in `./V2`.
  - The main one is `NoobAI-Flux2VAE-RectifiedFlow-0.3-U-REPA-Experimental-new-CLIP-V2.safetensors`.
  - `NoobAI-Flux2VAE-RectifiedFlow-0.3-U-REPA-Experimental-new-CLIP_V2-SMA-Merge.safetensors` in `./V2` is an average of the weights from the last 50% of that training run.
- Note: Don't use the V1 LoKr on the V2 model.
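The SMA merge described above is just a uniform average of saved checkpoints. A minimal sketch of the idea (the function name and toy values below are hypothetical, not the actual merge script):

```python
def average_checkpoints(state_dicts):
    """Uniformly average matching parameters across checkpoints.

    A simple moving average (SMA) merge: each parameter in the merged
    model is the mean of that parameter across all saved checkpoints.
    """
    n = len(state_dicts)
    return {key: sum(sd[key] for sd in state_dicts) / n
            for key in state_dicts[0]}

# Toy example with scalars standing in for weight tensors:
checkpoints = [{"w": 1.0, "b": 0.0}, {"w": 3.0, "b": 2.0}]
merged = average_checkpoints(checkpoints)  # {"w": 2.0, "b": 1.0}
```

In practice you would load each `.safetensors` checkpoint into a state dict first; averaging the last 50% of checkpoints tends to smooth out late-training noise.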
## Relevant Details for Usage
- Same as NoobAI-Flux2VAE-RectifiedFlow-0.3-U-REPA.
- Use the LoKr `NoobAI-Flux2VAE-RectifiedFlow-0.3-U-REPA-Experimental-new-CLIP-Booru-LOKR-000001.safetensors`.
- Make sure to add `,` to the start and end of the prompt(s), or the first and last words might get ignored.
- If you are getting duplicates at larger resolutions, add `solo` to the prompt.
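The leading/trailing comma rule can be automated when building prompts. A small helper sketch (hypothetical, not part of any released tooling):

```python
def pad_prompt(prompt: str) -> str:
    """Ensure the prompt starts and ends with a comma so the first
    and last tags are not ignored (per the usage note above)."""
    p = prompt.strip()
    if not p.startswith(","):
        p = ", " + p
    if not p.endswith(","):
        p = p + " ,"
    return p

print(pad_prompt("masterpiece, 1girl, solo"))  # ", masterpiece, 1girl, solo ,"
```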
## Quality Tags
Positive:

`masterpiece, best quality, good quality`

Due to my own mistakes, these don't have the biggest effect.

Negative:

`worst quality, low quality, bad anatomy,`

- I also added `white glow outline`, which might help remove the white "pixel glow" around characters.
- Both `artistic error` and `bad hands` can help too.
## Additional Tagging Information
- Artists: Tagged with the prefix `by `, e.g., `by someArtistHere`. Accidentally tagged some artists with the prefix `art by `.
- Date Tags: `newest`, `new`, `old`, `oldest`. These do have a noticeable effect.
- Content Rating Tags: `Rating: explicit`, `Rating: questionable`, `Rating: sensitive`, `Rating: general`. Note: You don't have to use them.
- Danbooru Pools & Specific Concepts:
| Pool | Tagged As |
|---|---|
| Expert Shading | expert shading |
| Badass | badass |
| Action Shots | action |
| Epic | epic |
| To the death | duel to the death |
| I am King | "I am King" |
| Impending Doom | impending doom |
| Deep Thought | thought-provoking |
| Woman in practical armor | (N/A - Implicit) |
| Nightmare Fuel | nightmare fuel |
| Minorities | (N/A - Implicit) |
| Serious Beauty | serious beauty |
| Handsome Ladies | handsome lady |
| Danbooru Tag | Tagged As |
|---|---|
| Dynamic Pose | dynamic pose |
| Slice of Life | slice of life |
## Training Details
- Hardware: 1x A6000 48GB
- Vision Encoder (for REPA): `dinov3-vitl16-pretrain-lvd1689m`
### Dataset Details
From `deepghs/danbooru2024`:

- `0306-0313.tar`
- Images from Pools, specific tags, and some from Pixiv.
- Note: I had to keep restarting the training due to a memory leak, which I now know (after too much testing) was caused by the compute provider. So expect concepts to be undertrained and the white glow to appear.
### LoKR Dataset
From `deepghs/danbooru2024`:

- `0000-0010.tar`
## Training Stages
1. Phase 1
   - Base: NoobAI-Flux2VAE-RectifiedFlow-0.3-U-REPA-Base
   - Total Steps: ??? (Unbatched)
   - Model File: `NoobAI-Flux2VAE-RectifiedFlow-0.3-U-REPA-Base-newClip-????.safetensors`
   - Learning Rate: `1e-4`
   - Froze all layers except the key and value projections in the cross-attention layers, and un-froze the REPA projector.
2. Phase 2
   - Base: Phase 1
   - Steps: +? (Unbatched)
   - Model File: `NoobAI-Flux2VAE-RectifiedFlow-0.3-U-REPA-Experimental-new-CLIP.safetensors`
   - Settings Changed:
     - Un-froze all the layers
     - `--learning_rate`: `3e-5` (from `1e-4`)
     - `--repa_lambda`: `0.20` (from `0.50`)
3. LoKR
   - Base: Phase 2
   - Steps: +~100,000 (Unbatched)
   - Model File: ?
   - Main Changes:
     - No REPA

```toml
[Network_setup]
network_dim = 100000
network_alpha = 1
network_dropout = 0
network_train_unet_only = true
resume = false

[LyCORIS]
network_module = "lycoris.kohya"
network_args = [ "preset=full", "algo=lokr", "factor=4", ]

[optimizer_arguments]
lr_scheduler = "cosine"
optimizer_type = "AdamW8bit"
optimizer_args = ["weight_decay=0.01", "eps=1e-8", "betas=0.9,0.999"]
min_lr = 0

[training_arguments]
unet_lr = 1e-4
text_encoder_lr = 0
max_grad_norm = 1.0
lr_warmup_steps = 30
```
## Run History & Configuration
### Initial Configuration (Run 1)
Trained for ? Unbatched steps in this first run.

Key Parameters:

- `--manifold_weight 3.0`: Controls the weighting of the manifold loss relative to the cosine loss in the overall REPA loss (as set in the U-REPA paper).
- `--repa_lambda 0.50`: Controls the weighting of the REPA loss relative to the regular L2 loss: `loss = loss + (repa_lambda * total_repa_loss)`.
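Putting those two parameter descriptions together, the objective combines roughly as follows. This is a sketch of the arithmetic only: the outer `loss + repa_lambda * total_repa_loss` line comes from the description above, while the inner cosine-plus-weighted-manifold sum is my assumption about how `manifold_weight` enters, and the variable names are mine, not the training script's.

```python
def combined_loss(l2_loss, cos_loss, manifold_loss,
                  repa_lambda=0.50, manifold_weight=3.0):
    # REPA loss: manifold term weighted against the cosine term
    # (assumed combination, per the --manifold_weight description).
    total_repa_loss = cos_loss + manifold_weight * manifold_loss
    # Final objective: REPA total weighted against the plain L2 loss.
    return l2_loss + repa_lambda * total_repa_loss

combined_loss(1.0, 0.5, 0.1)  # roughly 1.4
```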
## Training Code
- Repo: Bluvoll's fork of sd-scripts, plus additional changes (that I'll eventually release).
## Support
Support CabalResearch Here; they are the creators of the NoobAI Flux2VAE RectifiedFlow model and of the CLIP-L and CLIP-G used in this project.
## Thanks
Model tree for TheRemixer/NoobAI-Flux2VAE-RectifiedFlow-0.3-U-REPA-New-CLIP:

- Base model: Laxhar/noobai-XL-Vpred-0.75