Aso Nozomi Lora Flux NF4

Prompts
Training With QLoRA: Aso Nozomi is captured in a closeup portrait, her long brown hair in ponytail and bang framing her smiling face as she looks directly at the camera. Pink lipstick and dark eyeliner accentuate her features, partially cropped by the frame. The background is softly blurred, hinting at a warm yellow wall. A white plate with a red design sits to her left on a brown wooden surface. The image evokes a feeling of cozy intimacy, like a candid moment in a comfortable space.
Training Without QLoRA: Aso Nozomi is captured in a closeup portrait, her long brown hair in ponytail and bang framing her smiling face as she looks directly at the camera. Pink lipstick and dark eyeliner accentuate her features, partially cropped by the frame. The background is softly blurred, hinting at a warm yellow wall. A white plate with a red design sits to her left on a brown wooden surface. The image evokes a feeling of cozy intimacy, like a candid moment in a comfortable space.
Testing With QLoRA: Aso Nozomi in a fitting lace qipao with high slit showing left leg, wearing stiletto
Testing Without QLoRA: Aso Nozomi in a fitting lace qipao with high slit showing left leg, wearing stiletto

麻生希 / あそうのぞみ / Aso Nozomi

All files are also archived at https://github.com/je-suis-tm/huggingface-archive in case this repository gets censored.

The QLoRA fine-tuning of aso_nozomi_lora_flux_nf4 takes inspiration from this post (https://huggingface.co/blog/flux-qlora). Training ran on a local computer for 1200 timesteps with the same parameters as the linked post and took around 8 hours on a 4060 GPU with 8GB of VRAM. Peak VRAM usage was around 7.7GB. To avoid running out of VRAM, both the transformer and the text_encoder were quantized.

The biggest challenge in training on Japanese actresses is that their photos use heavy filters to whiten and smooth the skin. This practice severely distorts the training images, which makes the result less convincing than for Hollywood actresses. This training dataset contains many face closeups, which brings the result closer to her actual face. The tradeoff is the overfitting problem of QLoRA, which makes the model more likely to ignore the prompt.

All the images shown here were generated with the parameters below:

  • Height: 512
  • Width: 512
  • Guidance scale: 5
  • Num inference steps: 20
  • Max sequence length: 512
  • Seed: 0
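
The training notes above say that the transformer and text encoder were quantized so the run would fit in 8GB of VRAM. A rough back-of-the-envelope sketch of why that matters is shown here. It assumes the FLUX.1-dev transformer has roughly 12 billion parameters, and it counts weight storage only: activations, gradients, LoRA parameters, quantization constants, and the text encoder are all ignored. The helper name `weight_memory_gb` is hypothetical and exists only for this illustration.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in gigabytes (10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: the FLUX.1-dev transformer has roughly 12 billion parameters.
TRANSFORMER_PARAMS = 12e9

fp16_gb = weight_memory_gb(TRANSFORMER_PARAMS, 16)  # 16-bit weights
nf4_gb = weight_memory_gb(TRANSFORMER_PARAMS, 4)    # 4-bit (NF4) weights

print(f"16-bit weights: about {fp16_gb:.1f} GB")
print(f"4-bit weights:  about {nf4_gb:.1f} GB")
```

Under these assumptions the 16-bit weights alone would not fit on an 8GB card, while the 4-bit weights leave some headroom, which is in line with the reported peak of around 7.7GB.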

Usage

import torch
from diffusers import FluxPipeline, FluxTransformer2DModel
from transformers import T5EncoderModel

# Load the NF4 text encoder (T5) and Flux transformer from the packaged checkpoint.
text_encoder_4bit = T5EncoderModel.from_pretrained(
    "hf-internal-testing/flux.1-dev-nf4-pkg",
    subfolder="text_encoder_2",
    torch_dtype=torch.float16,
)

transformer_4bit = FluxTransformer2DModel.from_pretrained(
    "hf-internal-testing/flux.1-dev-nf4-pkg",
    subfolder="transformer",
    torch_dtype=torch.float16,
)

# Build the pipeline around the 4-bit components.
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    torch_dtype=torch.float16,
    transformer=transformer_4bit,
    text_encoder_2=text_encoder_4bit,
)

pipe.load_lora_weights("je-suis-tm/aso_nozomi_lora_flux_nf4",
                       weight_name='pytorch_lora_weights.safetensors')

prompt = "Aso Nozomi in a fitting lace qipao with high slit showing left leg, wearing stiletto"

# Same generation parameters as listed above; the fixed seed makes the result repeatable.
image = pipe(
    prompt,
    height=512,
    width=512,
    guidance_scale=5,
    num_inference_steps=20,
    max_sequence_length=512,
    generator=torch.Generator("cpu").manual_seed(0),
).images[0]

image.save("aso_nozomi_lora_flux_nf4.png")

Trigger words

You should use Aso Nozomi to trigger the image generation.
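
As a small convenience, a prompt can be checked for the trigger phrase before it is passed to the pipeline. The sketch below is only an illustration: the helper name `with_trigger` is hypothetical, and prepending the phrase with a comma is one arbitrary choice, not something this model card prescribes. The match is case-sensitive on the assumption that the trigger should appear exactly as written above.

```python
TRIGGER = "Aso Nozomi"  # trigger phrase stated on this card


def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Return the prompt unchanged if it contains the trigger, else prepend it."""
    if trigger in prompt:
        return prompt
    return f"{trigger}, {prompt}"


print(with_trigger("closeup portrait, smiling, warm yellow wall"))
print(with_trigger("Aso Nozomi in a fitting lace qipao"))
```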

Download model

Download the weights from the Files & versions tab.
