
FLOAT8 Quantized Model (torchao)

The weights in this repository are the original FP16 weights. Apply quantization with torchao when you load the model.

Original Model

  • Model ID: frankjoshua/novaAnimeXL_ilV140
  • Source: Hugging Face Hub

Quantization Info

  • Quantization Type: FLOAT8 (torchao)
  • Method: Weight-only quantization
  • Components: entire pipeline
  • Storage Format: safetensors (original FP16)

Usage

from diffusers import StableDiffusionXLPipeline
from torchao.quantization import quantize_, float8_weight_only
import torch

# Load the model (the weights on disk are FP16)
pipe = StableDiffusionXLPipeline.from_pretrained(
    "data_fp8/novaAnimeXL_fp8",
    torch_dtype=torch.float16,
)

# Move to the GPU
pipe = pipe.to("cuda")

# Apply FLOAT8 quantization (at runtime)
quantize_(pipe.unet, float8_weight_only())
quantize_(pipe.vae, float8_weight_only())

# Generate an image
prompt = "a beautiful landscape"
image = pipe(prompt).images[0]
image.save("output.png")
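
Because quantization has to be reapplied on every load, it can help to wrap the steps above in one function. The following is a minimal sketch, not part of this repository: the name `load_fp8_pipeline` and its `device` and `components` parameters are hypothetical, and it uses only the same diffusers and torchao calls as the snippet above.

```python
def load_fp8_pipeline(model_path, device="cuda", components=("unet", "vae")):
    """Load the FP16 pipeline and apply FLOAT8 weight-only quantization in memory.

    Hypothetical helper: the name and parameters are illustrative assumptions.
    Heavy imports are kept inside the function so that importing this module
    does not require a GPU environment.
    """
    import torch
    from diffusers import StableDiffusionXLPipeline
    from torchao.quantization import quantize_, float8_weight_only

    # Load the FP16 weights stored on disk.
    pipe = StableDiffusionXLPipeline.from_pretrained(
        model_path,
        torch_dtype=torch.float16,
    )
    pipe = pipe.to(device)

    # Quantize each requested component in place, as in the usage example.
    for name in components:
        quantize_(getattr(pipe, name), float8_weight_only())
    return pipe
```

With this helper, the usage example reduces to `pipe = load_fp8_pipeline("data_fp8/novaAnimeXL_fp8")` followed by the image generation step.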

Notes

  • torchao quantization is applied in memory only
  • The model files themselves are the original FP16 weights (safetensors)
  • Quantization must be reapplied every time the model is loaded
  • Once applied, runtime quantization still provides the same memory and speed benefits
  • Compared with the original model, memory usage decreases and inference speed may improve
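
To get a rough sense of the memory reduction, the sketch below compares weight storage at 2 bytes per parameter (FP16) and 1 byte per parameter (FLOAT8). The parameter count is an assumed, approximate figure for an SDXL UNet and was not measured on this model. Actual savings may be smaller, since not every tensor is necessarily converted, and activations are not counted.

```python
# Bytes needed to store one weight in each format.
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weight_memory_gib(num_params, dtype):
    """Return the approximate weight storage in GiB for the given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


# Assumption: roughly 2.6 billion parameters (approximate SDXL UNet size).
unet_params = 2_600_000_000

fp16_gib = weight_memory_gib(unet_params, "fp16")
fp8_gib = weight_memory_gib(unet_params, "fp8")

print(f"FP16 weights: {fp16_gib:.2f} GiB")
print(f"FP8 weights:  {fp8_gib:.2f} GiB")
print(f"Upper bound on savings: {fp16_gib - fp8_gib:.2f} GiB")
```

This is an upper bound on weight memory only. Measuring `torch.cuda.max_memory_allocated()` before and after quantization would give the real figure for a particular setup.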

Conversion Tool

  • Script: convert_hf_to_fp8_torchao.py
  • Library: torchao