FLOAT8 Quantized Model (torchao)
์ด ๋ชจ๋ธ์ FP16 ์๋ณธ์ด๋ฉฐ, ์ฌ์ฉ ์ torchao๋ก ์์ํ๋ฅผ ์ ์ฉํ์ธ์.
์๋ณธ ๋ชจ๋ธ
- Model ID: frankjoshua/novaAnimeXL_ilV140
- Source: Hugging Face Hub
์์ํ ์ ๋ณด
- Quantization Type: FLOAT8 (torchao)
- Method: Weight-only quantization
- Components: UNet and VAE (see usage below)
- Storage Format: safetensors (FP16 original)
์ฌ์ฉ ๋ฐฉ๋ฒ
from diffusers import StableDiffusionXLPipeline
from torchao.quantization import quantize_, float8_weight_only
import torch

# Load the FP16 checkpoint
pipe = StableDiffusionXLPipeline.from_pretrained(
    "data_fp8/novaAnimeXL_fp8",
    torch_dtype=torch.float16,
)

# Move to GPU
pipe = pipe.to("cuda")

# Apply FLOAT8 weight-only quantization at runtime
quantize_(pipe.unet, float8_weight_only())
quantize_(pipe.vae, float8_weight_only())

# Generate an image
prompt = "a beautiful landscape"
image = pipe(prompt).images[0]
image.save("output.png")
์ฃผ์์ฌํญ
- torchao ์์ํ๋ ๋ฉ๋ชจ๋ฆฌ์์์๋ง ์ ์ฉ๋ฉ๋๋ค
- ๋ชจ๋ธ ํ์ผ ์์ฒด๋ FP16 ์๋ณธ์ ๋๋ค (safetensors)
- ๋งค๋ฒ ๋ก๋ฉํ ๋๋ง๋ค ์์ํ๋ฅผ ๋ค์ ์ ์ฉํด์ผ ํฉ๋๋ค
- ์ด๋ ๊ฒ ํด๋ ๋ฉ๋ชจ๋ฆฌ์ ์๋ ์ด์ ์ ๋์ผํฉ๋๋ค
- ์๋ณธ ๋ชจ๋ธ ๋๋น ๋ฉ๋ชจ๋ฆฌ ์ฌ์ฉ๋์ด ๊ฐ์ํ๊ณ ์ถ๋ก ์๋๊ฐ ํฅ์๋ ์ ์์ต๋๋ค
๋ณํ ๋๊ตฌ
- Script: convert_hf_to_fp8_torchao.py
- Library: torchao