yeji-8b-qlora-v2 (Deprecated)

โš ๏ธ ์ด ๋ชจ๋ธ์€ ๋” ์ด์ƒ ์‚ฌ์šฉ๋˜์ง€ ์•Š์Šต๋‹ˆ๋‹ค. tellang/yeji-8b-rslora-v7-AWQ๋ฅผ ์‚ฌ์šฉํ•˜์„ธ์š”.

Why Deprecated?

์ด ๋ชจ๋ธ์€ ์ดˆ๊ธฐ ์‹คํ—˜ ๋‹จ๊ณ„์˜ QLoRA ๋ชจ๋ธ๋กœ, ๋‹ค์Œ ์ด์œ ๋กœ ํ๊ธฐ๋˜์—ˆ์Šต๋‹ˆ๋‹ค:

1. QLoRA์˜ ํ•œ๊ณ„

# QLoRA training configuration
from transformers import BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)

Problems:

  • Accuracy loss from 4-bit quantization
  • 10–15% lower performance than rsLoRA
  • Numerical instability during training (NaN losses)
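The accuracy loss above comes from rounding weights to only 16 representable values. A minimal sketch of the effect, using plain symmetric uniform 4-bit quantization (real QLoRA uses NF4, whose bins follow a normal distribution, but the round-trip error is the same in spirit):

```python
import numpy as np

# Illustrative only: quantize a weight tensor to 4 bits and measure the
# reconstruction error. Weight scale (0.02) is a typical LLM std-dev,
# chosen here for illustration.
rng = np.random.default_rng(0)
w = rng.normal(0, 0.02, size=4096).astype(np.float32)

scale = np.abs(w).max() / 7               # signed 4-bit range: -8..7
q = np.clip(np.round(w / scale), -8, 7)   # quantize to 16 levels
w_hat = q * scale                          # dequantize

err = np.abs(w - w_hat).mean()
rel = err / np.abs(w).mean()
print(f"mean abs error: {err:.6f} ({rel:.1%} of mean weight magnitude)")
```

When this error is baked into every forward pass during training, as in QLoRA, it perturbs gradients on every step; when it is applied once after training, as in the AWQ path below, training itself stays exact.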

2. Early Experimental Model

v2 was an experimental model from the project's early phase:

  • Small training set (1,000 samples)
  • Hyperparameter tuning incomplete
  • No prompt engineering applied
  • Below production quality

3. Performance Gap vs. rsLoRA

Metric             | v2 (QLoRA)         | v7 (rsLoRA)
Accuracy           | baseline           | +25%
Training stability | unstable (NaN)     | stable
Inference speed    | 20 tokens/s        | 50 tokens/s

Technical Details

  • ๋ฒ ์ด์Šค ๋ชจ๋ธ: Qwen/Qwen3-8B-Base
  • ํŒŒ์ธํŠœ๋‹ ๋ฐฉ์‹: QLoRA (4-bit quantization)
  • ํ•™์Šต ๋ฐ์ดํ„ฐ: 1,000 ์ƒ˜ํ”Œ (์‹คํ—˜์šฉ)
  • Rank: 8
  • Alpha: 16
  • Quantization: 4-bit NF4
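The details above roughly correspond to the following PEFT configuration. This is a sketch, not the team's actual training script: `target_modules` and `task_type` are assumptions the card does not state.

```python
from peft import LoraConfig
from transformers import BitsAndBytesConfig

# Rank/alpha taken from the card; target_modules is an assumption —
# the card does not say which projections were adapted.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# 4-bit NF4 base-model quantization, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
)
```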

Recommended Alternative

ํ”„๋กœ๋•์…˜ ์‚ฌ์šฉ

  • ๋ชจ๋ธ: tellang/yeji-8b-rslora-v7-AWQ
  • ๊ฐœ์„ : rsLoRA + AWQ ์–‘์žํ™”๋กœ ์„ฑ๋Šฅ๊ณผ ํšจ์œจ ๋ชจ๋‘ ํ–ฅ์ƒ
from vllm import LLM, SamplingParams

# QLoRA v2 ๋Œ€์‹  rsLoRA v7-AWQ ์‚ฌ์šฉ
llm = LLM(
    model="tellang/yeji-8b-rslora-v7-AWQ",
    quantization="awq",  # 4-bit AWQ (QLoRA์˜ NF4๋ณด๋‹ค ์šฐ์ˆ˜)
    gpu_memory_utilization=0.9,
)

Latest versions (2026-02-01)

  • 4B model: tellang/yeji-4b-rslora-v8.1
  • 8B model: tellang/yeji-8b-rslora-v7-AWQ

Performance Comparison

Metric             | v2 (QLoRA)      | v7-AWQ (rsLoRA + AWQ)
Accuracy           | 65%             | 90%
Training stability | ❌ NaN losses    | ✅ stable
Inference speed    | 20 tokens/s     | 50 tokens/s
Memory             | 4.5 GB          | 5.3 GB (AWQ)
Quantization       | 4-bit NF4       | 4-bit AWQ

QLoRA vs rsLoRA

QLoRA

# QLoRA - trains while the base model is 4-bit quantized
- Memory-efficient
- Quantization during training → numerical instability
- Risk of NaN gradients

rsLoRA (v7 approach)

# rsLoRA - full-precision training → AWQ quantization afterwards
- Training stability guaranteed
- Quantized with AWQ after training
- Both accuracy and efficiency
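Besides moving quantization after training, rsLoRA also changes how the low-rank update is scaled: classic LoRA multiplies the update BA by alpha/r, while rank-stabilized LoRA (Kalajdzievski, 2023) uses alpha/√r, which keeps gradient magnitudes stable as rank grows. In PEFT this is the `use_rslora=True` flag on `LoraConfig`. With this card's r=8, alpha=16:

```python
import math

# Scaling factor applied to the low-rank update BA.
r, alpha = 8, 16
lora_scale = alpha / r                 # classic LoRA: 2.0
rslora_scale = alpha / math.sqrt(r)    # rsLoRA: ~5.66
print(lora_scale, round(rslora_scale, 2))
```

At r=8 the difference is modest; it becomes decisive at higher ranks, where alpha/r shrinks the update toward zero while alpha/√r does not.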

Migration Guide

Before (v2 - QLoRA)

# v2 - QLoRA (not recommended)
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model = AutoModelForCausalLM.from_pretrained(
    "tellang/yeji-8b-qlora-v2",
    quantization_config=BitsAndBytesConfig(load_in_4bit=True),
)

After (v7-AWQ - rsLoRA)

# v7-AWQ - rsLoRA + AWQ (recommended)
from vllm import LLM

llm = LLM(
    model="tellang/yeji-8b-rslora-v7-AWQ",
    quantization="awq",  # AWQ is more accurate than NF4
)

Why rsLoRA Won

Aspect                     | QLoRA (v2)              | rsLoRA (v7)
Training-time quantization | ✅ 4-bit (saves memory)  | ❌ full precision
Training stability         | ❌ NaN losses            | ✅ stable
Inference-time quantization| 4-bit NF4               | ✅ 4-bit AWQ (more accurate)
Final performance          | low                     | high
Production use             | ❌                       | ✅

Conclusion: rsLoRA keeps training stable, then quantizes with AWQ at inference time, retaining QLoRA's main advantage (memory efficiency) as well.

License

Apache 2.0

Citation

@misc{yeji-8b-qlora-v2,
  title={YEJI Fortune Telling Model (QLoRA v2 - Deprecated)},
  author={SSAFY YEJI Team},
  year={2026},
  note={Deprecated: Early experiment. Use yeji-8b-rslora-v7-AWQ instead}
}