YEJI-8B-RSLoRA-v7-AWQ
yeji-8b-rslora-v7์ AWQ 4-bit ์์ํ ๋ฒ์ .
Model Description
YEJI 8B ์ฃผ๋ ฅ ๋ชจ๋ธ์ AWQ ์์ํ ๋ฒ์ ์ ๋๋ค. ์๋ณธ 8.2B ํ๋ผ๋ฏธํฐ๋ฅผ 4-bit๋ก ์์ํํ์ฌ GPU ๋ฉ๋ชจ๋ฆฌ ์ฌ์ฉ๋์ ๋ํญ ์ค์ด๊ณ vLLM ์๋น์ ์ต์ ํํ์ต๋๋ค.
- Original Model: tellang/yeji-8b-rslora-v7
- Quantization: AWQ 4-bit (group_size=128)
- Optimized for: vLLM, TGI
Usage
vLLM (๊ถ์ฅ)
python -m vllm.entrypoints.openai.api_server \
--model tellang/yeji-8b-rslora-v7-AWQ \
--quantization awq \
--max-model-len 4096
Transformers
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained(
"tellang/yeji-8b-rslora-v7-AWQ",
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("tellang/yeji-8b-rslora-v7-AWQ")
All YEJI Models
| Model | Params | Type | Downloads |
|---|---|---|---|
| yeji-8b-rslora-v7 | 8.2B | Full | 345 |
| yeji-8b-rslora-v7-AWQ | ~2.5B | AWQ 4-bit | 371 |
| yeji-4b-instruct-v9 | 4.0B | Full | 65 |
| yeji-4b-instruct-v9-AWQ | ~1.3B | AWQ 4-bit | 138 |
Limitations
- ํ๊ตญ์ด ์ด์ธ/์ ์ ๋๋ฉ์ธ ํนํ. AWQ ์์ํ๋ก ์๋ณธ ๋๋น ๋ฏธ์ธํ ํ์ง ์ ํ ๊ฐ๋ฅ.
- ์ ์ ๊ฒฐ๊ณผ๋ ์ํฐํ ์ธ๋จผํธ ๋ชฉ์ ์ ๋๋ค.
- Downloads last month
- 6