# 🇰🇷↔️🇺🇸 LFM2-v8-rl-10k-merged-GGUF

GGUF-quantized versions of LFM2-v8-rl-10k-merged. Fast inference with llama.cpp on CPU or GPU!

## 📊 Performance by quantization level (manual analysis of 1,012 samples)

Conclusion: the 4-, 5-, and 8-bit versions are all virtually identical to fp32!
| Quantization | chrF++ | BLEU | Size | Δ chrF++ vs. fp32 |
|---|---|---|---|---|
| fp32 (original) | 34.32 | 13.10 | 4.68G | - |
| Q8_0 🏆 | 34.39 | 12.93 | 1.25G | +0.07 |
| Q5_K_M | 34.08 | 12.78 | 843M | -0.24 |
| Q4_K_M | 33.97 | 12.56 | 731M | -0.35 |
## 📋 Manual review of 1,012 examples

Key findings:
- Over 90%: semantically identical translations across all quantized versions
- Differences: word choice only
  - e.g. "제시했다" (presented) vs. "말했다" (said) vs. "언급했다" (mentioned)
- Hallucination patterns: identical regardless of quantization level
  - "George W. Bush" → "조지 워싱턴" (George Washington) (all versions)
  - "cheetahs" → "기린" (giraffe) or "호랑이" (tiger) (all versions)
Impact of quantization on translation quality: essentially none!

| Comparison | Q4 vs. Q8 | Q8 vs. fp32 |
|---|---|---|
| Semantic differences | ❌ none | ❌ none |
| Word choice | slightly different | nearly identical |
| Hallucination frequency | identical | identical |
| Repetition bug | ❌ none | fp32 only |

⚠️ In fact, it is the fp32 merged model that hits a repeated-output bug, with a probability under 0.1%! The GGUF quantized versions are more stable.
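For context on the metric reported above: chrF++ is essentially a character n-gram F-score with word n-grams added. The sketch below is a deliberately simplified character-bigram F1, not the real chrF++ formula (which averages n-gram orders 1 to 6 and weights recall with beta = 2); it only illustrates the core idea of character-level overlap that makes the metric robust to small word-choice differences like the ones observed here.

```python
from collections import Counter

def char_bigram_f1(hyp: str, ref: str) -> float:
    """Simplified chrF-style score: F1 over character bigrams.

    Real chrF++ averages character n-grams of orders 1-6 plus word
    1-/2-grams and weights recall higher (beta=2); this sketch keeps
    only the core idea of character n-gram overlap.
    """
    def bigrams(s: str) -> Counter:
        s = s.replace(" ", "")  # chrF ignores whitespace by default
        return Counter(s[i:i + 2] for i in range(len(s) - 1))

    h, r = bigrams(hyp), bigrams(ref)
    if not h or not r:
        return 0.0
    overlap = sum((h & r).values())  # clipped bigram matches
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(char_bigram_f1("오늘 날씨가 아름답습니다", "오늘 날씨가 아름답습니다"))  # identical → 1.0
```

Synonym swaps change only a few bigrams out of a whole sentence, which is why the quantized versions score within a fraction of a point of fp32.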
## 📦 Available files

| File | Size | Recommended for |
|---|---|---|
| `*-Q8_0.gguf` | 1.25G | Best quality + stability 🏆 |
| `*-Q5_K_M.gguf` | 843M | Balanced choice |
| `*-Q4_K_M.gguf` | 731M | Lightweight / mobile |
## 🚀 Usage

### llama-cpp-python (Python)
```python
from llama_cpp import Llama
from huggingface_hub import hf_hub_download

# Download the model (Q8_0 recommended)
model_path = hf_hub_download(
    "gyung/lfm2-1.2b-koen-mt-v8-rl-10k-merged-GGUF",
    "lfm2-1.2b-koen-mt-v8-rl-10k-merged-Q8_0.gguf"
)

# Load the model
llm = Llama(
    model_path=model_path,
    n_ctx=4096,
    n_gpu_layers=-1,  # use the GPU (-1: offload all layers)
    verbose=False
)

def translate(text, direction="en2ko"):
    if direction == "en2ko":
        system = "Translate the following text to Korean."
    else:
        system = "Translate the following text to English."
    prompt = f"""<|im_start|>system
{system}<|im_end|>
<|im_start|>user
{text}<|im_end|>
<|im_start|>assistant
"""
    output = llm(prompt, max_tokens=256, stop=["<|im_end|>"], temperature=0.3)
    return output['choices'][0]['text'].strip()

# Examples
print(translate("The weather is beautiful today."))
# → 오늘 날씨가 정말 아름답습니다.
print(translate("한국 음식이 정말 맛있어요.", "ko2en"))
# → Korean food is really delicious.
```
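The ChatML-style prompt that `translate()` assembles can be factored into a small helper for reuse with other system prompts. `build_chatml_prompt` is a name introduced here for illustration, not part of llama-cpp-python:

```python
def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a ChatML-style prompt like the one used in translate()."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # leave the assistant turn open for generation
    )

prompt = build_chatml_prompt(
    "Translate the following text to Korean.",
    "The weather is beautiful today.",
)
print(prompt)
```

Pair it with `stop=["<|im_end|>"]` as in the example above so generation ends at the close of the assistant turn.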
### Using a GPU on Colab

```shell
# 1. Install llama-cpp-python with CUDA support (important!)
!pip uninstall llama-cpp-python -y
!pip install llama-cpp-python==0.3.16 \
  --extra-index-url https://github.com/abetlen/llama-cpp-python/releases/download/v0.3.16-cu124

# 2. Run the Python code above
```
### llama.cpp CLI

```shell
llama-cli -hf gyung/lfm2-1.2b-koen-mt-v8-rl-10k-merged-GGUF \
  -p "Translate to Korean: Hello world"
```
## 💡 Why GGUF?

| Aspect | fp32/fp16 | GGUF Q8_0 |
|---|---|---|
| Size | 4.68GB | 1.25GB (3.7× smaller) |
| Quality | chrF++ 34.32 | chrF++ 34.39 (on par or slightly better) |
| Stability | repetition bug present | ✅ stable |
| Inference speed | GPU needed | fast even on CPU |
| Intended use | further training | production serving |
## 🔗 Links

- Original model: gyung/lfm2-1.2b-koen-mt-v8-rl-10k-merged
- Project: GitHub - LFM2-KoEn-Tuning

## 📄 License

- Base Model: LiquidAI/LFM2-1.2B (LFM Open License v1.0)
- Developed by: Gyung (Kiwoong)