session-vault-qwen35-9b-gguf

A GGUF-quantized build of Qwen 3.5 9B, LoRA fine-tuned for the Korean development-session summarization task.

Training details

Item             Value
Base model       Qwen/Qwen3.5-9B
Method           LoRA 16-bit (r=32, alpha=64)
Hardware         NVIDIA H200 NVL x2 (287GB VRAM)
Dataset          196 Claude session/summary pairs (avg quality 88.8)
Epochs           3
Loss             1.36 → 0.77
Training time    ~530s
Quantization     Q4_K_M (llama.cpp)
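
The quantization step is the standard llama.cpp flow; the sketch below uses placeholder paths and assumes the LoRA adapter has already been merged into the base model.

# Convert the merged checkpoint to GGUF, then quantize to Q4_K_M (paths are placeholders).
python convert_hf_to_gguf.py ./session-vault-9b-merged --outtype f16 --outfile session-vault-9b-f16.gguf
./llama-quantize session-vault-9b-f16.gguf session-vault-9b-q4_k_m.gguf Q4_K_M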

ํƒœ์Šคํฌ

Converts a Claude Code session log (raw markdown) into a structured Korean summary:

  • YAML frontmatter (project, tags, type)
  • Body sections: Purpose / Key changes / Results / Related files (an illustrative skeleton follows below)
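
An illustrative output skeleton; the frontmatter values are made up and the exact section markup may differ:

---
project: session-vault
tags: [refactoring, api]
type: development
---

## 목적
...
## 핵심 변경사항
...
## 결과
...
## 관련 파일
...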

Ollama usage

# Modelfile
cat > Modelfile << 'EOF'
FROM ./session-vault-9b-q4_k_m.gguf
PARAMETER num_ctx 8192
PARAMETER num_predict 6144
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
PARAMETER repeat_penalty 1.0
EOF

ollama create session-vault:9b -f Modelfile
ollama run session-vault:9b
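
To summarize a session, the raw log can be piped to the model over stdin; the file name below is a placeholder.

# Summarize a raw Claude Code session log (placeholder path).
cat raw-session.md | ollama run session-vault:9b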

Recommended parameters

ํŒŒ๋ผ๋ฏธํ„ฐ ๊ฐ’ ๋น„๊ณ 
num_ctx 8192 4096์€ ํƒ€์ž„์•„์›ƒ ์œ ๋ฐœ
num_predict 6144 thinking ํ† ํฐ ํฌํ•จ
temperature 0.7 Qwen 3.5 non-thinking ๊ณต์‹
top_p 0.8 ๊ณต์‹ ๊ถŒ์žฅ
top_k 20 ๊ณต์‹ ๊ถŒ์žฅ
repeat_penalty 1.0 ๊ณต์‹: ํ•ญ์ƒ 1.0
presence_penalty 1.5 /api/chat ์ „์šฉ
kv_cache_type q8_0 VRAM ์ ˆ๊ฐ (12GB GPU ๊ถŒ์žฅ)
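
presence_penalty and kv_cache_type cannot be set in the Modelfile above. A minimal sketch of how to apply them, assuming an Ollama version that supports the OLLAMA_KV_CACHE_TYPE / OLLAMA_FLASH_ATTENTION environment variables and accepts presence_penalty in the request options:

# Server side: quantized q8_0 KV cache (needs flash attention enabled).
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q8_0 ollama serve

# Client side: pass presence_penalty per request via /api/chat options.
curl http://localhost:11434/api/chat -d '{
  "model": "session-vault:9b",
  "messages": [{"role": "user", "content": "<raw session log here>"}],
  "options": {
    "temperature": 0.7,
    "top_p": 0.8,
    "top_k": 20,
    "repeat_penalty": 1.0,
    "presence_penalty": 1.5,
    "num_ctx": 8192,
    "num_predict": 6144
  }
}'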

ํ•™์Šต ๋ฐ์ดํ„ฐ

Generated automatically from session-vault:

  • scripts/build_finetune_dataset.py: matches raw sessions to summaries and applies quality filtering
  • Agent and short-lived sessions are excluded automatically; only records with score >= 60 are kept
  • Unsloth-compatible conversations format (system/user/assistant); a sketch of one record follows below
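
A rough sketch of one training record in that format; the system prompt and placeholders are illustrative, and the exact key names may differ from what build_finetune_dataset.py emits:

{"conversations": [
  {"role": "system", "content": "Summarize the following Claude Code session into a structured Korean note."},
  {"role": "user", "content": "<raw session markdown>"},
  {"role": "assistant", "content": "<YAML frontmatter + structured Korean summary>"}
]}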

๋ผ์ด์„ ์Šค

Apache 2.0 (in accordance with the Qwen 3.5 license)
