sumink/ifd

sumink committed on Apr 7, 2025

Commit 8e8b432 (verified) · 1 Parent(s): b696e7c

Create README.md

Files changed (1)

README.md ADDED

+# Model: LLaMA (IFD Top 30%)
+## 🔍 Purpose
+Fine-tune `meta-llama/Llama-3.2-1B` on the instruction samples with the **highest Instruction-Following Difficulty (IFD)** scores.
+This group contains the samples where the instruction contributes **least** to the model’s output: conditioning on the instruction barely lowers the perplexity of the response, so the IFD score is high.
+## 📂 Dataset
+- `alpaca2000.csv`
+  - Top 30% by IFD score (600 of 2,000 samples)
+  - Criterion: `PPL(y | x) / PPL(y)` (x: instruction + input, y: output)
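The selection criterion can be sketched in a few lines. This is a minimal illustration, not the script used for this repository: it assumes the per-token negative log-likelihoods of the output `y` (with and without the prompt `x`) have already been obtained from the model, and the function names `perplexity`, `ifd_score`, and `top_fraction_indices` are hypothetical.

```python
import math


def perplexity(token_nlls):
    """Perplexity from per-token negative log-likelihoods (natural log)."""
    return math.exp(sum(token_nlls) / len(token_nlls))


def ifd_score(nll_given_x, nll_alone):
    """Ratio PPL(y | x) / PPL(y) for a single sample.

    nll_given_x: NLLs of the output tokens when the prompt x is in context.
    nll_alone:   NLLs of the same output tokens without the prompt.
    """
    return perplexity(nll_given_x) / perplexity(nll_alone)


def top_fraction_indices(scores, fraction=0.30):
    """Indices of the highest-scoring samples, highest score first."""
    k = round(len(scores) * fraction)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]
```

With 2,000 scored samples and `fraction=0.30`, `top_fraction_indices` returns 600 indices, matching the subset size stated above. A ratio near 1 means the prompt barely changes how well the model predicts the output.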
+## ⚙️ Training Config
+- Model: `meta-llama/Llama-3.2-1B`
+- Precision: `bf16` or `float32`
+- Epochs: 3
+- Max length: 2048
+- Output: `output/llama_ifd`
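One way to keep these settings in a single place is a small config object. The sketch below is an assumption: the README does not name the training framework, so the class `FinetuneConfig` and its `trainer_kwargs` method are hypothetical. The dictionary keys follow the parameter names of Hugging Face `TrainingArguments` (`output_dir`, `num_train_epochs`, `bf16`), while the maximum length would be applied at tokenization time.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FinetuneConfig:
    """Hypothetical container for the settings listed in this README."""

    model_name: str = "meta-llama/Llama-3.2-1B"
    precision: str = "bf16"  # the README allows "bf16" or "float32"
    epochs: int = 3
    max_length: int = 2048  # applied when tokenizing, not by the trainer
    output_dir: str = "output/llama_ifd"

    def __post_init__(self):
        if self.precision not in ("bf16", "float32"):
            raise ValueError(f"unsupported precision: {self.precision}")

    def trainer_kwargs(self):
        """Keyword arguments named after Hugging Face TrainingArguments fields."""
        return {
            "output_dir": self.output_dir,
            "num_train_epochs": self.epochs,
            "bf16": self.precision == "bf16",
        }
```

Switching precision is then a one-line change, for example `FinetuneConfig(precision="float32")`, and any unsupported value fails early instead of partway through a run.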
+## 🧪 Goal
+Establish a performance baseline for a model trained on the high-IFD samples before splitting them by instruction entropy.