Update README.md
README.md
CHANGED
@@ -57,12 +57,13 @@ Final answer in Uzbek ✅
 | **Base model** | Qwen/Qwen2.5-1.5B-Instruct |
 | **Training data** | [QueryShield Multilingual Dataset](https://huggingface.co/datasets/nickoo004/queryshield-multilingual) |
 | **Training rows** | 19,530 |
-| **Epochs** | 3
+| **Epochs** | 3 |
 | **Train loss** | 0.88 → 0.47 |
-| **Eval loss** | 0.967 |
+| **Eval loss** | 0.967 (best checkpoint) |
 | **GPU** | NVIDIA RTX 3090 24GB |
 | **Training time** | ~3.7 hours |
 | **Parameters** | 1.5B total / 147M trainable (8.7%) |
+| **Live demo** | [▶ Kaggle Notebook](https://www.kaggle.com/code/nursultankoshekbaev/queryshield-1-5b) |
 
 ---
 
@@ -139,7 +140,7 @@ result = optimize_prompt(
 )
 print(result)
 
-# Example 2 — Cross-lingual: Kazakh
+# Example 2 — Cross-lingual: Kazakh -> Uzbek
 result = optimize_prompt(
     user_question="менің фермамда топырақ сапасы нашар, не істеуім керек?",
     input_language="Kazakh",
@@ -151,6 +152,14 @@ print(result)
 
 ---
 
+## Live Demo
+
+**[▶ Run on Kaggle](https://www.kaggle.com/code/nursultankoshekbaev/queryshield-1-5b)** — no setup needed, free GPU included.
+
+Tests all 7 cases: English, Uzbek, Russian, Kazakh, Karakalpak + 2 cross-lingual pairs.
+
+---
+
 ## Supported Domains (30 total)
 
 | Domain | Expert Role |
@@ -181,11 +190,10 @@ print(result)
 - **19,530 rows** across 5 languages and 30 domains
 - Generated by DeepSeek, Gemini, and Qwen2.5-14B
 
-
 ### Loss Curve
 ```
-Epoch 1.0
-Epoch 2.5
+Epoch 1.0 -> train: 1.023 | eval: 0.997
+Epoch 2.5 -> train: 0.731 | eval: 0.967 <- best checkpoint
 ```
 
 ---
@@ -216,4 +224,4 @@ Epoch 2.5 → train: 0.731 | eval: 0.967
 ## License
 
 This model is released under the **MIT License**.
-Base model license: [Qwen License](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/blob/main/LICENSE)
+Base model license: [Qwen License](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/blob/main/LICENSE)
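
The updated loss curve marks the epoch 2.5 row as the best checkpoint but does not say how it was chosen. A minimal Python sketch, assuming "best" means the lowest eval loss among the logged points; `EvalPoint` and `best_checkpoint` are hypothetical names for illustration, not part of the model's training code:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalPoint:
    """One logged evaluation point (hypothetical structure)."""
    epoch: float
    train_loss: float
    eval_loss: float


# The two points reported in the README's loss curve.
history = [
    EvalPoint(epoch=1.0, train_loss=1.023, eval_loss=0.997),
    EvalPoint(epoch=2.5, train_loss=0.731, eval_loss=0.967),
]


def best_checkpoint(points):
    """Return the logged point with the lowest eval loss (assumed criterion)."""
    return min(points, key=lambda p: p.eval_loss)


best = best_checkpoint(history)
print(f"best checkpoint: epoch {best.epoch}, eval loss {best.eval_loss}")
```

Under this assumption the sketch picks the epoch 2.5 point, which matches the 0.967 eval loss listed in the summary table.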