nickoo004 committed
Commit 3cba60b · verified · 1 Parent(s): 3c57609

Update README.md

Files changed (1): README.md (+15 -7)
README.md CHANGED
@@ -57,12 +57,13 @@ Final answer in Uzbek ✅
 | **Base model** | Qwen/Qwen2.5-1.5B-Instruct |
 | **Training data** | [QueryShield Multilingual Dataset](https://huggingface.co/datasets/nickoo004/queryshield-multilingual) |
 | **Training rows** | 19,530 |
-| **Epochs** | 3 |
+| **Epochs** | 3 |
 | **Train loss** | 0.88 → 0.47 |
-| **Eval loss** | 0.967 |
+| **Eval loss** | 0.967 (best checkpoint) |
 | **GPU** | NVIDIA RTX 3090 24GB |
 | **Training time** | ~3.7 hours |
 | **Parameters** | 1.5B total / 147M trainable (8.7%) |
+| **Live demo** | [▶ Kaggle Notebook](https://www.kaggle.com/code/nursultankoshekbaev/queryshield-1-5b) |
 
 ---
 
@@ -139,7 +140,7 @@ result = optimize_prompt(
 )
 print(result)
 
-# Example 2 — Cross-lingual: Kazakh → Uzbek
+# Example 2 — Cross-lingual: Kazakh -> Uzbek
 result = optimize_prompt(
     user_question="менің фермамда топырақ сапасы нашар, не істеуім керек?",
     input_language="Kazakh",
@@ -151,6 +152,14 @@ print(result)
 
 ---
 
+## Live Demo
+
+**[▶ Run on Kaggle](https://www.kaggle.com/code/nursultankoshekbaev/queryshield-1-5b)** — no setup needed, free GPU included.
+
+Tests all 7 cases: English, Uzbek, Russian, Kazakh, Karakalpak + 2 cross-lingual pairs.
+
+---
+
 ## Supported Domains (30 total)
 
 | Domain | Expert Role |
@@ -181,11 +190,10 @@ print(result)
 - **19,530 rows** across 5 languages and 30 domains
 - Generated by DeepSeek, Gemini, and Qwen2.5-14B
 
-
 ### Loss Curve
 ```
-Epoch 1.0 → train: 1.023 | eval: 0.997
-Epoch 2.5 → train: 0.731 | eval: 0.967
+Epoch 1.0 -> train: 1.023 | eval: 0.997
+Epoch 2.5 -> train: 0.731 | eval: 0.967 <- best checkpoint
 ```
 
 ---
@@ -216,4 +224,4 @@ Epoch 2.5 → train: 0.731 | eval: 0.967
 ## License
 
 This model is released under the **MIT License**.
-Base model license: [Qwen License](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/blob/main/LICENSE)
+Base model license: [Qwen License](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct/blob/main/LICENSE)
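
The README examples call an `optimize_prompt` helper whose body is not part of this diff. A minimal sketch of how such a wrapper could be structured, as a dry run that only builds the chat-format request: the system-prompt wording and the `generate` hook are illustrative assumptions, not taken from this commit, and the real helper would pass the messages to the fine-tuned model.

```python
from typing import Callable, Optional


def build_messages(user_question: str, input_language: str,
                   output_language: Optional[str] = None) -> list:
    """Assemble a chat-format request for a prompt-optimization model.

    The system-prompt wording here is a placeholder; the instruction
    actually used to fine-tune QueryShield is not shown in this commit.
    """
    target = output_language or input_language
    system = (
        "You are a prompt optimizer. Rewrite the user's question "
        f"(asked in {input_language}) as an expert-framed prompt and "
        f"give the final answer in {target}."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_question},
    ]


def optimize_prompt(user_question: str, input_language: str,
                    output_language: Optional[str] = None,
                    generate: Optional[Callable] = None):
    """Dry-run by default: returns the messages that would be sent.

    Pass a `generate` callable (e.g. a transformers chat pipeline
    wrapping the QueryShield checkpoint) to actually run the model.
    """
    messages = build_messages(user_question, input_language, output_language)
    if generate is None:
        return messages
    return generate(messages)


# Cross-lingual request mirroring the README's Example 2 (Kazakh -> Uzbek).
msgs = optimize_prompt(
    user_question="менің фермамда топырақ сапасы нашар, не істеуім керек?",
    input_language="Kazakh",
    output_language="Uzbek",
)
print(msgs[0]["content"])
```

Keeping message construction separate from generation makes the request format testable without downloading the 1.5B checkpoint.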