asdf98 committed on
Commit
8ecbd0a
·
verified ·
1 Parent(s): 9531efa

Upload EthicalHacking_Qwen3-4B_Ultimate_Colab.ipynb

EthicalHacking_Qwen3-4B_Ultimate_Colab.ipynb CHANGED
@@ -4,27 +4,14 @@
4
  "cell_type": "markdown",
5
  "metadata": {},
6
  "source": [
7
- "# πŸ” Ultimate Ethical Hacking LLM – Colab Free Tier (T4)\n",
8
  "\n",
9
  "**πŸ₯‡ Model:** [Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) via Unsloth 4-bit \n",
10
- "**πŸ† Why this model?** Highest coding/reasoning scores among sub-10B models with confirmed Unsloth support (LiveCodeBench 35.1, MMLU-Pro 69.6). Only **3.3 GB** in 4-bit β€” massive VRAM headroom on T4. \n",
11
- "**πŸ“Š Datasets:** [Fenrir v2.1](https://huggingface.co/datasets/AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1) + [Trendyol Cybersecurity](https://huggingface.co/datasets/Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset) β€” 153K+ instruction pairs \n",
12
  "**⚑ Framework:** Unsloth + TRL SFTTrainer β€” 2Γ— faster, 70% less VRAM \n",
13
  "\n",
14
- "> ⚠️ **Disclaimer:** This trains on **defensive cybersecurity** datasets only (pentesting education, threat analysis, CTF write-ups, incident response). Intended for **ethical hacking education and security research**.\n",
15
- "\n",
16
- "---\n",
17
- "\n",
18
- "## πŸš€ Speed Optimizations Applied (vs v1)\n",
19
- "\n",
20
- "| Setting | v1 (slow) | v2 (this notebook) | Why |\n",
21
- "|---------|-----------|-------------------|-----|\n",
22
- "| Dataset size | 153K rows | **50K rows** (sampled) | LoRA converges fast; 50K is plenty |\n",
23
- "| Batch size | 2 | **4** | You have 11GB free VRAM! |\n",
24
- "| Grad accum | 4 | **2** | Effective batch still = 8 |\n",
25
- "| Packing | False | **True** | 2-3Γ— throughput boost |\n",
26
- "| Max steps | Full epoch (19K) | **4,000** | Loss plateaus ~0.70 by step 300 |\n",
27
- "| **Est. time** | ~45 hrs | **~3-4 hrs** | Same quality, massively faster |\n",
28
  "\n",
29
  "---\n",
30
  "\n",
@@ -32,13 +19,12 @@
32
  "\n",
33
  "| Setting | Value | Why |\n",
34
  "|---------|-------|-----|\n",
35
- "| `MAX_SEQ_LENGTH` | 4096 | Qwen3-4B has huge headroom on T4 |\n",
36
- "| `LORA_R` | 64 | Can afford higher rank thanks to small base model |\n",
37
- "| `BATCH_SIZE` | 4 | You have 11GB free after base model loads |\n",
38
  "| `GRAD_ACCUM` | 2 | Effective batch = 8 |\n",
39
- "| `PACKING` | True | 2-3Γ— speedup for short chat examples |\n",
40
  "| `optim` | `adamw_8bit` | Massive VRAM saver |\n",
41
- "| `dtype` | fp16 | T4 has no bf16 |\n",
42
  "\n",
43
  "If you still hit OOM β†’ lower `MAX_SEQ_LENGTH` to 3072 or set `use_rslora=True`."
44
  ]
@@ -47,9 +33,7 @@
47
  "cell_type": "markdown",
48
  "metadata": {},
49
  "source": [
50
- "## 1️⃣ Install Dependencies\n",
51
- "\n",
52
- "Unsloth + TRL + Datasets. Takes ~3–5 min on Colab."
53
  ]
54
  },
55
  {
@@ -66,12 +50,7 @@
66
  "cell_type": "markdown",
67
  "metadata": {},
68
  "source": [
69
- "## 2️⃣ (Optional) Login to HuggingFace Hub\n",
70
- "\n",
71
- "Needed if you want to **push the fine-tuned model** back to your HF account.\n",
72
- "\n",
73
- "- Get token: [hf.co/settings/tokens](https://huggingface.co/settings/tokens) \n",
74
- "- Create a model repo first (e.g. `your-username/cyber-qwen3-4b-lora`)"
75
  ]
76
  },
77
  {
@@ -88,12 +67,7 @@
88
  "cell_type": "markdown",
89
  "metadata": {},
90
  "source": [
91
- "## 3️⃣ Load Qwen3-4B-Instruct-2507 in 4-bit via Unsloth\n",
92
- "\n",
93
- "This is the **best small model for coding & reasoning** as of May 2026.\n",
94
- "- Already **instruct-tuned** β€” your cybersecurity LoRA builds on solid foundations.\n",
95
- "- **Thinking toggle** (`enable_thinking=True/False`) for deep chain-of-thought exploit analysis.\n",
96
- "- Only ~3.3 GB quantized β†’ leaves **~12 GB** for training on a T4."
97
  ]
98
  },
99
  {
@@ -106,26 +80,25 @@
106
  "import torch\n",
107
  "\n",
108
  "# ==================== T4-COLAB HYPERPARAMETERS ====================\n",
109
- "MAX_SEQ_LENGTH = 4096 # Qwen3-4B headroom on T4 is HUGE\n",
110
- "LORA_R = 64 # higher rank = more capacity for exploit patterns\n",
111
- "LORA_ALPHA = 64 # alpha = r is standard\n",
112
- "BATCH_SIZE = 4 # ← INCREASED: you have 11GB free VRAM!\n",
113
- "GRAD_ACCUM = 2 # ← REDUCED: effective batch still = 8\n",
114
- "LEARNING_RATE = 2e-4 # conservative LoRA LR\n",
115
- "NUM_EPOCHS = 1 # we'll cap with max_steps instead\n",
116
- "MAX_STEPS = 4000 # ← NEW: cap steps for speed (loss plateaus early)\n",
117
- "WARMUP_STEPS = 200 # ← INCREASED: more warmup for stability\n",
118
- "LOGGING_STEPS = 50 # ← INCREASED: less log spam\n",
119
- "SAVE_STEPS = 500 # ← save less often for speed\n",
120
- "PACKING = True # ← NEW: massive throughput boost!\n",
121
- "SAMPLE_SIZE = 50000 # ← NEW: subsample dataset for 3Γ— speedup\n",
122
- "HUB_MODEL_ID = \"your-username/cyber-qwen3-4b-lora\" # ← change before pushing\n",
123
  "# ==================================================================\n",
124
  "\n",
125
  "model, tokenizer = FastLanguageModel.from_pretrained(\n",
126
  " model_name=\"unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit\",\n",
127
  " max_seq_length=MAX_SEQ_LENGTH,\n",
128
- " dtype=None, # auto-detect (fp16 on T4)\n",
129
  " load_in_4bit=True,\n",
130
  ")\n",
131
  "\n",
@@ -135,29 +108,39 @@
135
  " target_modules=[\"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n",
136
  " \"gate_proj\", \"up_proj\", \"down_proj\"],\n",
137
  " lora_alpha=LORA_ALPHA,\n",
138
- " lora_dropout=0, # 0 = fastest; data is large enough\n",
139
  " bias=\"none\",\n",
140
- " use_gradient_checkpointing=\"unsloth\", # ~30% VRAM reduction\n",
141
  " random_state=3407,\n",
142
- " use_rslora=False, # set True for even smaller VRAM footprint\n",
143
  " loftq_config=None,\n",
144
  ")\n",
145
  "\n",
146
  "trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
147
  "total = sum(p.numel() for p in model.parameters())\n",
148
- "print(f\"βœ… Qwen3-4B loaded. Trainable params: {trainable:,} / {total:,} ({100*trainable/total:.2f}%)\")\n",
149
- "print(f\"πŸ“Š Estimated VRAM used by base model: ~3.3 GB (4-bit)\")\n",
150
- "print(f\"πŸš€ Free VRAM for training: ~{15.64 - 4.12:.1f} GB (on T4 16GB)\")"
151
  ]
152
  },
153
  {
154
  "cell_type": "markdown",
155
  "metadata": {},
156
  "source": [
157
- "## 4️⃣ Load, Audit, Subsample & Merge Cybersecurity Datasets\n",
 
 
 
 
 
 
 
 
 
 
 
 
158
  "\n",
159
- "We load **two SOTA defensive-cybersecurity datasets**, audit them, **subsample to 50K rows** for speed,\n",
160
- "and convert to TRL `messages` format."
161
  ]
162
  },
163
  {
@@ -167,59 +150,162 @@
167
  "outputs": [],
168
  "source": [
169
  "from datasets import load_dataset, concatenate_datasets\n",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
170
  "import random\n",
171
  "\n",
172
- "# ---------- Dataset 1: Fenrir v2.1 (99,870 rows) ----------\n",
173
- "print(\"πŸ“₯ Loading Fenrir v2.1...\")\n",
174
- "ds1 = load_dataset(\"AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1\", split=\"train\")\n",
175
- "print(f\" Rows: {len(ds1)} | Columns: {ds1.column_names}\")\n",
176
- "\n",
177
- "# Quick audit: print 2 random samples\n",
178
- "for i in random.sample(range(len(ds1)), 2):\n",
179
- " print(f\"\\n--- Sample {i} ---\")\n",
180
- " print(f\"SYSTEM: {ds1[i]['system'][:120]}...\")\n",
181
- " print(f\"USER: {ds1[i]['user'][:120]}...\")\n",
182
- " print(f\"ASSIST: {ds1[i]['assistant'][:120]}...\")\n",
183
- "\n",
184
- "def fenrir_to_messages(example):\n",
185
- " return {\n",
186
- " \"messages\": [\n",
187
- " {\"role\": \"system\", \"content\": example[\"system\"]},\n",
188
- " {\"role\": \"user\", \"content\": example[\"user\"]},\n",
189
- " {\"role\": \"assistant\", \"content\": example[\"assistant\"]},\n",
190
- " ]\n",
191
- " }\n",
192
- "\n",
193
- "ds1 = ds1.map(fenrir_to_messages, remove_columns=ds1.column_names, batched=False)\n",
194
- "print(f\"βœ… Fenrir converted to messages. Sample roles: {[m['role'] for m in ds1[0]['messages']]}\")\n",
195
- "\n",
196
- "# ---------- Dataset 2: Trendyol (53,202 rows) ----------\n",
197
- "print(\"\\nπŸ“₯ Loading Trendyol Cybersecurity...\")\n",
198
- "ds2 = load_dataset(\"Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset\", split=\"train\")\n",
199
- "print(f\" Rows: {len(ds2)} | Columns: {ds2.column_names}\")\n",
200
- "\n",
201
- "def trendyol_to_messages(example):\n",
202
- " return {\n",
203
- " \"messages\": [\n",
204
- " {\"role\": \"system\", \"content\": example[\"system\"]},\n",
205
- " {\"role\": \"user\", \"content\": example[\"user\"]},\n",
206
- " {\"role\": \"assistant\", \"content\": example[\"assistant\"]},\n",
207
- " ]\n",
208
- " }\n",
209
- "\n",
210
- "ds2 = ds2.map(trendyol_to_messages, remove_columns=ds2.column_names, batched=False)\n",
211
- "print(f\"βœ… Trendyol converted to messages. Sample roles: {[m['role'] for m in ds2[0]['messages']]}\")\n",
212
- "\n",
213
- "# ---------- Merge & Subsample ----------\n",
214
- "train_dataset = concatenate_datasets([ds1, ds2])\n",
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
215
  "print(f\"\\nπŸ“Š COMBINED DATASET: {len(train_dataset)} rows\")\n",
216
  "\n",
217
- "# Subsample for speed (50K is MORE than enough for LoRA domain tuning)\n",
 
 
 
 
 
 
218
  "if len(train_dataset) > SAMPLE_SIZE:\n",
219
  " train_dataset = train_dataset.shuffle(seed=3407).select(range(SAMPLE_SIZE))\n",
220
- " print(f\"πŸš€ SUBSAMPLED to {len(train_dataset)} rows for fast training\")\n",
221
- "else:\n",
222
- " print(f\"βœ… Dataset is {len(train_dataset)} rows, no subsampling needed\")\n",
223
  "\n",
224
  "print(f\" Effective batch size: {BATCH_SIZE * GRAD_ACCUM}\")\n",
225
  "print(f\" Steps per epoch: ~{len(train_dataset) // (BATCH_SIZE * GRAD_ACCUM)}\")\n",
@@ -230,11 +316,9 @@
230
  "cell_type": "markdown",
231
  "metadata": {},
232
  "source": [
233
- "## 5️⃣ Pre-process Dataset to Text (Avoid Unsloth formatting_func issues)\n",
234
  "\n",
235
- "**⚠️ CRITICAL:** Unsloth's SFTTrainer has issues with `formatting_func`.\n",
236
- "The **cleanest fix** is to pre-convert `messages` β†’ `text` using `dataset.map(batched=True)`,\n",
237
- "then pass `dataset_text_field=\"text\"` to SFTTrainer. No `formatting_func` needed!"
238
  ]
239
  },
240
  {
@@ -243,28 +327,23 @@
243
  "metadata": {},
244
  "outputs": [],
245
  "source": [
246
- "# ========== PRE-PROCESS: messages β†’ text with chat template ==========\n",
247
  "def convert_messages_to_text(examples):\n",
248
- " \"\"\"\n",
249
- " Convert batched messages to formatted text strings using tokenizer chat template.\n",
250
- " Called with batched=True so examples[\"messages\"] is a list of conversations.\n",
251
- " \"\"\"\n",
252
  " texts = []\n",
253
  " for msgs in examples[\"messages\"]:\n",
254
  " text = tokenizer.apply_chat_template(\n",
255
  " msgs,\n",
256
- " tokenize=False, # return text string\n",
257
- " add_generation_prompt=False, # don't add assistant prompt at end\n",
258
  " )\n",
259
  " texts.append(text)\n",
260
  " return {\"text\": texts}\n",
261
  "\n",
262
- "print(\"πŸ”„ Converting messages to text with chat template (batched)...\")\n",
263
  "train_dataset = train_dataset.map(\n",
264
  " convert_messages_to_text,\n",
265
- " batched=True, # process multiple examples at once\n",
266
- " remove_columns=[\"messages\"], # drop old column, keep only \"text\"\n",
267
- " batch_size=100, # adjust based on your RAM\n",
268
  ")\n",
269
  "\n",
270
  "print(f\"βœ… Dataset pre-processed. Columns: {train_dataset.column_names}\")\n",
@@ -276,7 +355,7 @@
276
  "cell_type": "markdown",
277
  "metadata": {},
278
  "source": [
279
- "## 6️⃣ Configure SFT Trainer (with Packing + Speed Optimizations)"
280
  ]
281
  },
282
  {
@@ -292,54 +371,40 @@
292
  " model=model,\n",
293
  " tokenizer=tokenizer,\n",
294
  " train_dataset=train_dataset,\n",
295
- " dataset_text_field=\"text\", # ← standard text format, no formatting_func needed!\n",
296
  " max_seq_length=MAX_SEQ_LENGTH,\n",
297
- " dataset_num_proc=2, # 2 workers for tokenization\n",
298
- " packing=PACKING, # ← MASSIVE speedup for short chat examples\n",
299
  " args=TrainingArguments(\n",
300
  " per_device_train_batch_size=BATCH_SIZE,\n",
301
  " gradient_accumulation_steps=GRAD_ACCUM,\n",
302
  " warmup_steps=WARMUP_STEPS,\n",
303
- " max_steps=MAX_STEPS, # ← cap steps instead of full epoch\n",
304
- " # num_train_epochs=NUM_EPOCHS, # ← commented out: use max_steps instead\n",
305
  " learning_rate=LEARNING_RATE,\n",
306
- " fp16=True, # T4 = fp16 only (no bf16)\n",
307
  " logging_steps=LOGGING_STEPS,\n",
308
- " optim=\"adamw_8bit\", # huge VRAM saver\n",
309
  " weight_decay=0.01,\n",
310
  " lr_scheduler_type=\"linear\",\n",
311
  " seed=3407,\n",
312
  " output_dir=\"./outputs\",\n",
313
  " save_strategy=\"steps\",\n",
314
  " save_steps=SAVE_STEPS,\n",
315
- " save_total_limit=2, # keep only last 2 checkpoints\n",
316
- " report_to=\"none\", # change to \"tensorboard\" / \"wandb\" if desired\n",
317
- " # push_to_hub=True, # ← uncomment to auto-push during training\n",
318
- " # hub_model_id=HUB_MODEL_ID,\n",
319
- " # hub_strategy=\"every_save\",\n",
320
  " ),\n",
321
  ")\n",
322
  "\n",
323
- "print(f\"βœ… Trainer ready. Total steps: {MAX_STEPS}\")\n",
324
  "print(f\" Effective batch size: {BATCH_SIZE * GRAD_ACCUM}\")\n",
325
- "print(f\" Packing enabled: {PACKING}\")\n",
326
- "print(f\" Dataset samples: {len(train_dataset)}\")\n",
327
- "print(f\" Est. time at ~0.3 it/s: ~{MAX_STEPS * 3 / 3600:.1f} hours\")"
328
  ]
329
  },
330
  {
331
  "cell_type": "markdown",
332
  "metadata": {},
333
  "source": [
334
- "## 7️⃣ Train πŸš€\n",
335
- "\n",
336
- "Expected time on **Google Colab Free Tier (T4)**: **~3–4 hours** for 4,000 steps.\n",
337
- "\n",
338
- "If you see `CUDA out of memory`:\n",
339
- "1. Lower `MAX_SEQ_LENGTH` to 3072 or 2048\n",
340
- "2. Set `BATCH_SIZE = 2`\n",
341
- "3. Set `PACKING = False`\n",
342
- "4. Set `use_rslora=True` in the LoRA config (cell 3)"
343
  ]
344
  },
345
  {
@@ -348,9 +413,8 @@
348
  "metadata": {},
349
  "outputs": [],
350
  "source": [
351
- "# Optional: memory stats before training\n",
352
  "if torch.cuda.is_available():\n",
353
- " print(f\"VRAM before train: {torch.cuda.memory_allocated()/1e9:.2f} GB / {torch.cuda.get_device_properties(0).total_memory/1e9:.2f} GB\")\n",
354
  "\n",
355
  "trainer_stats = trainer.train()\n",
356
  "\n",
@@ -365,13 +429,7 @@
365
  "cell_type": "markdown",
366
  "metadata": {},
367
  "source": [
368
- "## 8️⃣ Save & Push to HuggingFace Hub\n",
369
- "\n",
370
- "We save:\n",
371
- "1. **LoRA adapter only** (~50–100 MB) β€” fast, easy to share.\n",
372
- "2. **Merged 16-bit model** (~8 GB) β€” ready for inference without Unsloth loaded.\n",
373
- "\n",
374
- "Pick whichever fits your use-case."
375
  ]
376
  },
377
  {
@@ -380,39 +438,33 @@
380
  "metadata": {},
381
  "outputs": [],
382
  "source": [
383
- "# 8A) Save LoRA adapter (tiny, fast)\n",
384
- "model.save_pretrained(\"./cyber-lora-adapter\")\n",
385
- "tokenizer.save_pretrained(\"./cyber-lora-adapter\")\n",
386
- "print(\"βœ… LoRA adapter saved to ./cyber-lora-adapter\")\n",
387
- "\n",
388
- "# 8B) Optional: merge & save full 16-bit model\n",
389
- "# ⚠️ Needs ~8 GB RAM. On Colab it may swap to CPU; still works but slower.\n",
390
- "print(\"\\nπŸ”„ Merging LoRA into base model (this may take a minute)...\")\n",
391
  "merged_model = model.merge_and_unload()\n",
392
- "merged_model.save_pretrained(\"./cyber-qwen3-4b-merged\")\n",
393
- "tokenizer.save_pretrained(\"./cyber-qwen3-4b-merged\")\n",
394
- "print(\"βœ… Merged 16-bit model saved to ./cyber-qwen3-4b-merged\")\n",
395
  "\n",
396
- "# 8C) Push LoRA adapter to HF Hub (uncomment if you logged in at step 2)\n",
397
  "# model.push_to_hub(HUB_MODEL_ID)\n",
398
- "# tokenizer.push_to_hub(HUB_MODEL_ID)\n",
399
- "# print(f\"πŸš€ Pushed to https://huggingface.co/{HUB_MODEL_ID}\")"
400
  ]
401
  },
402
  {
403
  "cell_type": "markdown",
404
  "metadata": {},
405
  "source": [
406
- "## 9️⃣ Inference Demo – Qwen3 Thinking Toggle\n",
407
- "\n",
408
- "Qwen3 has a unique **thinking mode** switch. Use it for different tasks:\n",
409
  "\n",
410
  "| Mode | Use Case | Speed |\n",
411
  "|------|----------|-------|\n",
412
- "| `enable_thinking=True` | Deep exploit analysis, CTF walkthroughs, reverse-engineering | Slower, more thorough |\n",
413
- "| `enable_thinking=False` | Quick lookups, syntax checks, tool commands | Fast, direct |\n",
414
- "\n",
415
- "Below we test both modes on a responsible pentesting question."
416
  ]
417
  },
418
  {
@@ -421,101 +473,35 @@
421
  "metadata": {},
422
  "outputs": [],
423
  "source": [
424
- "FastLanguageModel.for_inference(model) # enable 2Γ— faster inference\n",
425
  "\n",
426
- "test_prompt = (\n",
427
- " \"How would you perform a responsible penetration test on a web application? \"\n",
428
- " \"List the phases, key tools, and how to document findings for the development team.\"\n",
429
- ")\n",
430
  "\n",
431
  "messages = [\n",
432
- " {\"role\": \"system\", \"content\": \"You are a cybersecurity expert. Explain concepts clearly and ethically.\"},\n",
433
  " {\"role\": \"user\", \"content\": test_prompt},\n",
434
  "]\n",
435
  "\n",
436
  "for think_mode in [True, False]:\n",
437
- " label = \"🧠 THINKING=ON (deep analysis)\" if think_mode else \"⚑ THINKING=OFF (fast direct)\"\n",
438
  " print(f\"\\n{'='*60}\")\n",
439
- " print(f\"{label}\")\n",
440
  " print(f\"{'='*60}\")\n",
441
  "\n",
442
  " inputs = tokenizer.apply_chat_template(\n",
443
- " messages,\n",
444
- " tokenize=True,\n",
445
- " add_generation_prompt=True,\n",
446
- " enable_thinking=think_mode,\n",
447
- " return_tensors=\"pt\",\n",
448
  " ).to(model.device)\n",
449
  "\n",
450
  " outputs = model.generate(\n",
451
- " input_ids=inputs,\n",
452
- " max_new_tokens=512,\n",
453
- " temperature=0.7,\n",
454
- " top_p=0.9,\n",
455
- " do_sample=True,\n",
456
  " pad_token_id=tokenizer.pad_token_id,\n",
457
  " eos_token_id=tokenizer.eos_token_id,\n",
458
  " )\n",
459
- "\n",
460
- " response = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
461
- " # Extract only the assistant's reply (after the last user turn)\n",
462
- " reply = response.split(\"user\")[-1].split(\"assistant\")[-1].strip()\n",
463
- " print(reply[:800] + (\"...\" if len(reply) > 800 else \"\"))\n",
464
- " print(f\"\\n[Tokens generated: {len(outputs[0]) - len(inputs[0])}]\")"
465
- ]
466
- },
467
- {
468
- "cell_type": "markdown",
469
- "metadata": {},
470
- "source": [
471
- "## πŸ”Ÿ (Bonus) Quick Benchmark – CyberMetric Sample\n",
472
- "\n",
473
- "Test your model's cybersecurity knowledge with a sample from the [CyberMetric benchmark](https://huggingface.co/datasets/cybermetric/cybermetric-500).\n",
474
- "\n",
475
- "This is **not a full evaluation** β€” just a sanity check that your fine-tune improved domain knowledge."
476
- ]
477
- },
478
- {
479
- "cell_type": "code",
480
- "execution_count": null,
481
- "metadata": {},
482
- "outputs": [],
483
- "source": [
484
- "# Sample CyberMetric-style question\n",
485
- "benchmark_q = (\n",
486
- " \"Which of the following is the MOST effective defense against SQL injection attacks?\\n\"\n",
487
- " \"A) Input validation only\\n\"\n",
488
- " \"B) Parameterized queries (prepared statements)\\n\"\n",
489
- " \"C) Escaping special characters\\n\"\n",
490
- " \"D) Client-side filtering\\n\"\n",
491
- " \"Answer with the letter only.\"\n",
492
- ")\n",
493
- "\n",
494
- "bench_msgs = [\n",
495
- " {\"role\": \"system\", \"content\": \"You are a cybersecurity expert. Answer accurately and concisely.\"},\n",
496
- " {\"role\": \"user\", \"content\": benchmark_q},\n",
497
- "]\n",
498
- "\n",
499
- "inputs = tokenizer.apply_chat_template(\n",
500
- " bench_msgs,\n",
501
- " tokenize=True,\n",
502
- " add_generation_prompt=True,\n",
503
- " enable_thinking=False, # fast direct answer\n",
504
- " return_tensors=\"pt\",\n",
505
- ").to(model.device)\n",
506
- "\n",
507
- "outputs = model.generate(\n",
508
- " input_ids=inputs,\n",
509
- " max_new_tokens=64,\n",
510
- " temperature=0.1, # low temp for factual answer\n",
511
- " do_sample=True,\n",
512
- " pad_token_id=tokenizer.pad_token_id,\n",
513
- " eos_token_id=tokenizer.eos_token_id,\n",
514
- ")\n",
515
- "\n",
516
- "answer = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
517
- "print(\"πŸ“Š Benchmark Answer:\")\n",
518
- "print(answer.split(\"assistant\")[-1].strip())"
519
  ]
520
  },
521
  {
@@ -523,29 +509,20 @@
523
  "metadata": {},
524
  "source": [
525
  "---\n",
526
- "## πŸ“š References & Links\n",
527
  "\n",
528
  "| Resource | Link |\n",
529
  "|----------|------|\n",
530
- "| **Model (Qwen3-4B-Instruct-2507)** | https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507 |\n",
531
- "| **Unsloth 4-bit version** | https://huggingface.co/unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit |\n",
532
- "| **Fenrir Dataset** | https://huggingface.co/datasets/AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1 |\n",
533
- "| **Trendyol Dataset** | https://huggingface.co/datasets/Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset |\n",
 
 
534
  "| **Unsloth Docs** | https://unsloth.ai/docs |\n",
535
- "| **TRL SFTTrainer** | https://huggingface.co/docs/trl/sft_trainer |\n",
536
- "| **CyberMetric Eval** | https://huggingface.co/datasets/cybermetric/cybermetric-500 |\n",
537
- "\n",
538
- "## πŸ”§ Troubleshooting\n",
539
- "\n",
540
- "| Problem | Solution |\n",
541
- "|---------|----------|\n",
542
- "| `CUDA out of memory` | Lower `MAX_SEQ_LENGTH` to 2048; set `BATCH_SIZE=2`; set `PACKING=False`; enable `use_rslora=True` |\n",
543
- "| Training very slow | Increase `BATCH_SIZE` to 4 if VRAM allows; enable `PACKING=True` |\n",
544
- "| Loss not decreasing | Try `LEARNING_RATE=5e-4` or train for 2 epochs |\n",
545
- "| Can't push to Hub | Run `login(token=...)` with a WRITE token |\n",
546
  "\n",
547
  "---\n",
548
- "*Built with ❀️ for the cybersecurity community. Use responsibly.*"
549
  ]
550
  }
551
  ],
@@ -561,5 +538,5 @@
561
  }
562
  },
563
  "nbformat": 4,
564
- "nbformat_minor": 4
565
  }
 
4
  "cell_type": "markdown",
5
  "metadata": {},
6
  "source": [
7
+ "# πŸ” Ultimate Ethical Hacking / General-Purpose LLM – Colab Free Tier (T4)\n",
8
  "\n",
9
  "**πŸ₯‡ Model:** [Qwen3-4B-Instruct-2507](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) via Unsloth 4-bit \n",
10
+ "**πŸ† Why this model?** Highest coding/reasoning scores among sub-10B models (LiveCodeBench 35.1, MMLU-Pro 69.6). Only **3.3 GB** in 4-bit. \n",
11
+ "**πŸ“Š Datasets:** Your choice β€” pick from cybersecurity, general chat, multilingual, coding, or mix them! \n",
12
  "**⚑ Framework:** Unsloth + TRL SFTTrainer β€” 2Γ— faster, 70% less VRAM \n",
13
  "\n",
14
+ "> ⚠️ **Disclaimer:** Default datasets include **defensive cybersecurity** content (pentesting education, threat analysis, IR). Pick general-purpose datasets for other domains.\n",
 
 
 
 
 
 
 
 
 
 
 
 
 
15
  "\n",
16
  "---\n",
17
  "\n",
 
19
  "\n",
20
  "| Setting | Value | Why |\n",
21
  "|---------|-------|-----|\n",
22
+ "| `MAX_SEQ_LENGTH` | 4096 | Huge headroom on T4 |\n",
23
+ "| `LORA_R` | 64 | Higher rank = more capacity |\n",
24
+ "| `BATCH_SIZE` | 4 | You have ~11GB free VRAM |\n",
25
  "| `GRAD_ACCUM` | 2 | Effective batch = 8 |\n",
26
+ "| `PACKING` | True | 2-3Γ— throughput boost |\n",
27
  "| `optim` | `adamw_8bit` | Massive VRAM saver |\n",
 
28
  "\n",
29
  "If you still hit OOM β†’ lower `MAX_SEQ_LENGTH` to 3072 or set `use_rslora=True`."
30
  ]
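The `GRAD_ACCUM` row above relies on a simple identity: effective batch = `BATCH_SIZE × GRAD_ACCUM`, because gradients from several micro-batches are accumulated before a single optimizer step, so VRAM scales with the micro-batch while the update behaves like a batch of 8. A minimal standalone sketch of the mechanism (toy model and data, not the notebook's trainer, which handles this internally via TRL):

```python
import torch

# Toy stand-ins purely to illustrate gradient accumulation.
model = torch.nn.Linear(16, 1)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
BATCH_SIZE, GRAD_ACCUM = 4, 2  # effective batch = 4 * 2 = 8

optimizer.zero_grad()
for step in range(8):
    x, y = torch.randn(BATCH_SIZE, 16), torch.randn(BATCH_SIZE, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    (loss / GRAD_ACCUM).backward()       # scale so accumulated grads average, not sum
    if (step + 1) % GRAD_ACCUM == 0:     # one optimizer step per GRAD_ACCUM micro-batches
        optimizer.step()
        optimizer.zero_grad()
```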
 
33
  "cell_type": "markdown",
34
  "metadata": {},
35
  "source": [
36
+ "## 1️⃣ Install Dependencies"
 
 
37
  ]
38
  },
39
  {
 
50
  "cell_type": "markdown",
51
  "metadata": {},
52
  "source": [
53
+ "## 2️⃣ (Optional) Login to HuggingFace Hub"
 
 
 
 
 
54
  ]
55
  },
56
  {
 
67
  "cell_type": "markdown",
68
  "metadata": {},
69
  "source": [
70
+ "## 3️⃣ Load Qwen3-4B-Instruct-2507 in 4-bit via Unsloth"
 
 
 
 
 
71
  ]
72
  },
73
  {
 
80
  "import torch\n",
81
  "\n",
82
  "# ==================== T4-COLAB HYPERPARAMETERS ====================\n",
83
+ "MAX_SEQ_LENGTH = 4096\n",
84
+ "LORA_R = 64\n",
85
+ "LORA_ALPHA = 64\n",
86
+ "BATCH_SIZE = 4\n",
87
+ "GRAD_ACCUM = 2\n",
88
+ "LEARNING_RATE = 2e-4\n",
89
+ "MAX_STEPS = 4000\n",
90
+ "WARMUP_STEPS = 200\n",
91
+ "LOGGING_STEPS = 50\n",
92
+ "SAVE_STEPS = 500\n",
93
+ "PACKING = True\n",
94
+ "SAMPLE_SIZE = 50000\n",
95
+ "HUB_MODEL_ID = \"your-username/cyber-qwen3-4b-lora\"\n",
 
96
  "# ==================================================================\n",
97
  "\n",
98
  "model, tokenizer = FastLanguageModel.from_pretrained(\n",
99
  " model_name=\"unsloth/Qwen3-4B-Instruct-2507-unsloth-bnb-4bit\",\n",
100
  " max_seq_length=MAX_SEQ_LENGTH,\n",
101
+ " dtype=None,\n",
102
  " load_in_4bit=True,\n",
103
  ")\n",
104
  "\n",
 
108
  " target_modules=[\"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\",\n",
109
  " \"gate_proj\", \"up_proj\", \"down_proj\"],\n",
110
  " lora_alpha=LORA_ALPHA,\n",
111
+ " lora_dropout=0,\n",
112
  " bias=\"none\",\n",
113
+ " use_gradient_checkpointing=\"unsloth\",\n",
114
  " random_state=3407,\n",
115
+ " use_rslora=False,\n",
116
  " loftq_config=None,\n",
117
  ")\n",
118
  "\n",
119
  "trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)\n",
120
  "total = sum(p.numel() for p in model.parameters())\n",
121
+ "print(f\"βœ… Qwen3-4B loaded. Trainable params: {trainable:,} / {total:,} ({100*trainable/total:.2f}%)\")"
 
 
122
  ]
123
  },
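For a rough cross-check of that printed trainable-parameter count: each adapted projection `W` of shape `(d_out, d_in)` gains LoRA factors `B (d_out × r)` and `A (r × d_in)`, i.e. `r × (d_in + d_out)` new weights. A back-of-envelope sketch; the layer count and dimensions below are illustrative assumptions, and the real values should be read from `model.config`:

```python
# Hypothetical Qwen3-4B-like dims for illustration only (check model.config).
r = 64
layers = 36
hidden = 2560
inter = 9728  # assumed MLP intermediate size

def lora_params(d_in, d_out):
    return r * (d_in + d_out)  # A: (r, d_in) plus B: (d_out, r)

per_layer = (
    4 * lora_params(hidden, hidden)   # q/k/v/o, GQA head dims simplified away
    + 2 * lora_params(hidden, inter)  # gate_proj, up_proj
    + lora_params(inter, hidden)      # down_proj
)
print(f"~{layers * per_layer / 1e6:.0f}M trainable LoRA params at r={r}")
```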
124
  {
125
  "cell_type": "markdown",
126
  "metadata": {},
127
  "source": [
128
+ "## 4️⃣ 🎯 CHOOSE YOUR DATASET(S)\n",
129
+ "\n",
130
+ "Uncomment **ONE** `DATASET_CHOICE` line to select your training data. You can also mix multiple datasets by setting a list.\n",
131
+ "\n",
132
+ "| Choice | Dataset | Size | Format | Best For |\n",
133
+ "|--------|---------|------|--------|----------|\n",
134
+ "| `\"cybersecurity\"` | Fenrir v2.1 + Trendyol | 153K β†’ 50K | system/user/assistant | **Ethical hacking, pentesting education** |\n",
135
+ "| `\"ultrachat\"` | UltraChat 200K (SFT) | 200K β†’ 50K | messages (user/assistant) | General conversation, chatbot tuning |\n",
136
+ "| `\"openhermes\"` | OpenHermes 2.5 | 1M+ β†’ 50K | conversations (human/gpt) | Reasoning, coding, instruction following |\n",
137
+ "| `\"sharegpt_en\"` | ShareGPT English | ~90K β†’ 50K | conversations (human/gpt) | Multi-turn dialogue, general QA |\n",
138
+ "| `\"sharegpt_de\"` | ShareGPT German | ~104K β†’ 50K | conversations (human/gpt) | German language fine-tuning |\n",
139
+ "| `\"sharegpt_hi\"` | ShareGPT Hindi (27B) | ~153K β†’ 50K | conversations (human/gpt) | Hindi language fine-tuning |\n",
140
+ "| `\"custom_mix\"` | Mix of your choice | β€” | varies | Combine datasets for hybrid tuning |\n",
141
  "\n",
142
+ "\n",
143
+ "**To mix datasets**, set `DATASET_CHOICE = \"custom_mix\"` and configure `CUSTOM_DATASETS` below."
144
  ]
145
  },
146
  {
 
150
  "outputs": [],
151
  "source": [
152
  "from datasets import load_dataset, concatenate_datasets\n",
153
+ "\n",
154
+ "# ═══════════════════════════════════════════════════════════════\n",
155
+ "# SELECT YOUR DATASET β€” UNCOMMENT ONE LINE\n",
156
+ "# ═══════════════════════════════════════════════════════════════\n",
157
+ "\n",
158
+ "# --- Option 1: Cybersecurity (default) ---\n",
159
+ "DATASET_CHOICE = \"cybersecurity\"\n",
160
+ "\n",
161
+ "# --- Option 2: General-purpose chat (UltraChat) ---\n",
162
+ "# DATASET_CHOICE = \"ultrachat\"\n",
163
+ "\n",
164
+ "# --- Option 3: Reasoning & coding (OpenHermes 2.5) ---\n",
165
+ "# DATASET_CHOICE = \"openhermes\"\n",
166
+ "\n",
167
+ "# --- Option 4: Multi-turn dialogue (ShareGPT English) ---\n",
168
+ "# DATASET_CHOICE = \"sharegpt_en\"\n",
169
+ "\n",
170
+ "# --- Option 5: German language (ShareGPT German) ---\n",
171
+ "# DATASET_CHOICE = \"sharegpt_de\"\n",
172
+ "\n",
173
+ "# --- Option 6: Hindi language (ShareGPT Hindi 27B) ---\n",
174
+ "# DATASET_CHOICE = \"sharegpt_hi\"\n",
175
+ "\n",
176
+ "# --- Option 7: Mix multiple datasets ---\n",
177
+ "# DATASET_CHOICE = \"custom_mix\"\n",
178
+ "\n",
179
+ "# ═══════════════════════════════════════════════════════════════\n",
180
+ "# CUSTOM MIX CONFIG (only used if DATASET_CHOICE = \"custom_mix\")\n",
181
+ "# ═══════════════════════════════════════════════════════════════\n",
182
+ "CUSTOM_DATASETS = [\n",
183
+ " # (\"dataset_name_or_id\", \"split\", rows_to_take, \"format_type\")\n",
184
+ " # format_type: \"messages\" | \"conversations\" | \"instruction\"\n",
185
+ " (\"AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1\", \"train\", 10000, \"messages\"),\n",
186
+ " (\"HuggingFaceH4/ultrachat_200k\", \"train_sft\", 20000, \"messages\"),\n",
187
+ " (\"teknium/OpenHermes-2.5\", \"train\", 20000, \"conversations\"),\n",
188
+ "]\n",
189
+ "\n",
190
+ "print(f\"🎯 DATASET_CHOICE = {DATASET_CHOICE}\")"
191
+ ]
192
+ },
193
+ {
194
+ "cell_type": "markdown",
195
+ "metadata": {},
196
+ "source": [
197
+ "## 5️⃣ Load, Convert & Pre-process Selected Dataset\n",
198
+ "\n",
199
+ "This cell auto-detects the dataset format and converts everything to standard `messages` β†’ `text` pipeline.\n",
200
+ "**No changes needed** β€” just run it after selecting your dataset above."
201
+ ]
202
+ },
203
+ {
204
+ "cell_type": "code",
205
+ "execution_count": null,
206
+ "metadata": {},
207
+ "outputs": [],
208
+ "source": [
209
  "import random\n",
210
  "\n",
211
+ "def _convert_fenrir(example):\n",
212
+ " return {\"messages\": [\n",
213
+ " {\"role\": \"system\", \"content\": example[\"system\"]},\n",
214
+ " {\"role\": \"user\", \"content\": example[\"user\"]},\n",
215
+ " {\"role\": \"assistant\", \"content\": example[\"assistant\"]},\n",
216
+ " ]}\n",
217
+ "\n",
218
+ "def _convert_trendyol(example):\n",
219
+ " return {\"messages\": [\n",
220
+ " {\"role\": \"system\", \"content\": example[\"system\"]},\n",
221
+ " {\"role\": \"user\", \"content\": example[\"user\"]},\n",
222
+ " {\"role\": \"assistant\", \"content\": example[\"assistant\"]},\n",
223
+ " ]}\n",
224
+ "\n",
225
+ "def _convert_ultrachat(example):\n",
226
+ " # Already in messages format with role/content\n",
227
+ " return {\"messages\": example[\"messages\"]}\n",
228
+ "\n",
229
+ "def _convert_conversations(example):\n",
230
+ " # OpenHermes / ShareGPT style: [{from: 'human'/'gpt', value: '...'}]\n",
231
+ " msgs = []\n",
232
+ " system_prompt = example.get(\"system_prompt\") or example.get(\"system\", \"\")\n",
233
+ " if system_prompt:\n",
234
+ " msgs.append({\"role\": \"system\", \"content\": system_prompt})\n",
235
+ " for turn in example[\"conversations\"]:\n",
236
+ " role = \"user\" if turn[\"from\"] in (\"human\", \"user\") else \"assistant\"\n",
237
+ " msgs.append({\"role\": role, \"content\": turn[\"value\"]})\n",
238
+ " return {\"messages\": msgs}\n",
239
+ "\n",
240
+ "# ===================== LOAD DATASET(S) =====================\n",
241
+ "all_datasets = []\n",
242
+ "\n",
243
+ "if DATASET_CHOICE == \"cybersecurity\":\n",
244
+ " print(\"πŸ“₯ Loading Fenrir v2.1...\")\n",
245
+ " ds1 = load_dataset(\"AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1\", split=\"train\")\n",
246
+ " ds1 = ds1.map(_convert_fenrir, remove_columns=ds1.column_names, batched=False)\n",
247
+ " all_datasets.append(ds1)\n",
248
+ "\n",
249
+ " print(\"πŸ“₯ Loading Trendyol Cybersecurity...\")\n",
250
+ " ds2 = load_dataset(\"Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset\", split=\"train\")\n",
251
+ " ds2 = ds2.map(_convert_trendyol, remove_columns=ds2.column_names, batched=False)\n",
252
+ " all_datasets.append(ds2)\n",
253
+ "\n",
254
+ "elif DATASET_CHOICE == \"ultrachat\":\n",
255
+ " print(\"πŸ“₯ Loading UltraChat 200K (train_sft split)...\")\n",
256
+ " ds = load_dataset(\"HuggingFaceH4/ultrachat_200k\", split=\"train_sft\")\n",
257
+ " ds = ds.map(_convert_ultrachat, remove_columns=ds.column_names, batched=False)\n",
258
+ " all_datasets.append(ds)\n",
259
+ "\n",
260
+ "elif DATASET_CHOICE == \"openhermes\":\n",
261
+ " print(\"πŸ“₯ Loading OpenHermes 2.5...\")\n",
262
+ " ds = load_dataset(\"teknium/OpenHermes-2.5\", split=\"train\")\n",
263
+ " ds = ds.map(_convert_conversations, remove_columns=ds.column_names, batched=False)\n",
264
+ " all_datasets.append(ds)\n",
265
+ "\n",
266
+ "elif DATASET_CHOICE.startswith(\"sharegpt_\"):\n",
267
+ " split_map = {\"sharegpt_en\": \"english\", \"sharegpt_de\": \"german_4b_translated\", \"sharegpt_hi\": \"hindi_27b_translated\"}\n",
268
+ " split_name = split_map[DATASET_CHOICE]\n",
269
+ " print(f\"πŸ“₯ Loading ShareGPT multilingual ({split_name})...\")\n",
270
+ " ds = load_dataset(\"deepmage121/ShareGPT_multilingual\", split=split_name)\n",
271
+ " ds = ds.map(_convert_conversations, remove_columns=ds.column_names, batched=False)\n",
272
+ " all_datasets.append(ds)\n",
273
+ "\n",
274
+ "elif DATASET_CHOICE == \"custom_mix\":\n",
275
+ " for ds_id, split, n_rows, fmt in CUSTOM_DATASETS:\n",
276
+ " print(f\"πŸ“₯ Loading {ds_id} ({split}, {n_rows} rows)...\")\n",
277
+ " ds = load_dataset(ds_id, split=split)\n",
278
+ " if n_rows and len(ds) > n_rows:\n",
279
+ " ds = ds.shuffle(seed=3407).select(range(n_rows))\n",
280
+ " if fmt == \"messages\":\n",
281
+ " ds = ds.map(_convert_ultrachat, remove_columns=ds.column_names, batched=False)\n",
282
+ " elif fmt == \"conversations\":\n",
283
+ " ds = ds.map(_convert_conversations, remove_columns=ds.column_names, batched=False)\n",
284
+ " else:\n",
285
+ " raise ValueError(f\"Unknown format: {fmt}\")\n",
286
+ " all_datasets.append(ds)\n",
287
+ "\n",
288
+ "else:\n",
289
+ " raise ValueError(f\"Unknown DATASET_CHOICE: {DATASET_CHOICE}\")\n",
290
+ "\n",
291
+ "# Merge all loaded datasets\n",
292
+ "if len(all_datasets) == 1:\n",
293
+ " train_dataset = all_datasets[0]\n",
294
+ "else:\n",
295
+ " train_dataset = concatenate_datasets(all_datasets)\n",
296
+ "\n",
297
  "print(f\"\\nπŸ“Š COMBINED DATASET: {len(train_dataset)} rows\")\n",
298
  "\n",
299
+ "# Show a random sample\n",
300
+ "sample = train_dataset[random.randint(0, len(train_dataset)-1)]\n",
301
+ "print(f\"\\n--- Random sample roles: {[m['role'] for m in sample['messages']]} ---\")\n",
302
+ "for m in sample[\"messages\"]:\n",
303
+ " print(f\" {m['role']}: {m['content'][:100]}...\")\n",
304
+ "\n",
305
+ "# Subsample for speed\n",
306
  "if len(train_dataset) > SAMPLE_SIZE:\n",
307
  " train_dataset = train_dataset.shuffle(seed=3407).select(range(SAMPLE_SIZE))\n",
308
+ " print(f\"\\nπŸš€ SUBSAMPLED to {len(train_dataset)} rows\")\n",
 
 
309
  "\n",
310
  "print(f\" Effective batch size: {BATCH_SIZE * GRAD_ACCUM}\")\n",
311
  "print(f\" Steps per epoch: ~{len(train_dataset) // (BATCH_SIZE * GRAD_ACCUM)}\")\n",
 
316
  "cell_type": "markdown",
317
  "metadata": {},
318
  "source": [
319
+ "## 6️⃣ Convert Messages β†’ Text (Chat Template)\n",
320
  "\n",
321
+ "Uses `tokenizer.apply_chat_template` to convert structured messages into training text. No `formatting_func` needed."
 
 
322
  ]
323
  },
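To make the cell above concrete: for Qwen-family tokenizers, `apply_chat_template` renders the `messages` list as ChatML-style text, and that string is what lands in the `text` column. A sketch using the notebook's already-loaded `tokenizer` (the exact special tokens are whatever the tokenizer defines, so the commented output is only indicative):

```python
demo = [
    {"role": "user", "content": "What is a CSRF token?"},
    {"role": "assistant", "content": "A per-request secret that proves the form came from your site."},
]
print(tokenizer.apply_chat_template(demo, tokenize=False, add_generation_prompt=False))
# Indicative shape (actual markers come from the tokenizer):
# <|im_start|>user
# What is a CSRF token?<|im_end|>
# <|im_start|>assistant
# A per-request secret that proves the form came from your site.<|im_end|>
```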
324
  {
 
327
  "metadata": {},
328
  "outputs": [],
329
  "source": [
 
330
  "def convert_messages_to_text(examples):\n",
 
 
 
 
331
  " texts = []\n",
332
  " for msgs in examples[\"messages\"]:\n",
333
  " text = tokenizer.apply_chat_template(\n",
334
  " msgs,\n",
335
+ " tokenize=False,\n",
336
+ " add_generation_prompt=False,\n",
337
  " )\n",
338
  " texts.append(text)\n",
339
  " return {\"text\": texts}\n",
340
  "\n",
341
+ "print(\"πŸ”„ Converting messages to text...\")\n",
342
  "train_dataset = train_dataset.map(\n",
343
  " convert_messages_to_text,\n",
344
+ " batched=True,\n",
345
+ " remove_columns=[\"messages\"],\n",
346
+ " batch_size=100,\n",
347
  ")\n",
348
  "\n",
349
  "print(f\"βœ… Dataset pre-processed. Columns: {train_dataset.column_names}\")\n",
 
355
  "cell_type": "markdown",
356
  "metadata": {},
357
  "source": [
358
+ "## 7️⃣ Configure SFT Trainer"
359
  ]
360
  },
361
  {
 
371
  " model=model,\n",
372
  " tokenizer=tokenizer,\n",
373
  " train_dataset=train_dataset,\n",
374
+ " dataset_text_field=\"text\",\n",
375
  " max_seq_length=MAX_SEQ_LENGTH,\n",
376
+ " dataset_num_proc=2,\n",
377
+ " packing=PACKING,\n",
378
  " args=TrainingArguments(\n",
379
  " per_device_train_batch_size=BATCH_SIZE,\n",
380
  " gradient_accumulation_steps=GRAD_ACCUM,\n",
381
  " warmup_steps=WARMUP_STEPS,\n",
382
+ " max_steps=MAX_STEPS,\n",
 
383
  " learning_rate=LEARNING_RATE,\n",
384
+ " fp16=True,\n",
385
  " logging_steps=LOGGING_STEPS,\n",
386
+ " optim=\"adamw_8bit\",\n",
387
  " weight_decay=0.01,\n",
388
  " lr_scheduler_type=\"linear\",\n",
389
  " seed=3407,\n",
390
  " output_dir=\"./outputs\",\n",
391
  " save_strategy=\"steps\",\n",
392
  " save_steps=SAVE_STEPS,\n",
393
+ " save_total_limit=2,\n",
394
+ " report_to=\"none\",\n",
 
 
 
395
  " ),\n",
396
  ")\n",
397
  "\n",
398
+ "print(f\"βœ… Trainer ready. Dataset: {DATASET_CHOICE} | Steps: {MAX_STEPS}\")\n",
399
  "print(f\" Effective batch size: {BATCH_SIZE * GRAD_ACCUM}\")\n",
400
+ "print(f\" Packing enabled: {PACKING}\")"
 
 
401
  ]
402
  },
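`packing=PACKING` is where the claimed 2-3× throughput comes from: instead of padding every short conversation out to `MAX_SEQ_LENGTH`, consecutive tokenized examples are concatenated until a block is full, so almost no compute is spent on pad tokens. A simplified greedy sketch of the idea (TRL's real implementation differs in details like example-boundary and EOS handling):

```python
def pack_sequences(token_lists, max_len, eos_id):
    """Greedily concatenate tokenized examples into max_len-sized blocks."""
    packed, current = [], []
    for toks in token_lists:
        candidate = current + toks + [eos_id]
        if len(candidate) > max_len and current:
            packed.append(current[:max_len])
            current = toks + [eos_id]
        else:
            current = candidate
    if current:
        packed.append(current[:max_len])
    return packed

# e.g. three short "conversations" packed into one 16-token block:
print(pack_sequences([[1, 2, 3], [4, 5], [6, 7, 8, 9]], max_len=16, eos_id=0))
# [[1, 2, 3, 0, 4, 5, 0, 6, 7, 8, 9, 0]]
```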
403
  {
404
  "cell_type": "markdown",
405
  "metadata": {},
406
  "source": [
407
+ "## 8️⃣ Train πŸš€"
 
 
 
 
 
 
 
 
408
  ]
409
  },
410
  {
 
413
  "metadata": {},
414
  "outputs": [],
415
  "source": [
 
416
  "if torch.cuda.is_available():\n",
417
+ " print(f\"VRAM before train: {torch.cuda.memory_allocated()/1e9:.2f} GB\")\n",
418
  "\n",
419
  "trainer_stats = trainer.train()\n",
420
  "\n",
 
429
  "cell_type": "markdown",
430
  "metadata": {},
431
  "source": [
432
+ "## 9️⃣ Save & Push to HuggingFace Hub"
 
 
 
 
 
 
433
  ]
434
  },
435
  {
 
438
  "metadata": {},
439
  "outputs": [],
440
  "source": [
441
+ "# Save LoRA adapter (tiny, ~50-100 MB)\n",
442
+ "model.save_pretrained(\"./lora-adapter\")\n",
443
+ "tokenizer.save_pretrained(\"./lora-adapter\")\n",
444
+ "print(\"βœ… LoRA adapter saved\")\n",
445
+ "\n",
446
+ "# Merge & save full 16-bit model (~8 GB)\n",
447
+ "print(\"\\nπŸ”„ Merging LoRA into base model...\")\n",
 
448
  "merged_model = model.merge_and_unload()\n",
449
+ "merged_model.save_pretrained(\"./merged-model\")\n",
450
+ "tokenizer.save_pretrained(\"./merged-model\")\n",
451
+ "print(\"βœ… Merged model saved\")\n",
452
  "\n",
453
+ "# Push to HF Hub (uncomment if logged in)\n",
454
  "# model.push_to_hub(HUB_MODEL_ID)\n",
455
+ "# tokenizer.push_to_hub(HUB_MODEL_ID)"
 
456
  ]
457
  },
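In a fresh session the adapter saved above can be reloaded without retraining. One common route is PEFT's `AutoPeftModelForCausalLM`, sketched below under the assumption that the `./lora-adapter` directory from the previous cell exists (pointing Unsloth's `FastLanguageModel.from_pretrained` at the adapter folder is another option):

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

# Reads the base model id from the adapter config, then applies the LoRA weights.
# On newer transformers versions, pass a BitsAndBytesConfig instead of load_in_4bit.
model = AutoPeftModelForCausalLM.from_pretrained("./lora-adapter", load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained("./lora-adapter")
```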
458
  {
459
  "cell_type": "markdown",
460
  "metadata": {},
461
  "source": [
462
+ "## πŸ”Ÿ Inference Demo – Qwen3 Thinking Toggle\n",
 
 
463
  "\n",
464
  "| Mode | Use Case | Speed |\n",
465
  "|------|----------|-------|\n",
466
+ "| `enable_thinking=True` | Deep reasoning, analysis, chain-of-thought | Slower, thorough |\n",
467
+ "| `enable_thinking=False` | Quick answers, coding snippets, commands | Fast, direct |"
 
 
468
  ]
469
  },
470
  {
 
473
  "metadata": {},
474
  "outputs": [],
475
  "source": [
476
+ "FastLanguageModel.for_inference(model)\n",
477
  "\n",
478
+ "test_prompt = \"Explain how parameterized queries prevent SQL injection, with a Python example.\"\n",
 
 
 
479
  "\n",
480
  "messages = [\n",
481
+ " {\"role\": \"system\", \"content\": \"You are a helpful and knowledgeable assistant.\"},\n",
482
  " {\"role\": \"user\", \"content\": test_prompt},\n",
483
  "]\n",
484
  "\n",
485
  "for think_mode in [True, False]:\n",
486
+ " label = \"🧠 THINKING=ON\" if think_mode else \"⚑ THINKING=OFF\"\n",
487
  " print(f\"\\n{'='*60}\")\n",
488
+ " print(label)\n",
489
  " print(f\"{'='*60}\")\n",
490
  "\n",
491
  " inputs = tokenizer.apply_chat_template(\n",
492
+ " messages, tokenize=True, add_generation_prompt=True,\n",
493
+ " enable_thinking=think_mode, return_tensors=\"pt\",\n",
 
 
 
494
  " ).to(model.device)\n",
495
  "\n",
496
  " outputs = model.generate(\n",
497
+ " input_ids=inputs, max_new_tokens=512, temperature=0.7,\n",
498
+ " top_p=0.9, do_sample=True,\n",
 
 
 
499
  " pad_token_id=tokenizer.pad_token_id,\n",
500
  " eos_token_id=tokenizer.eos_token_id,\n",
501
  " )\n",
502
+ " reply = tokenizer.decode(outputs[0], skip_special_tokens=True)\n",
503
+ " print(reply.split(\"assistant\")[-1].strip()[:800])\n",
504
+ " print(f\"\\n[Tokens: {len(outputs[0]) - len(inputs[0])}]\")"
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
505
  ]
506
  },
507
  {
 
509
  "metadata": {},
510
  "source": [
511
  "---\n",
512
+ "## πŸ“š Dataset & Model References\n",
513
  "\n",
514
  "| Resource | Link |\n",
515
  "|----------|------|\n",
516
+ "| **Qwen3-4B-Instruct-2507** | https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507 |\n",
517
+ "| **UltraChat 200K** | https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k |\n",
518
+ "| **OpenHermes 2.5** | https://huggingface.co/datasets/teknium/OpenHermes-2.5 |\n",
519
+ "| **ShareGPT Multilingual** | https://huggingface.co/datasets/deepmage121/ShareGPT_multilingual |\n",
520
+ "| **Fenrir Cybersecurity** | https://huggingface.co/datasets/AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1 |\n",
521
+ "| **Trendyol Cybersecurity** | https://huggingface.co/datasets/Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset |\n",
522
  "| **Unsloth Docs** | https://unsloth.ai/docs |\n",
 
 
 
 
 
 
 
 
 
 
 
523
  "\n",
524
  "---\n",
525
+ "*Pick any dataset. Train anything. Use responsibly.*"
526
  ]
527
  }
528
  ],
 
538
  }
539
  },
540
  "nbformat": 4,
541
+ "nbformat_minor": 4
542
  }