🚀 General-Purpose LLM Fine-Tuning Collection – Google Colab Free Tier (T4)
A curated collection of production-ready Colab notebooks for fine-tuning state-of-the-art small LLMs on any domain using Google Colab Free Tier (T4, 16GB VRAM).
Pick your model, pick your dataset, click run. Zero-config fine-tuning.
📓 Notebooks
| Notebook | Model | Size | T4 Batch | Est. Time | Status |
|---|---|---|---|---|---|
| Qwen3-4B Ultimate | unsloth/Qwen3-4B-Instruct-2507 | 3.3GB 4-bit | 4 | ~3–4 hrs | ✅ Recommended |
| LFM2.5 Ultimate | unsloth/LFM2.5-1.2B-Instruct | ~1GB 4-bit | 8 | ~1–2 hrs | ✅ Fastest |
| Gemma-4 E2B | unsloth/gemma-4-E2B-it | ~7.6GB 4-bit | 1 | ~6–8 hrs | ⚠️ Tight VRAM |
| Bonsai (PrismML) | See limitations | ~1GB 1-bit | N/A | N/A | ❌ Not supported |
🔥 Model Comparison (May 2026)
| Model | Params | 4-bit Size | VRAM Fit | Batch | MMLU-Pro | LiveCodeBench | Context | Notes |
|---|---|---|---|---|---|---|---|---|
| Qwen3-4B | 4B | 3.3 GB | Easy (12GB free) | 4 | 69.6 | 35.1 | 32K | Best coding/reasoning. Thinking toggle. |
| LFM2.5-1.2B | 1.2B | ~1 GB | Huge headroom | 8 | – | – | 128K | Fastest training. Liquid AI edge model. |
| Gemma-4 E2B | ~2B dense | 7.6 GB | Tight (8GB free) | 1 | – | – | 256K | Dense (not MoE). Google edge model. |
| Bonsai-8B | 8B | ~1 GB packed | N/A | N/A | ~30 | – | – | 1-bit ternary. Cannot train with Unsloth. |
Recommendation: Start with Qwen3-4B for best accuracy, or LFM2.5 for fastest experimentation.
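For orientation, loading the recommended model in 4-bit with Unsloth looks roughly like the sketch below. The max_seq_length and LoRA settings shown are placeholder values, not the notebooks' exact configuration:

```python
from unsloth import FastLanguageModel

# Minimal sketch: load the recommended model in 4-bit and attach a LoRA adapter.
# max_seq_length and r are placeholder values; each notebook sets its own.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen3-4B-Instruct-2507",
    max_seq_length=2048,
    load_in_4bit=True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
```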
📊 Dataset Selection – 8 Built-in Choices
Every notebook includes a DATASET_CHOICE variable. Just uncomment one line to pick your data.
| Choice | Dataset | Rows | Format | Best For | Language |
|---|---|---|---|---|---|
| cybersecurity | Fenrir v2.1 + Trendyol | 153K→50K | system/user/assistant | Ethical hacking, pentesting education | English |
| ultrachat | UltraChat 200K (SFT) | 200K→50K | messages (role/content) | General conversation, chatbot tuning | English |
| openhermes | OpenHermes 2.5 | 1M+→50K | conversations (human/gpt) | Reasoning, coding, instruction following | English |
| sharegpt_en | ShareGPT (English) | ~90K→50K | conversations (human/gpt) | Multi-turn dialogue, general QA | English |
| sharegpt_de | ShareGPT (German) | ~104K→50K | conversations (human/gpt) | German language fine-tuning | German |
| sharegpt_hi | ShareGPT (Hindi 27B) | ~153K→50K | conversations (human/gpt) | Hindi language fine-tuning | Hindi |
| code_corpus | Code Corpus LLM Training | 240K→50K | text (code files with domain/repo/lang metadata) | Code completion, coding assistant | Multi (20 domains: Rust, Python, C++, Kotlin, Flutter, game engines, web frameworks, ethical hacking repos, etc.) |
| custom_mix | Your combination | – | varies | Combine datasets for hybrid tuning | Mixed |
How to Switch Datasets (in any notebook)
```python
# In Cell 4, uncomment ONE line:
DATASET_CHOICE = "cybersecurity"     # ← Default (defensive security)
# DATASET_CHOICE = "ultrachat"       # ← General chat
# DATASET_CHOICE = "openhermes"      # ← Reasoning & coding
# DATASET_CHOICE = "sharegpt_en"     # ← English dialogue
# DATASET_CHOICE = "sharegpt_de"     # ← German
# DATASET_CHOICE = "sharegpt_hi"     # ← Hindi
# DATASET_CHOICE = "code_corpus"     # ← Code completion (Rust, Python, C++, etc.)
# DATASET_CHOICE = "custom_mix"      # ← Mix multiple
```
Code Corpus Dataset Details
The Code Corpus LLM Training dataset contains 240,378 code files from top open-source repositories across 20 domains:
| Domain | Examples |
|---|---|
| web_ui | Web frameworks, UI components |
| cpp | C++ systems code |
| kotlin_android | Android apps |
| rust | Rust systems (e.g., actix-web) |
| python | Python libraries |
| ethical_hacking | Security tools, pentesting repos |
| game_engines | Game development |
| ui_ux_design | Design systems |
Each example has: text (the full code file), domain, repo, language, file_path, and size_chars. The notebook converts each code file into a user/assistant conversation: the user asks to explain or improve the code, and the assistant provides it.
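A minimal sketch of that conversion, using the field names above; the prompt wording and the helper name code_row_to_messages are illustrative, not the notebooks' exact code:

```python
# Illustrative sketch: turn one Code Corpus row into a user/assistant exchange.
# The notebooks' actual prompt wording may differ.
def code_row_to_messages(row):
    user_prompt = (
        f"Explain and improve this {row['language']} file "
        f"from {row['repo']} ({row['file_path']}):"
    )
    return {
        "messages": [
            {"role": "user", "content": user_prompt},
            {"role": "assistant", "content": row["text"]},
        ]
    }

train_dataset = train_dataset.map(
    code_row_to_messages, remove_columns=train_dataset.column_names
)
```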
Mixing Datasets (custom_mix)
```python
CUSTOM_DATASETS = [
    # (dataset_id, split, num_rows, format_type)
    # format_type: "messages" | "conversations" | "text"
    ("AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1", "train", 10000, "messages"),
    ("krystv/code-corpus-llm-training", "train", 20000, "text"),
    ("teknium/OpenHermes-2.5", "train", 20000, "conversations"),
]
```
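Under the hood, a mix like this can be loaded and concatenated with the datasets library. The loop below is a rough sketch of that step only; the notebooks' own loader additionally converts each entry's format to the shared text column described in the Technical section:

```python
from datasets import load_dataset, concatenate_datasets

# Rough sketch: materialize the CUSTOM_DATASETS mix defined above.
# Note: concatenate_datasets needs matching columns, so in practice each part
# is first normalized (messages/conversations/text -> "text") before merging.
parts = []
for dataset_id, split, num_rows, format_type in CUSTOM_DATASETS:
    ds = load_dataset(dataset_id, split=split)
    ds = ds.shuffle(seed=42).select(range(min(num_rows, len(ds))))
    parts.append(ds)

mixed_dataset = concatenate_datasets(parts)
```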
🚀 How to Use (Any Notebook)
- Open the notebook in Google Colab (click the notebook link above)
- Runtime → Change runtime type → T4 GPU
- In Cell 4, uncomment your desired DATASET_CHOICE
- Run cells top-to-bottom
- (Optional) Set your HF token in Cell 2 to push the LoRA adapter
- The last cells show inference demos
Zero-config: All hyperparameters are tuned for T4. Just pick a dataset and click ▶️.
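If you did set a token, pushing the trained LoRA adapter afterwards takes only a couple of calls. A minimal sketch, assuming the standard huggingface_hub login and push_to_hub APIs; the repo id is a placeholder:

```python
from huggingface_hub import login

# Sketch: authenticate with a WRITE token, then upload the LoRA adapter + tokenizer.
# "your-username/your-adapter-name" is a placeholder repo id.
login(token="hf_...")  # or rely on the token set in Cell 2
model.push_to_hub("your-username/your-adapter-name")
tokenizer.push_to_hub("your-username/your-adapter-name")
```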
🔧 Technical: Why dataset_text_field="text"?
Unsloth's SFTTrainer has issues with formatting_func. The clean fix:
```python
# Pre-convert messages → text using dataset.map(batched=True)
def convert_messages_to_text(examples):
    texts = []
    for msgs in examples["messages"]:
        text = tokenizer.apply_chat_template(msgs, tokenize=False)
        texts.append(text)
    return {"text": texts}

train_dataset = train_dataset.map(
    convert_messages_to_text, batched=True, remove_columns=["messages"]
)

# Then pass dataset_text_field="text" to SFTTrainer
trainer = SFTTrainer(..., dataset_text_field="text")
```
All notebooks auto-detect the source dataset format (Fenrir, UltraChat, OpenHermes, ShareGPT, Code Corpus) and handle the conversion for you.
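That auto-detection essentially reduces to checking which columns a dataset exposes. A simplified sketch, assuming the column names listed in the dataset table above (the notebooks' real logic is more thorough):

```python
# Simplified sketch: infer the source format from the columns a dataset exposes,
# then route it to the matching messages -> text converter.
def detect_format(dataset):
    cols = dataset.column_names
    if "messages" in cols:          # Fenrir / UltraChat style
        return "messages"
    if "conversations" in cols:     # OpenHermes / ShareGPT style
        return "conversations"
    if "text" in cols:              # Code Corpus style
        return "text"
    raise ValueError(f"Unrecognized dataset columns: {cols}")
```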
⚠️ T4 VRAM Cheat-Sheet
| Symptom | Fix |
|---|---|
| CUDA out of memory | Lower MAX_SEQ_LENGTH to 2048; set BATCH_SIZE=1; set PACKING=False |
| Still OOM | Enable use_rslora=True in LoRA config |
| Training very slow | Increase BATCH_SIZE if VRAM allows; enable PACKING=True |
| Loss not decreasing | Try LEARNING_RATE=5e-4 or train for 2 epochs |
| Can't push to Hub | Run login(token=...) with a WRITE token |
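Applied to a notebook's config cell, the OOM fixes above look roughly like this. GRAD_ACCUM is an assumed extra knob (gradient accumulation) and may not exist under that name in every notebook:

```python
# OOM-safe settings for a 16GB T4, per the cheat-sheet above.
MAX_SEQ_LENGTH = 2048   # shorter sequences -> smaller activation memory
BATCH_SIZE = 1          # per-device batch size
PACKING = False         # disable sequence packing to lower peak memory
GRAD_ACCUM = 8          # assumed knob: preserves effective batch size via accumulation
```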
📚 References
| Resource | Link |
|---|---|
| Qwen3-4B-Instruct-2507 | https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507 |
| LFM2.5-1.2B-Instruct | https://huggingface.co/LiquidAI/LFM2.5-1.2B-Instruct |
| Gemma 4 E2B | https://huggingface.co/google/gemma-4-E2B-it |
| Unsloth Docs | https://unsloth.ai/docs |
| UltraChat 200K | https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k |
| OpenHermes 2.5 | https://huggingface.co/datasets/teknium/OpenHermes-2.5 |
| ShareGPT Multilingual | https://huggingface.co/datasets/deepmage121/ShareGPT_multilingual |
| Code Corpus LLM Training | https://huggingface.co/datasets/krystv/code-corpus-llm-training |
| Fenrir Cybersecurity | https://huggingface.co/datasets/AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.1 |
| Trendyol Cybersecurity | https://huggingface.co/datasets/Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset |
📁 Repository Structure
```
asdf98/ethical-hacking-llm-colab/
├── EthicalHacking_Qwen3-4B_Ultimate_Colab.ipynb         ← Best accuracy
├── EthicalHacking_LFM2.5_Ultimate_Colab.ipynb           ← Fastest training
├── EthicalHacking_Gemma4_E2B_Colab.ipynb                ← Google model (tight VRAM)
├── EthicalHacking_Qwen3-8B_Colab.ipynb                  ← Simpler backup (8B)
├── EthicalHacking_MultiModel_Comparison_Colab.ipynb     ← Compare models
├── BONSAI_LIMITATIONS.md                                ← Why Bonsai can't be fine-tuned
└── README.md                                            ← This file
```
Pick any dataset. Train anything. Use responsibly.