{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 🛡️ Murshid Backend — Full Mode on Colab\n", "\n", "**Murshid | From Alerts to Guidance: MITRE ATT&CK-Aligned Techniques Mapping for SOC Analysts**\n", "\n", "---\n", "\n", "## 📁 Required files on Google Drive\n", "\n", "```\n", "MyDrive/\n", "├── murshid_backend_for_drive.zip   ← upload it, then run Cell 3 to extract it\n", "│   or\n", "├── murshid_backend/                ← if you already extracted it\n", "│   ├── app/\n", "│   ├── alembic/\n", "│   ├── scripts/\n", "│   ├── alembic.ini\n", "│   └── requirements.txt\n", "│\n", "└── Needed/\n", "    ├── murshid_logreg_pipeline_manual_oof_pcatuned.joblib\n", "    ├── murshid_logreg_thresholds_manual_oof_pcatuned.npy\n", "    ├── murshid_label_columns.json\n", "    └── murshid_query_template_structure_clean_shared.xlsx\n", "```\n", "\n", "## Run instructions\n", "\n", "### Prerequisites\n", "1. ✅ **GPU enabled:** `Runtime → Change runtime type → T4 GPU`\n", "2. ✅ **Google Drive connected** (contains the `Needed` folder with the model files)\n", "3. 
✅ **`murshid_backend` folder** on Drive (or upload it manually)\n", "\n", "### Run order\n", "**Run the cells in order, from top to bottom. Do not skip any cell.**\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 1: Verify the GPU\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import torch\n", "\n", "print('CUDA available:', torch.cuda.is_available())\n", "if torch.cuda.is_available():\n", "    print('GPU:', torch.cuda.get_device_name(0))\n", "    print('Memory:', round(torch.cuda.get_device_properties(0).total_memory / 1e9, 1), 'GB')\n", "else:\n", "    print('⚠️ No GPU found. Change the Runtime to T4 from the menu above')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 2: Mount Google Drive\n", "\n", "> Mounts Drive and verifies that the model files and the backend code are in place.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from google.colab import drive\n", "import os\n", "\n", "drive.mount('/content/drive')\n", "\n", "# ✏️ Edit this path if your folder is different\n", "NEEDED_PATH = '/content/drive/MyDrive/Needed'\n", "BACKEND_PATH = '/content/drive/MyDrive/murshid_backend'\n", "ZIP_PATH = '/content/drive/MyDrive/murshid_backend_for_drive.zip'\n", "\n", "print('=' * 55)\n", "print('📂 Checking Google Drive files...')\n", "print('=' * 55)\n", "\n", "# ── Check the Needed/ model files ─────────────────────────────\n", "print('\\n📁 Needed/ (model files):')\n", "required_files = {\n", "    'murshid_logreg_pipeline_manual_oof_pcatuned.joblib': 'LogReg model',\n", "    'murshid_logreg_thresholds_manual_oof_pcatuned.npy': 'LogReg thresholds',\n", "    'murshid_label_columns.json': 'Technique names',\n", "}\n", "\n", "models_ok = True\n", "for fname, desc in required_files.items():\n", "    path = f'{NEEDED_PATH}/{fname}'\n", "    exists = os.path.isfile(path)\n", "    size = f'{os.path.getsize(path)/1024:.0f} KB' if exists else ''\n", "    status = '✅' if exists else '❌'\n", "    print(f'  {status} {fname} {size}')\n", "    if not exists:\n", "        models_ok = False\n", "\n", "excel_path = f'{NEEDED_PATH}/murshid_query_template_structure_clean_shared.xlsx'\n", "excel_ok = os.path.isfile(excel_path)\n", "print(f'  {\"✅\" if excel_ok else \"⚠️ \"} murshid_query_template_structure_clean_shared.xlsx (optional)')\n", "\n", "# ── Check the backend code ────────────────────────────────────\n", "print('\\n📁 murshid_backend/ (backend code):')\n", "backend_ok = os.path.isdir(BACKEND_PATH)\n", "zip_ok = os.path.isfile(ZIP_PATH)\n", "\n", "if backend_ok:\n", "    fcount = sum(len(f) for _, _, f in os.walk(BACKEND_PATH))\n", "    print(f'  ✅ murshid_backend/ ({fcount} 
files)')\n", "elif zip_ok:\n", "    zsize = f'{os.path.getsize(ZIP_PATH)/1024:.0f} KB'\n", "    print(f'  📦 murshid_backend_for_drive.zip ({zsize}), will be extracted automatically in Cell 3')\n", "else:\n", "    print('  ❌ murshid_backend/ not found')\n", "    print('  ❌ murshid_backend_for_drive.zip not found')\n", "    print('\\n  ⚠️ Upload murshid_backend_for_drive.zip to:')\n", "    print('     Google Drive → My Drive')\n", "\n", "# ── Summary ───────────────────────────────────────────────────\n", "print('\\n' + '=' * 55)\n", "if models_ok and (backend_ok or zip_ok):\n", "    print('✅ Everything is ready. Continue running the cells')\n", "elif not models_ok:\n", "    print('❌ Model files are missing from Needed/. Upload them first')\n", "else:\n", "    print('❌ Backend files are missing. Upload the ZIP first')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 3: Copy the backend to /content\n", "\n", "> This cell automatically:\n", "> 1. Extracts the ZIP from Drive (if a ZIP exists and has not been extracted yet)\n", "> 2. Copies the `murshid_backend` folder to `/content` (much faster reads than Drive)\n", "> 3. Sets the Python path\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import shutil, os, zipfile, sys\n", "\n", "DRIVE_BASE = '/content/drive/MyDrive'\n", "ZIP_PATH = f'{DRIVE_BASE}/murshid_backend_for_drive.zip'\n", "BACKEND_DRIVE = f'{DRIVE_BASE}/murshid_backend'\n", "BACKEND_LOCAL = '/content/murshid_backend'\n", "\n", "# ── Step 1: extract the ZIP from Drive if needed ──────────────\n", "if not os.path.isdir(BACKEND_DRIVE):\n", "    if os.path.isfile(ZIP_PATH):\n", "        print('📦 ZIP found, extracting to Drive...')\n", "        with zipfile.ZipFile(ZIP_PATH, 'r') as z:\n", "            z.extractall(DRIVE_BASE)\n", "        print(f'✅ Extracted to {BACKEND_DRIVE}')\n", "    else:\n", "        print('❌ ERROR: the murshid_backend folder does not exist on Drive')\n", "        print(f'   Expected: {BACKEND_DRIVE}')\n", "        print(f'   Or upload: {ZIP_PATH}')\n", "        raise FileNotFoundError('Backend not found. Upload murshid_backend_for_drive.zip to Google Drive MyDrive.')\n", "else:\n", "    print(f'✅ murshid_backend found on Drive: {BACKEND_DRIVE}')\n", "\n", "# ── Step 2: copy to /content (much faster than Drive at runtime) ─\n", "if os.path.exists(BACKEND_LOCAL):\n", "    shutil.rmtree(BACKEND_LOCAL)\n", "\n", "shutil.copytree(\n", "    BACKEND_DRIVE,\n", "    BACKEND_LOCAL,\n", "    ignore=shutil.ignore_patterns('__pycache__', '*.pyc', '.venv', '*.db', '*.log')\n", ")\n", "\n", "# ── Step 3: add to the Python path ────────────────────────────\n", "if BACKEND_LOCAL not in sys.path:\n", "    sys.path.insert(0, BACKEND_LOCAL)\n", "\n", "os.chdir(BACKEND_LOCAL)\n", "\n", "# ── Verify ────────────────────────────────────────────────────\n", "file_count = sum(len(files) for _, _, files in os.walk(BACKEND_LOCAL))\n", "print(f'✅ Backend ready at {BACKEND_LOCAL} ({file_count} files)')\n", "print(f'✅ Working dir: {os.getcwd()}')\n", "\n", "# Show the structure\n", "print('\\nStructure:')\n", "for item in sorted(os.listdir(BACKEND_LOCAL)):\n", "    full = os.path.join(BACKEND_LOCAL, item)\n", "    if os.path.isdir(full):\n", "        sub_count = len(os.listdir(full))\n", "        print(f'  📁 {item}/ ({sub_count} items)')\n", "    else:\n", "        size = os.path.getsize(full)\n", "        print(f'  📄 {item} ({size:,} bytes)')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 4: Install requirements\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('📦 Installing requirements...')\n", "\n", "# ── Core backend packages ─────────────────────────────────────\n", "!pip install -q \\\n", "  fastapi==0.115.0 \\\n", "  \"uvicorn[standard]==0.32.0\" \\\n", "  pydantic==2.9.0 \\\n", "  pydantic-settings==2.6.0 \\\n", "  python-dotenv==1.0.0 \\\n", "  sqlalchemy==2.0.0 \\\n", "  alembic==1.13.0 \\\n", "  aiofiles \\\n", "  scikit-learn==1.6.1 \\\n", "  joblib \\\n", "  lxml \\\n", "  openpyxl \\\n", "  nest-asyncio \\\n", "  pyngrok\n", "\n", "# ── bitsandbytes: required to load LLaMA in 4-bit on GPU ──────\n", "print('📦 Installing bitsandbytes (required for LLaMA 4-bit)...')\n", "!pip install -q -U \"bitsandbytes>=0.46.1\"\n", "\n", "# ── accelerate: required for device_map=\"auto\" ────────────────\n", "!pip install -q -U accelerate\n", "\n", "# ── Verify the installation ───────────────────────────────────\n", "import importlib\n", "for pkg in ['bitsandbytes', 'accelerate', 'fastapi', 'sklearn']:\n", "    try:\n", "        mod = importlib.import_module(pkg)\n", "        ver = getattr(mod, '__version__', '?')\n", "        print(f'  ✅ {pkg}=={ver}')\n", "    except ImportError:\n", "        print(f'  ❌ {pkg} failed to install')\n", "\n", "print('\\n✅ All requirements installed')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 5: Create the .env file\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# ✏️ Put your HF token here if you did not add it via Colab Secrets\n", "HF_TOKEN = os.environ.get('HF_TOKEN', 'ENTER_YOUR_TOKEN')\n", "\n", "env_content = f\"\"\"# Auto-generated .env for Colab FULL mode\n", "MURSHID_DB_URL=sqlite:////content/murshid.db\n", "MURSHID_MODELS_DIR={NEEDED_PATH}\n", "HF_TOKEN={HF_TOKEN}\n", "MURSHID_SKIP_LLM=false\n", "SECRET_KEY=murshid_colab_2026\n", "LLAMA_MODEL_ID=meta-llama/Meta-Llama-3-8B-Instruct\n", "EMBED_MODEL_ID=ehsanaghaei/SecureBERT_Plus\n", "LOGREG_JOBLIB=murshid_logreg_pipeline_manual_oof_pcatuned.joblib\n", "LOGREG_THRESHOLDS_NPY=murshid_logreg_thresholds_manual_oof_pcatuned.npy\n", "LABEL_COLUMNS_JSON=murshid_label_columns.json\n", "\"\"\"\n", "\n", "env_path = '/content/murshid_backend/.env'\n", "with open(env_path, 'w') as f:\n", "    f.write(env_content)\n", "\n", "print('✅ .env created at', env_path)\n", "print('\\nContents:')\n", "with open(env_path) as f:\n", "    for line in f:\n", "        if 'TOKEN' in line or 'SECRET' in line:\n", "            key = line.split('=')[0]\n", "            print(f'  {key}=****')\n", "        else:\n", "            print('  ', line.rstrip())" ] 
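}, { "cell_type": "markdown", "metadata": {}, "source": [ "> **Illustrative sketch (an assumption, not the backend's actual code):** the backend loads `LLAMA_MODEL_ID` in 4-bit, which is why Cell 4 installs `bitsandbytes` (the quantization kernels) and `accelerate` (which enables `device_map=\"auto\"`). With standard `transformers` usage this looks roughly like:\n", ">\n", "> ```python\n", "> import torch\n", "> from transformers import AutoModelForCausalLM, BitsAndBytesConfig\n", ">\n", "> bnb_config = BitsAndBytesConfig(\n", ">     load_in_4bit=True,            # store weights in 4-bit (NF4)\n", ">     bnb_4bit_quant_type='nf4',\n", ">     bnb_4bit_compute_dtype=torch.float16,\n", "> )\n", "> model = AutoModelForCausalLM.from_pretrained(\n", ">     'meta-llama/Meta-Llama-3-8B-Instruct',\n", ">     quantization_config=bnb_config,\n", ">     device_map='auto',            # accelerate places layers on the GPU\n", ">     token=HF_TOKEN,\n", "> )\n", "> ```\n" ] 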
}, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 6: Database migration (Alembic)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import subprocess, os\n", "\n", "os.chdir('/content/murshid_backend')\n", "\n", "result = subprocess.run(\n", "    ['python', '-m', 'alembic', 'upgrade', 'head'],\n", "    capture_output=True, text=True\n", ")\n", "\n", "print(result.stdout)\n", "if result.stderr:\n", "    print(result.stderr)\n", "\n", "db_exists = os.path.isfile('/content/murshid.db')\n", "print('✅ Database ready: /content/murshid.db' if db_exists else '❌ Database was not created')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 7: Import WQL templates from Excel\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os, sys\n", "sys.path.insert(0, '/content/murshid_backend')\n", "os.chdir('/content/murshid_backend')\n", "\n", "excel_path = f'{NEEDED_PATH}/murshid_query_template_structure_clean_shared.xlsx'\n", "\n", "if os.path.isfile(excel_path):\n", "    from app.db.session import SessionLocal\n", "    from scripts.import_excel_templates import run as import_excel\n", "\n", "    db = SessionLocal()\n", "    try:\n", "        result = import_excel(db, replace=False)\n", "        print('✅ Excel import result:')\n", "        for k, v in result.items():\n", "            print(f'  {k}: {v}')\n", "    finally:\n", "        db.close()\n", "else:\n", "    print(f'⚠️ Excel file not found at: {excel_path}')\n", "    print('   You can continue; templates can be added manually later')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 8: Run FastAPI + Cloudflare Tunnel\n", "\n", "> ⏳ This cell takes **5-10 minutes** to load LLaMA (4.5GB) and SecureBERT+\n", "\n", "> 🔑 **The public URL appears at the end.** Copy it for the frontend\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import subprocess, time, os, sys, urllib.request\n", "import 
nest_asyncio\n", "nest_asyncio.apply()\n", "\n", "os.chdir('/content/murshid_backend')\n", "\n", "# ─── Check bitsandbytes before starting the server ───────────\n", "try:\n", "    import bitsandbytes as bnb\n", "    print(f'✅ bitsandbytes {bnb.__version__}')\n", "except ImportError:\n", "    print('❌ bitsandbytes is not installed. Run Cell 4 first')\n", "    raise\n", "\n", "# ─── Start uvicorn ───────────────────────────────────────────\n", "log_path = '/content/murshid_server.log'\n", "log_file = open(log_path, 'w')\n", "\n", "server_proc = subprocess.Popen(\n", "    [\n", "        'python', '-m', 'uvicorn', 'app.main:app',\n", "        '--host', '0.0.0.0',\n", "        '--port', '8000',\n", "        '--log-level', 'info'\n", "    ],\n", "    cwd='/content/murshid_backend',\n", "    stdout=log_file,\n", "    stderr=subprocess.STDOUT\n", ")\n", "\n", "print('⏳ Loading LLaMA 3 8B + SecureBERT+...')\n", "print('   Loading in progress. Wait until the final message appears')\n", "\n", "# ─── Smart wait loop, tailing the log ────────────────────────\n", "started = False\n", "last_log_size = 0\n", "\n", "for i in range(180):  # 15 minutes max\n", "    time.sleep(5)\n", "\n", "    # Check whether the server is up\n", "    try:\n", "        resp = urllib.request.urlopen('http://localhost:8000/health', timeout=3)\n", "        if resp.status == 200:\n", "            started = True\n", "            break\n", "    except Exception:\n", "        pass\n", "\n", "    # Show new log output every 30 seconds\n", "    if i % 6 == 0:\n", "        elapsed = (i + 1) * 5\n", "        log_file.flush()\n", "        try:\n", "            with open(log_path) as f:\n", "                log_content = f.read()\n", "            new_content = log_content[last_log_size:]\n", "            last_log_size = len(log_content)\n", "\n", "            # Check for an early failure\n", "            if 'ERROR' in new_content or 'ImportError' in new_content:\n", "                print(f'\\n❌ Server error at {elapsed}s:')\n", "                # Show the last 1500 characters of the log\n", "                print(log_content[-1500:])\n", "                server_proc.terminate()\n", "                log_file.close()\n", "                raise RuntimeError('Server failed to start. See log above.')\n", "\n", "            # Show what has been loaded so far\n", "            if 'Loaded' in new_content or 'loaded' in new_content or 'Application' in new_content:\n", "                for line in new_content.strip().split('\\n'):\n", "                    if any(k in line for k in ['INFO', 'Loaded', 'loaded', 'Application', 'WARNING']):\n", "                        print(f'  {line.strip()}')\n", "            else:\n", "                mins = elapsed // 60\n", "                secs = elapsed % 60\n", "                print(f'  ⏳ {mins}m {secs}s: models are still loading...')\n", "        except RuntimeError:\n", "            raise\n", "        except Exception:\n", "            print(f'  ⏳ {elapsed}s elapsed...')\n", "\n", "log_file.flush()\n", "log_file.close()\n", "\n", "if not started:\n", "    print('\\n❌ Server did not start after 15 minutes.')\n", "    print('─── Last log lines ───')\n", "    with open(log_path) as f:\n", "        print(f.read()[-3000:])\n", "else:\n", "    print('\\n✅ Server started successfully!')\n", "\n", "    # ─── Cloudflare Tunnel (free, no account needed) ─────────\n", "    import re\n", "\n", "    # Install cloudflared\n", "    subprocess.run(\n", "        ['wget', '-q', 'https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64',\n", "         '-O', '/usr/local/bin/cloudflared'],\n", "        check=True\n", "    )\n", "    subprocess.run(['chmod', '+x', '/usr/local/bin/cloudflared'], check=True)\n", "    print('✅ cloudflared installed')\n", "\n", "    # Start the tunnel\n", "    cf_log = open('/content/cloudflared.log', 'w')\n", "    cf_proc = subprocess.Popen(\n", "        ['cloudflared', 'tunnel', '--url', 'http://localhost:8000'],\n", "        stdout=cf_log, stderr=subprocess.STDOUT\n", "    )\n", "\n", "    # Wait for the URL to appear in the log\n", "    public_url = None\n", "    for _ in range(30):\n", "        time.sleep(2)\n", "        cf_log.flush()\n", "        try:\n", "            with open('/content/cloudflared.log') as f:\n", "                content = f.read()\n", "            match = re.search(r'https://[a-z0-9\\-]+\\.trycloudflare\\.com', content)\n", "            if match:\n", "                public_url = match.group(0)\n", "                break\n", "        except Exception:\n", "            pass\n", "\n", "    if public_url:\n", "        print('\\n' + 
'='*60)\n", "        print('🌐 PUBLIC URL (Cloudflare):')\n", "        print(f'   {public_url}')\n", "        print('='*60)\n", "        print(f'📖 Swagger:    {public_url}/docs')\n", "        print(f'💚 Health:     {public_url}/health')\n", "        print(f'🗄️ DB Summary: {public_url}/api/db/summary')\n", "        print('='*60)\n", "        print('\\n📋 Copy this line and paste it into the frontend (index.html):')\n", "        print(f\"   const BASE = '{public_url}';\")\n", "    else:\n", "        print('⚠️ Cloudflare tunnel URL not found, check /content/cloudflared.log')\n", "        with open('/content/cloudflared.log') as f:\n", "            print(f.read()[-1000:])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ─── Start the Cloudflare Tunnel separately (if Cell 8's tunnel failed) ─\n", "# Run this cell only if the server is running but the tunnel failed\n", "\n", "import subprocess, re, time, os\n", "\n", "# Install cloudflared if it is not installed yet\n", "if not os.path.isfile('/usr/local/bin/cloudflared'):\n", "    subprocess.run(\n", "        ['wget', '-q',\n", "         'https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64',\n", "         '-O', '/usr/local/bin/cloudflared'],\n", "        check=True\n", "    )\n", "    subprocess.run(['chmod', '+x', '/usr/local/bin/cloudflared'], check=True)\n", "    print('✅ cloudflared installed')\n", "else:\n", "    print('✅ cloudflared already installed')\n", "\n", "# Start the tunnel\n", "cf_log_path = '/content/cloudflared.log'\n", "cf_log = open(cf_log_path, 'w')\n", "cf_proc = subprocess.Popen(\n", "    ['cloudflared', 'tunnel', '--url', 'http://localhost:8000'],\n", "    stdout=cf_log, stderr=subprocess.STDOUT\n", ")\n", "\n", "print('⏳ Opening Cloudflare tunnel...')\n", "\n", "public_url = None\n", "for _ in range(30):\n", "    time.sleep(2)\n", "    cf_log.flush()\n", "    try:\n", "        with open(cf_log_path) as f:\n", "            content = f.read()\n", "        match = re.search(r'https://[a-z0-9\\-]+\\.trycloudflare\\.com', content)\n", "        if match:\n", "            public_url = match.group(0)\n", "            break\n", "    except Exception:\n", "        pass\n", "\n", "if public_url:\n", "    print('\\n' + '='*60)\n", "    print(f'🌐 PUBLIC URL: {public_url}')\n", "    print(f'📖 Swagger: {public_url}/docs')\n", "    print(f'💚 Health:  {public_url}/health')\n", "    print('='*60)\n", "    print('\\n📋 Paste this line into index.html:')\n", "    print(f\"   const BASE = '{public_url}';\")\n", "else:\n", "    print('❌ No URL found. Log:')\n", "    with open(cf_log_path) as f:\n", "        print(f.read())\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 9: Connect the frontend to the Cloudflare URL\n", "\n", "After running the previous cell, a message like this appears:\n", "```\n", "🌐 PUBLIC URL: https://xxxx-xxxx.trycloudflare.com\n", "```\n", "\n", "**The cell below updates the frontend automatically.** You can also edit `index.html` by hand:\n", "```javascript\n", "const BASE = 'https://xxxx-xxxx.trycloudflare.com';\n", "```\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import subprocess, re, time, os\n", "\n", "# ── Step 1: install cloudflared ──────────────────────────────\n", "if not os.path.isfile('/usr/local/bin/cloudflared'):\n", "    subprocess.run([\n", "        'wget', '-q',\n", "        'https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64',\n", "        '-O', '/usr/local/bin/cloudflared'\n", "    ], check=True)\n", "    subprocess.run(['chmod', '+x', '/usr/local/bin/cloudflared'], check=True)\n", "    print('✅ cloudflared installed')\n", "else:\n", "    print('✅ cloudflared ready')\n", "\n", "# ── Step 2: start the tunnel ─────────────────────────────────\n", "cf_log_path = '/content/cf.log'\n", "cf_log = open(cf_log_path, 'w')\n", "subprocess.Popen(\n", "    ['cloudflared', 'tunnel', '--url', 'http://localhost:8000'],\n", "    stdout=cf_log, stderr=subprocess.STDOUT\n", ")\n", "\n", "print('⏳ Opening Cloudflare tunnel...')\n", "\n", "# ── Step 3: wait for the URL ─────────────────────────────────\n", "public_url = None\n", "for _ in range(30):\n", "    time.sleep(2)\n", "    cf_log.flush()\n", "    with open(cf_log_path) as f:\n", "        content = f.read()\n", "    match = re.search(r'https://[a-z0-9\\-]+\\.trycloudflare\\.com', content)\n", "    if match:\n", "        public_url = match.group(0)\n", "        break\n", "\n", "if not public_url:\n", "    print('❌ Tunnel failed. Log:')\n", "    with open(cf_log_path) as f: print(f.read())\n", "else:\n", "    # ── Step 4: update index.html automatically ──────────────\n", "    frontend_path = '/content/drive/MyDrive/murshid_frontend/index.html'\n", "\n", "    if os.path.isfile(frontend_path):\n", "        with open(frontend_path, 'r', encoding='utf-8') as f:\n", "            html = f.read()\n", "        html_updated = re.sub(r\"const BASE = '[^']*';\",\n", "                              f\"const BASE = '{public_url}';\", html)\n", "        with open(frontend_path, 'w', encoding='utf-8') as f:\n", "            f.write(html_updated)\n", "        print('✅ index.html updated automatically')\n", "    else:\n", "        print('⚠️ index.html not found. Edit it manually')\n", "\n", "    print('\\n' + '='*60)\n", "    print(f'🌐 PUBLIC URL: {public_url}')\n", "    print(f'📖 Swagger:  {public_url}/docs')\n", "    print(f'💚 Health:   {public_url}/health')\n", "    print(f'🖥️ Frontend: {public_url}/index.html')\n", "    print('='*60)\n", "    print(f\"\\n📋 const BASE = '{public_url}';\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 10: Test the API\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import urllib.request, json\n", "\n", "# ─── Health check ────────────────────────────────────────────\n", "with urllib.request.urlopen('http://localhost:8000/health') as r:\n", "    health = json.load(r)\n", "\n", "print('=== Health Check ===')\n", "print(f\"  status:          {health['status']}\")\n", "print(f\"  pipeline_mode:   {health['pipeline_mode']}\")\n", "print(f\"  llama_loaded:    {health['components']['llama_loaded']}\")\n", "print(f\"  embedder_loaded: {health['components']['embedder_loaded']}\")\n", "print(f\"  logreg_loaded:   {health['components']['logreg_loaded']}\")\n", "print(f\"  cuda_available:  
{health['components']['cuda_available']}\")\n", "\n", "mode = health.get('pipeline_mode', 'unknown')\n", "if mode == 'full':\n", "    print('\\n✅ FULL mode: results match the notebook 100%')\n", "elif mode == 'local':\n", "    print('\\n⚠️ LOCAL mode: LLaMA was not loaded, check MURSHID_SKIP_LLM=false')\n", "else:\n", "    print('\\n❌ LITE mode: check that torch and the models are installed')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ─── Analyze a test rule ─────────────────────────────────────\n", "import urllib.request, json\n", "\n", "# NOTE: the XML tags of this sample rule were stripped in the original\n", "# notebook; they are reconstructed here as a standard Wazuh-style rule\n", "# (the level attribute is an assumption).\n", "test_rule = '''\n", "<rule id=\"18201\" level=\"8\">\n", "  <id>^634$|^4730$</id>\n", "  <description>Windows: Security Enabled Global Group Deleted</description>\n", "  <mitre><id>T1484</id></mitre>\n", "  <group>group_deleted,win_group_deleted</group>\n", "</rule>\n", "'''\n", "\n", "payload = json.dumps({'rule_xml': test_rule}).encode()\n", "req = urllib.request.Request(\n", "    'http://localhost:8000/rules/analyze',\n", "    data=payload,\n", "    headers={'Content-Type': 'application/json'},\n", "    method='POST'\n", ")\n", "\n", "with urllib.request.urlopen(req) as r:\n", "    result = json.load(r)\n", "\n", "print('=== Analyze Result ===')\n", "print(f\"  rule_id: {result['rule_id']}\")\n", "print(f\"  pipeline_mode: {result['pipeline_mode']}\")\n", "print(f\"  summary: {result['summary']}\")\n", "print('\\n  TOP 5 Techniques:')\n", "print(f\"  {'Pred':<5} {'Technique':<13} {'Conf%':>8} {'Proba':>8} {'Thr':>6} {'Gap':>8}\")\n", "print(f\"  {'-'*55}\")\n", "for row in result['all_results'][:5]:\n", "    pred = '✅' if row['predicted'] else '  '\n", "    print(f\"  {pred:<5} {row['technique_id']:<13} {row['confidence_percent']:>7.2f}%\"\n", "          f\" {row['proba']:>8.4f} {row['threshold']:>6.2f} {row['gap']:>+8.4f}\")\n", "\n", "print(f\"\\n  Detected: {len(result['detected'])} technique(s)\")\n", "for d in result['detected']:\n", "    print(f\"  ✅ {d['technique_id']} — {d['confidence_percent']}%\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ─── WQL templates for the detected technique ────────────────\n", 
"if result['detected']:\n", "    top_technique = result['detected'][0]['technique_id']\n", "\n", "    with urllib.request.urlopen(f'http://localhost:8000/queries/{top_technique}') as r:\n", "        queries = json.load(r)\n", "\n", "    print(f'=== WQL Templates for {top_technique} ===')\n", "    for i, q in enumerate(queries, 1):\n", "        print(f\"\\n  [{i}] {q.get('purpose', 'N/A')}\")\n", "        print(f\"      Query: {q['wql_query'][:120]}...\")\n", "        print(f\"      Note:  {q.get('note', 'N/A')}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 11: Export the results (optional)\n", "\n", "Saves the results as JSON for later use on the local machine\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ─── Analyze a list of rules and export them ─────────────────\n", "import urllib.request, json, os\n", "\n", "# ✏️ Add the rule IDs you want to analyze\n", "# They can also be read from a file\n", "test_ids_path = f'{NEEDED_PATH}/test_rule_ids.json'\n", "\n", "if os.path.isfile(test_ids_path):\n", "    with open(test_ids_path) as f:\n", "        rule_ids = json.load(f)\n", "    print(f'Loaded {len(rule_ids)} rule IDs from test_rule_ids.json')\n", "else:\n", "    # Sample rule\n", "    rule_ids = ['18205']\n", "    print('Using default test rule')\n", "\n", "print(f'Processing {len(rule_ids)} rules...')\n", "\n", "export_results = []\n", "\n", "for rule_id in rule_ids:\n", "    try:\n", "        with urllib.request.urlopen(f'http://localhost:8000/results/{rule_id}') as r:\n", "            data = json.load(r)\n", "        data['source'] = 'colab_full_mode'\n", "        export_results.append(data)\n", "        detected = len(data.get('detected', []))\n", "        top = data['mappings'][0] if data['mappings'] else {}\n", "        print(f\"  ✅ {rule_id}: {top.get('technique_id','?')} ({top.get('confidence_percent','?')}%) — {detected} detected\")\n", "    except Exception as e:\n", "        print(f\"  ⚠️ {rule_id}: {e}\")\n", "\n", "# Save the results\n", "export_path = f'{NEEDED_PATH}/murshid_full_results.json'\n", "with open(export_path, 'w', encoding='utf-8') as f:\n", "    json.dump(export_results, f, ensure_ascii=False, indent=2)\n", "\n", "print(f'\\n✅ Exported {len(export_results)} results to:')\n", "print(f'   {export_path}')\n", "print('\\nYou can now import this file into the local backend')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 12: Stop the server (when finished)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Stop the server and close the tunnels\n", "try:\n", "    from pyngrok import ngrok\n", "    ngrok.kill()\n", "    print('✅ ngrok tunnel closed')\n", "except Exception:\n", "    pass\n", "\n", "try:\n", "    cf_proc.terminate()\n", "    print('✅ Cloudflare tunnel closed')\n", "except Exception:\n", "    pass\n", "\n", "try:\n", "    server_proc.terminate()\n", "    print('✅ Server stopped')\n", "except Exception:\n", "    pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Important notes\n", "\n", "### If the Colab connection drops\n", "- The server stops automatically\n", "- Re-run the cells starting from Cell 8\n", "- The tunnel URL will change; update the frontend with the new one\n", "\n", "### If a LLaMA error appears\n", "- Make sure you have access to the model: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct\n", "- Make sure HF_TOKEN is valid\n", "\n", "### Comparison with the local machine\n", "| | Colab (FULL) | Local machine (LOCAL) |\n", "|--|-------------|----------------------|\n", "| LLaMA | ✅ | ❌ |\n", "| T1484 confidence | **94.76%** | 89.29% |\n", "| Final decision | T1484 ✅ | T1484 ✅ |\n", "\n", "### For the live demo\n", "1. Run cells 1-8 in advance (15 minutes before the demo)\n", "2. Copy the Cloudflare URL\n", "3. Update the frontend\n", "4. Open `https://xxxx.trycloudflare.com/index.html`\n" ] } ], "metadata": { "accelerator": "GPU", "colab": { "gpuType": "T4", "machine_shape": "hm", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 0 }