{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# 🛡️ Murshid Backend — Full Mode on Colab\n", "\n", "**Murshid | From Alerts to Guidance: MITRE ATT&CK-Aligned Techniques Mapping for SOC Analysts**\n", "\n", "---\n", "\n", "## 📁 Required files on Google Drive\n", "\n", "```\n", "MyDrive/\n", "├── murshid_backend_for_drive.zip   ← upload it, then run Cell 3 to extract it\n", "│   or\n", "├── murshid_backend/                ← if you already extracted it\n", "│   ├── app/\n", "│   ├── alembic/\n", "│   ├── scripts/\n", "│   ├── alembic.ini\n", "│   └── requirements.txt\n", "│\n", "└── Needed/\n", "    ├── murshid_logreg_pipeline_manual_oof_pcatuned.joblib\n", "    ├── murshid_logreg_thresholds_manual_oof_pcatuned.npy\n", "    ├── murshid_label_columns.json\n", "    └── murshid_query_template_structure_clean_shared.xlsx\n", "```\n", "\n", "## Run instructions\n", "\n", "### Prerequisites\n", "1. ✅ **GPU enabled:** `Runtime → Change runtime type → T4 GPU`\n", "2. ✅ **Google Drive connected** (contains the `Needed` folder with the model files)\n", "3. 
✅ **`murshid_backend` folder** on Drive (or upload it manually)\n", "\n", "### Run order\n", "**Run the cells in order, from top to bottom. Do not skip any cell.**\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 1: Verify the GPU\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import torch\n", "\n", "print('CUDA available:', torch.cuda.is_available())\n", "if torch.cuda.is_available():\n", "    print('GPU:', torch.cuda.get_device_name(0))\n", "    print('Memory:', round(torch.cuda.get_device_properties(0).total_memory / 1e9, 1), 'GB')\n", "else:\n", "    print('⚠️ No GPU found. Change the Runtime to T4 from the menu above')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 2: Mount Google Drive\n", "\n", "> Mounts Drive and verifies that the model files and the backend code are in place.\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "from google.colab import drive\n", "import os\n", "\n", "drive.mount('/content/drive')\n", "\n", "# ✏️ Edit this path if your folder is different\n", "NEEDED_PATH = '/content/drive/MyDrive/Needed'\n", "BACKEND_PATH = '/content/drive/MyDrive/murshid_backend'\n", "ZIP_PATH = '/content/drive/MyDrive/murshid_backend_for_drive.zip'\n", "\n", "print('=' * 55)\n", "print('📂 Checking Google Drive files...')\n", "print('=' * 55)\n", "\n", "# ── Check the Needed/ model files ─────────────────────────────\n", "print('\\n📁 Needed/ (model files):')\n", "required_files = {\n", "    'murshid_logreg_pipeline_manual_oof_pcatuned.joblib': 'LogReg model',\n", "    'murshid_logreg_thresholds_manual_oof_pcatuned.npy': 'LogReg thresholds',\n", "    'murshid_label_columns.json': 'Technique names',\n", "}\n", "\n", "models_ok = True\n", "for fname, desc in required_files.items():\n", "    path = f'{NEEDED_PATH}/{fname}'\n", "    exists = os.path.isfile(path)\n", "    size = f'{os.path.getsize(path)/1024:.0f} KB' if exists else ''\n", "    status = '✅' if exists else '❌'\n", "    print(f'  {status} {fname} {size}')\n", "    if not exists:\n", "        models_ok = False\n", "\n", "excel_path = f'{NEEDED_PATH}/murshid_query_template_structure_clean_shared.xlsx'\n", "excel_ok = os.path.isfile(excel_path)\n", "print(f'  {\"✅\" if excel_ok else \"⚠️ \"} murshid_query_template_structure_clean_shared.xlsx (optional)')\n", "\n", "# ── Check the backend code ────────────────────────────────────\n", "print('\\n📁 murshid_backend/ (backend code):')\n", "backend_ok = os.path.isdir(BACKEND_PATH)\n", "zip_ok = os.path.isfile(ZIP_PATH)\n", "\n", "if backend_ok:\n", "    fcount = sum(len(f) for _, _, f in os.walk(BACKEND_PATH))\n", "    print(f'  ✅ murshid_backend/ ({fcount} 
files)')\n", "elif zip_ok:\n", "    zsize = f'{os.path.getsize(ZIP_PATH)/1024:.0f} KB'\n", "    print(f'  📦 murshid_backend_for_drive.zip ({zsize}), will be extracted automatically in Cell 3')\n", "else:\n", "    print('  ❌ murshid_backend/ not found')\n", "    print('  ❌ murshid_backend_for_drive.zip not found')\n", "    print('\\n  ⚠️ Upload murshid_backend_for_drive.zip to:')\n", "    print('     Google Drive → My Drive')\n", "\n", "# ── Summary ───────────────────────────────────────────────────\n", "print('\\n' + '=' * 55)\n", "if models_ok and (backend_ok or zip_ok):\n", "    print('✅ Everything is ready. Continue running the cells')\n", "elif not models_ok:\n", "    print('❌ Model files are missing from Needed/. Upload them first')\n", "else:\n", "    print('❌ Backend files are missing. Upload the ZIP first')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 3: Copy the backend to /content\n", "\n", "> This cell automatically:\n", "> 1. Extracts the ZIP from Drive (if a ZIP exists and has not been extracted yet)\n", "> 2. Copies the `murshid_backend` folder to `/content` (much faster reads than Drive)\n", "> 3. Sets the Python path\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import shutil, os, zipfile, sys\n", "\n", "DRIVE_BASE = '/content/drive/MyDrive'\n", "ZIP_PATH = f'{DRIVE_BASE}/murshid_backend_for_drive.zip'\n", "BACKEND_DRIVE = f'{DRIVE_BASE}/murshid_backend'\n", "BACKEND_LOCAL = '/content/murshid_backend'\n", "\n", "# ── Step 1: extract the ZIP from Drive if needed ──────────────\n", "if not os.path.isdir(BACKEND_DRIVE):\n", "    if os.path.isfile(ZIP_PATH):\n", "        print('📦 ZIP found, extracting to Drive...')\n", "        with zipfile.ZipFile(ZIP_PATH, 'r') as z:\n", "            z.extractall(DRIVE_BASE)\n", "        print(f'✅ Extracted to {BACKEND_DRIVE}')\n", "    else:\n", "        print('❌ ERROR: the murshid_backend folder does not exist on Drive')\n", "        print(f'   Expected: {BACKEND_DRIVE}')\n", "        print(f'   Or upload: {ZIP_PATH}')\n", "        raise FileNotFoundError('Backend not found. Upload murshid_backend_for_drive.zip to Google Drive MyDrive.')\n", "else:\n", "    print(f'✅ murshid_backend found on Drive: {BACKEND_DRIVE}')\n", "\n", "# ── Step 2: copy to /content (much faster than Drive at runtime) ─\n", "if os.path.exists(BACKEND_LOCAL):\n", "    shutil.rmtree(BACKEND_LOCAL)\n", "\n", "shutil.copytree(\n", "    BACKEND_DRIVE,\n", "    BACKEND_LOCAL,\n", "    ignore=shutil.ignore_patterns('__pycache__', '*.pyc', '.venv', '*.db', '*.log')\n", ")\n", "\n", "# ── Step 3: add to the Python path ────────────────────────────\n", "if BACKEND_LOCAL not in sys.path:\n", "    sys.path.insert(0, BACKEND_LOCAL)\n", "\n", "os.chdir(BACKEND_LOCAL)\n", "\n", "# ── Verify ────────────────────────────────────────────────────\n", "file_count = sum(len(files) for _, _, files in os.walk(BACKEND_LOCAL))\n", "print(f'✅ Backend ready at {BACKEND_LOCAL} ({file_count} files)')\n", "print(f'✅ Working dir: {os.getcwd()}')\n", "\n", "# Show the structure\n", "print('\\nStructure:')\n", "for item in sorted(os.listdir(BACKEND_LOCAL)):\n", "    full = os.path.join(BACKEND_LOCAL, item)\n", "    if os.path.isdir(full):\n", "        sub_count = len(os.listdir(full))\n", "        print(f'  📁 {item}/ ({sub_count} items)')\n", "    else:\n", "        size = os.path.getsize(full)\n", "        print(f'  📄 {item} ({size:,} bytes)')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 4: Install requirements\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print('📦 Installing requirements...')\n", "\n", "# ── Core backend packages ─────────────────────────────────────\n", "!pip install -q \\\n", "  fastapi==0.115.0 \\\n", "  \"uvicorn[standard]==0.32.0\" \\\n", "  pydantic==2.9.0 \\\n", "  pydantic-settings==2.6.0 \\\n", "  python-dotenv==1.0.0 \\\n", "  sqlalchemy==2.0.0 \\\n", "  alembic==1.13.0 \\\n", "  aiofiles \\\n", "  scikit-learn==1.6.1 \\\n", "  joblib \\\n", "  lxml \\\n", "  openpyxl \\\n", "  nest-asyncio \\\n", "  pyngrok\n", "\n", "# ── bitsandbytes: required to load LLaMA in 4-bit on GPU ──────\n", "print('📦 Installing bitsandbytes (required for LLaMA 4-bit)...')\n", "!pip install -q -U \"bitsandbytes>=0.46.1\"\n", "\n", "# ── accelerate: required for device_map=\"auto\" ────────────────\n", "!pip install -q -U accelerate\n", "\n", "# ── Verify the installation ───────────────────────────────────\n", "import importlib\n", "for pkg in ['bitsandbytes', 'accelerate', 'fastapi', 'sklearn']:\n", "    try:\n", "        mod = importlib.import_module(pkg)\n", "        ver = getattr(mod, '__version__', '?')\n", "        print(f'  ✅ {pkg}=={ver}')\n", "    except ImportError:\n", "        print(f'  ❌ {pkg} failed to install')\n", "\n", "print('\\n✅ All requirements installed')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 5: Create the .env file\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os\n", "\n", "# ✏️ Put your HF token here if you did not add it via Colab Secrets\n", "HF_TOKEN = os.environ.get('HF_TOKEN', 'ENTER_YOUR_TOKEN')\n", "\n", "env_content = f\"\"\"# Auto-generated .env for Colab FULL mode\n", "MURSHID_DB_URL=sqlite:////content/murshid.db\n", "MURSHID_MODELS_DIR={NEEDED_PATH}\n", "HF_TOKEN={HF_TOKEN}\n", "MURSHID_SKIP_LLM=false\n", "SECRET_KEY=murshid_colab_2026\n", "LLAMA_MODEL_ID=meta-llama/Meta-Llama-3-8B-Instruct\n", "EMBED_MODEL_ID=ehsanaghaei/SecureBERT_Plus\n", "LOGREG_JOBLIB=murshid_logreg_pipeline_manual_oof_pcatuned.joblib\n", "LOGREG_THRESHOLDS_NPY=murshid_logreg_thresholds_manual_oof_pcatuned.npy\n", "LABEL_COLUMNS_JSON=murshid_label_columns.json\n", "\"\"\"\n", "\n", "env_path = '/content/murshid_backend/.env'\n", "with open(env_path, 'w') as f:\n", "    f.write(env_content)\n", "\n", "print('✅ .env created at', env_path)\n", "print('\\nContents:')\n", "with open(env_path) as f:\n", "    for line in f:\n", "        if 'TOKEN' in line or 'SECRET' in line:\n", "            key = line.split('=')[0]\n", "            print(f'  {key}=****')\n", "        else:\n", "            print('  ', line.rstrip())" ] 
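}, { "cell_type": "markdown", "metadata": {}, "source": [ "> **Illustrative sketch (an assumption, not the backend's actual code):** the backend loads `LLAMA_MODEL_ID` in 4-bit, which is why Cell 4 installs `bitsandbytes` (the quantization kernels) and `accelerate` (which enables `device_map=\"auto\"`). With standard `transformers` usage this looks roughly like:\n", ">\n", "> ```python\n", "> import torch\n", "> from transformers import AutoModelForCausalLM, BitsAndBytesConfig\n", ">\n", "> bnb_config = BitsAndBytesConfig(\n", ">     load_in_4bit=True,            # store weights in 4-bit (NF4)\n", ">     bnb_4bit_quant_type='nf4',\n", ">     bnb_4bit_compute_dtype=torch.float16,\n", "> )\n", "> model = AutoModelForCausalLM.from_pretrained(\n", ">     'meta-llama/Meta-Llama-3-8B-Instruct',\n", ">     quantization_config=bnb_config,\n", ">     device_map='auto',            # accelerate places layers on the GPU\n", ">     token=HF_TOKEN,\n", "> )\n", "> ```\n" ] 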
}, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 6: Database migration (Alembic)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import subprocess, os\n", "\n", "os.chdir('/content/murshid_backend')\n", "\n", "result = subprocess.run(\n", "    ['python', '-m', 'alembic', 'upgrade', 'head'],\n", "    capture_output=True, text=True\n", ")\n", "\n", "print(result.stdout)\n", "if result.stderr:\n", "    print(result.stderr)\n", "\n", "db_exists = os.path.isfile('/content/murshid.db')\n", "print('✅ Database ready: /content/murshid.db' if db_exists else '❌ Database was not created')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 7: Import WQL templates from Excel\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import os, sys\n", "sys.path.insert(0, '/content/murshid_backend')\n", "os.chdir('/content/murshid_backend')\n", "\n", "excel_path = f'{NEEDED_PATH}/murshid_query_template_structure_clean_shared.xlsx'\n", "\n", "if os.path.isfile(excel_path):\n", "    from app.db.session import SessionLocal\n", "    from scripts.import_excel_templates import run as import_excel\n", "\n", "    db = SessionLocal()\n", "    try:\n", "        result = import_excel(db, replace=False)\n", "        print('✅ Excel import result:')\n", "        for k, v in result.items():\n", "            print(f'  {k}: {v}')\n", "    finally:\n", "        db.close()\n", "else:\n", "    print(f'⚠️ Excel file not found at: {excel_path}')\n", "    print('   You can continue; templates can be added manually later')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 8: Run FastAPI + Cloudflare Tunnel\n", "\n", "> ⏳ This cell takes **5-10 minutes** to load LLaMA (4.5GB) and SecureBERT+\n", "\n", "> 🔑 **The public URL appears at the end.** Copy it for the frontend\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import subprocess, time, os, sys, urllib.request\n", "import 
nest_asyncio\n", "nest_asyncio.apply()\n", "\n", "os.chdir('/content/murshid_backend')\n", "\n", "# ─── Check bitsandbytes before starting the server ───────────\n", "try:\n", "    import bitsandbytes as bnb\n", "    print(f'✅ bitsandbytes {bnb.__version__}')\n", "except ImportError:\n", "    print('❌ bitsandbytes is not installed. Run Cell 4 first')\n", "    raise\n", "\n", "# ─── Start uvicorn ───────────────────────────────────────────\n", "log_path = '/content/murshid_server.log'\n", "log_file = open(log_path, 'w')\n", "\n", "server_proc = subprocess.Popen(\n", "    [\n", "        'python', '-m', 'uvicorn', 'app.main:app',\n", "        '--host', '0.0.0.0',\n", "        '--port', '8000',\n", "        '--log-level', 'info'\n", "    ],\n", "    cwd='/content/murshid_backend',\n", "    stdout=log_file,\n", "    stderr=subprocess.STDOUT\n", ")\n", "\n", "print('⏳ Loading LLaMA 3 8B + SecureBERT+...')\n", "print('   Loading in progress. Wait until the final message appears')\n", "\n", "# ─── Smart wait loop, tailing the log ────────────────────────\n", "started = False\n", "last_log_size = 0\n", "\n", "for i in range(180):  # 15 minutes max\n", "    time.sleep(5)\n", "\n", "    # Check whether the server is up\n", "    try:\n", "        resp = urllib.request.urlopen('http://localhost:8000/health', timeout=3)\n", "        if resp.status == 200:\n", "            started = True\n", "            break\n", "    except Exception:\n", "        pass\n", "\n", "    # Show new log output every 30 seconds\n", "    if i % 6 == 0:\n", "        elapsed = (i + 1) * 5\n", "        log_file.flush()\n", "        try:\n", "            with open(log_path) as f:\n", "                log_content = f.read()\n", "            new_content = log_content[last_log_size:]\n", "            last_log_size = len(log_content)\n", "\n", "            # Check for an early failure\n", "            if 'ERROR' in new_content or 'ImportError' in new_content:\n", "                print(f'\\n❌ Server error at {elapsed}s:')\n", "                # Show the last 1500 characters of the log\n", "                print(log_content[-1500:])\n", "                server_proc.terminate()\n", "                log_file.close()\n", "                raise RuntimeError('Server failed to start. See log above.')\n", "\n", "            # Show what has been loaded so far\n", "            if 'Loaded' in new_content or 'loaded' in new_content or 'Application' in new_content:\n", "                for line in new_content.strip().split('\\n'):\n", "                    if any(k in line for k in ['INFO', 'Loaded', 'loaded', 'Application', 'WARNING']):\n", "                        print(f'  {line.strip()}')\n", "            else:\n", "                mins = elapsed // 60\n", "                secs = elapsed % 60\n", "                print(f'  ⏳ {mins}m {secs}s: models are still loading...')\n", "        except RuntimeError:\n", "            raise\n", "        except Exception:\n", "            print(f'  ⏳ {elapsed}s elapsed...')\n", "\n", "log_file.flush()\n", "log_file.close()\n", "\n", "if not started:\n", "    print('\\n❌ Server did not start after 15 minutes.')\n", "    print('─── Last log lines ───')\n", "    with open(log_path) as f:\n", "        print(f.read()[-3000:])\n", "else:\n", "    print('\\n✅ Server started successfully!')\n", "\n", "    # ─── Cloudflare Tunnel (free, no account needed) ─────────\n", "    import re\n", "\n", "    # Install cloudflared\n", "    subprocess.run(\n", "        ['wget', '-q', 'https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64',\n", "         '-O', '/usr/local/bin/cloudflared'],\n", "        check=True\n", "    )\n", "    subprocess.run(['chmod', '+x', '/usr/local/bin/cloudflared'], check=True)\n", "    print('✅ cloudflared installed')\n", "\n", "    # Start the tunnel\n", "    cf_log = open('/content/cloudflared.log', 'w')\n", "    cf_proc = subprocess.Popen(\n", "        ['cloudflared', 'tunnel', '--url', 'http://localhost:8000'],\n", "        stdout=cf_log, stderr=subprocess.STDOUT\n", "    )\n", "\n", "    # Wait for the URL to appear in the log\n", "    public_url = None\n", "    for _ in range(30):\n", "        time.sleep(2)\n", "        cf_log.flush()\n", "        try:\n", "            with open('/content/cloudflared.log') as f:\n", "                content = f.read()\n", "            match = re.search(r'https://[a-z0-9\\-]+\\.trycloudflare\\.com', content)\n", "            if match:\n", "                public_url = match.group(0)\n", "                break\n", "        except Exception:\n", "            pass\n", "\n", "    if public_url:\n", "        print('\\n' + 
'='*60)\n", "        print('🌐 PUBLIC URL (Cloudflare):')\n", "        print(f'   {public_url}')\n", "        print('='*60)\n", "        print(f'📖 Swagger:    {public_url}/docs')\n", "        print(f'💚 Health:     {public_url}/health')\n", "        print(f'🗄️ DB Summary: {public_url}/api/db/summary')\n", "        print('='*60)\n", "        print('\\n📋 Copy this line and paste it into the frontend (index.html):')\n", "        print(f\"   const BASE = '{public_url}';\")\n", "    else:\n", "        print('⚠️ Cloudflare tunnel URL not found, check /content/cloudflared.log')\n", "        with open('/content/cloudflared.log') as f:\n", "            print(f.read()[-1000:])" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ─── Start the Cloudflare Tunnel separately (if Cell 8's tunnel failed) ─\n", "# Run this cell only if the server is running but the tunnel failed\n", "\n", "import subprocess, re, time, os\n", "\n", "# Install cloudflared if it is not installed yet\n", "if not os.path.isfile('/usr/local/bin/cloudflared'):\n", "    subprocess.run(\n", "        ['wget', '-q',\n", "         'https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64',\n", "         '-O', '/usr/local/bin/cloudflared'],\n", "        check=True\n", "    )\n", "    subprocess.run(['chmod', '+x', '/usr/local/bin/cloudflared'], check=True)\n", "    print('✅ cloudflared installed')\n", "else:\n", "    print('✅ cloudflared already installed')\n", "\n", "# Start the tunnel\n", "cf_log_path = '/content/cloudflared.log'\n", "cf_log = open(cf_log_path, 'w')\n", "cf_proc = subprocess.Popen(\n", "    ['cloudflared', 'tunnel', '--url', 'http://localhost:8000'],\n", "    stdout=cf_log, stderr=subprocess.STDOUT\n", ")\n", "\n", "print('⏳ Opening Cloudflare tunnel...')\n", "\n", "public_url = None\n", "for _ in range(30):\n", "    time.sleep(2)\n", "    cf_log.flush()\n", "    try:\n", "        with open(cf_log_path) as f:\n", "            content = f.read()\n", "        match = re.search(r'https://[a-z0-9\\-]+\\.trycloudflare\\.com', content)\n", "        if match:\n", "            public_url = match.group(0)\n", "            break\n", "    except Exception:\n", "        pass\n", "\n", "if public_url:\n", "    print('\\n' + '='*60)\n", "    print(f'🌐 PUBLIC URL: {public_url}')\n", "    print(f'📖 Swagger: {public_url}/docs')\n", "    print(f'💚 Health:  {public_url}/health')\n", "    print('='*60)\n", "    print('\\n📋 Paste this line into index.html:')\n", "    print(f\"   const BASE = '{public_url}';\")\n", "else:\n", "    print('❌ No URL found. Log:')\n", "    with open(cf_log_path) as f:\n", "        print(f.read())\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 9: Connect the frontend to the Cloudflare URL\n", "\n", "After running the previous cell, a message like this appears:\n", "```\n", "🌐 PUBLIC URL: https://xxxx-xxxx.trycloudflare.com\n", "```\n", "\n", "**The cell below updates the frontend automatically.** You can also edit `index.html` by hand:\n", "```javascript\n", "const BASE = 'https://xxxx-xxxx.trycloudflare.com';\n", "```\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import subprocess, re, time, os\n", "\n", "# ── Step 1: install cloudflared ──────────────────────────────\n", "if not os.path.isfile('/usr/local/bin/cloudflared'):\n", "    subprocess.run([\n", "        'wget', '-q',\n", "        'https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64',\n", "        '-O', '/usr/local/bin/cloudflared'\n", "    ], check=True)\n", "    subprocess.run(['chmod', '+x', '/usr/local/bin/cloudflared'], check=True)\n", "    print('✅ cloudflared installed')\n", "else:\n", "    print('✅ cloudflared ready')\n", "\n", "# ── Step 2: start the tunnel ─────────────────────────────────\n", "cf_log_path = '/content/cf.log'\n", "cf_log = open(cf_log_path, 'w')\n", "subprocess.Popen(\n", "    ['cloudflared', 'tunnel', '--url', 'http://localhost:8000'],\n", "    stdout=cf_log, stderr=subprocess.STDOUT\n", ")\n", "\n", "print('⏳ Opening Cloudflare tunnel...')\n", "\n", "# ── Step 3: wait for the URL ─────────────────────────────────\n", "public_url = None\n", "for _ in range(30):\n", "    time.sleep(2)\n", "    cf_log.flush()\n", "    with open(cf_log_path) as f:\n", "        content = f.read()\n", "    match = re.search(r'https://[a-z0-9\\-]+\\.trycloudflare\\.com', content)\n", "    if match:\n", "        public_url = match.group(0)\n", "        break\n", "\n", "if not public_url:\n", "    print('❌ Tunnel failed. Log:')\n", "    with open(cf_log_path) as f: print(f.read())\n", "else:\n", "    # ── Step 4: update index.html automatically ──────────────\n", "    frontend_path = '/content/drive/MyDrive/murshid_frontend/index.html'\n", "\n", "    if os.path.isfile(frontend_path):\n", "        with open(frontend_path, 'r', encoding='utf-8') as f:\n", "            html = f.read()\n", "        html_updated = re.sub(r\"const BASE = '[^']*';\",\n", "                              f\"const BASE = '{public_url}';\", html)\n", "        with open(frontend_path, 'w', encoding='utf-8') as f:\n", "            f.write(html_updated)\n", "        print('✅ index.html updated automatically')\n", "    else:\n", "        print('⚠️ index.html not found. Edit it manually')\n", "\n", "    print('\\n' + '='*60)\n", "    print(f'🌐 PUBLIC URL: {public_url}')\n", "    print(f'📖 Swagger:  {public_url}/docs')\n", "    print(f'💚 Health:   {public_url}/health')\n", "    print(f'🖥️ Frontend: {public_url}/index.html')\n", "    print('='*60)\n", "    print(f\"\\n📋 const BASE = '{public_url}';\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 10: Test the API\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import urllib.request, json\n", "\n", "# ─── Health check ────────────────────────────────────────────\n", "with urllib.request.urlopen('http://localhost:8000/health') as r:\n", "    health = json.load(r)\n", "\n", "print('=== Health Check ===')\n", "print(f\"  status:          {health['status']}\")\n", "print(f\"  pipeline_mode:   {health['pipeline_mode']}\")\n", "print(f\"  llama_loaded:    {health['components']['llama_loaded']}\")\n", "print(f\"  embedder_loaded: {health['components']['embedder_loaded']}\")\n", "print(f\"  logreg_loaded:   {health['components']['logreg_loaded']}\")\n", "print(f\"  cuda_available:  
{health['components']['cuda_available']}\")\n", "\n", "mode = health.get('pipeline_mode', 'unknown')\n", "if mode == 'full':\n", "    print('\\n✅ FULL mode: results match the notebook 100%')\n", "elif mode == 'local':\n", "    print('\\n⚠️ LOCAL mode: LLaMA was not loaded, check MURSHID_SKIP_LLM=false')\n", "else:\n", "    print('\\n❌ LITE mode: check that torch and the models are installed')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ─── Analyze a test rule ─────────────────────────────────────\n", "import urllib.request, json\n", "\n", "# NOTE: the XML tags of this sample rule were stripped in the original\n", "# notebook; they are reconstructed here as a standard Wazuh-style rule\n", "# (the level attribute is an assumption).\n", "test_rule = '''\n", "<rule id=\"18201\" level=\"8\">\n", "  <id>^634$|^4730$</id>\n", "  <description>Windows: Security Enabled Global Group Deleted</description>\n", "  <mitre><id>T1484</id></mitre>\n", "  <group>group_deleted,win_group_deleted</group>\n", "</rule>\n", "'''\n", "\n", "payload = json.dumps({'rule_xml': test_rule}).encode()\n", "req = urllib.request.Request(\n", "    'http://localhost:8000/rules/analyze',\n", "    data=payload,\n", "    headers={'Content-Type': 'application/json'},\n", "    method='POST'\n", ")\n", "\n", "with urllib.request.urlopen(req) as r:\n", "    result = json.load(r)\n", "\n", "print('=== Analyze Result ===')\n", "print(f\"  rule_id: {result['rule_id']}\")\n", "print(f\"  pipeline_mode: {result['pipeline_mode']}\")\n", "print(f\"  summary: {result['summary']}\")\n", "print('\\n  TOP 5 Techniques:')\n", "print(f\"  {'Pred':<5} {'Technique':<13} {'Conf%':>8} {'Proba':>8} {'Thr':>6} {'Gap':>8}\")\n", "print(f\"  {'-'*55}\")\n", "for row in result['all_results'][:5]:\n", "    pred = '✅' if row['predicted'] else '  '\n", "    print(f\"  {pred:<5} {row['technique_id']:<13} {row['confidence_percent']:>7.2f}%\"\n", "          f\" {row['proba']:>8.4f} {row['threshold']:>6.2f} {row['gap']:>+8.4f}\")\n", "\n", "print(f\"\\n  Detected: {len(result['detected'])} technique(s)\")\n", "for d in result['detected']:\n", "    print(f\"  ✅ {d['technique_id']} — {d['confidence_percent']}%\")" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ─── WQL templates for the detected technique ────────────────\n", 
"if result['detected']:\n", "    top_technique = result['detected'][0]['technique_id']\n", "\n", "    with urllib.request.urlopen(f'http://localhost:8000/queries/{top_technique}') as r:\n", "        queries = json.load(r)\n", "\n", "    print(f'=== WQL Templates for {top_technique} ===')\n", "    for i, q in enumerate(queries, 1):\n", "        print(f\"\\n  [{i}] {q.get('purpose', 'N/A')}\")\n", "        print(f\"      Query: {q['wql_query'][:120]}...\")\n", "        print(f\"      Note:  {q.get('note', 'N/A')}\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 11: Export the results (optional)\n", "\n", "Saves the results as JSON for later use on the local machine\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# ─── Analyze a list of rules and export them ─────────────────\n", "import urllib.request, json, os\n", "\n", "# ✏️ Add the rule IDs you want to analyze\n", "# They can also be read from a file\n", "test_ids_path = f'{NEEDED_PATH}/test_rule_ids.json'\n", "\n", "if os.path.isfile(test_ids_path):\n", "    with open(test_ids_path) as f:\n", "        rule_ids = json.load(f)\n", "    print(f'Loaded {len(rule_ids)} rule IDs from test_rule_ids.json')\n", "else:\n", "    # Sample rule\n", "    rule_ids = ['18205']\n", "    print('Using default test rule')\n", "\n", "print(f'Processing {len(rule_ids)} rules...')\n", "\n", "export_results = []\n", "\n", "for rule_id in rule_ids:\n", "    try:\n", "        with urllib.request.urlopen(f'http://localhost:8000/results/{rule_id}') as r:\n", "            data = json.load(r)\n", "        data['source'] = 'colab_full_mode'\n", "        export_results.append(data)\n", "        detected = len(data.get('detected', []))\n", "        top = data['mappings'][0] if data['mappings'] else {}\n", "        print(f\"  ✅ {rule_id}: {top.get('technique_id','?')} ({top.get('confidence_percent','?')}%) — {detected} detected\")\n", "    except Exception as e:\n", "        print(f\"  ⚠️ {rule_id}: {e}\")\n", "\n", "# Save the results\n", "export_path = f'{NEEDED_PATH}/murshid_full_results.json'\n", "with open(export_path, 'w', encoding='utf-8') as f:\n", "    json.dump(export_results, f, ensure_ascii=False, indent=2)\n", "\n", "print(f'\\n✅ Exported {len(export_results)} results to:')\n", "print(f'   {export_path}')\n", "print('\\nYou can now import this file into the local backend')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Cell 12: Stop the server (when finished)\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Stop the server and close the tunnels\n", "try:\n", "    from pyngrok import ngrok\n", "    ngrok.kill()\n", "    print('✅ ngrok tunnel closed')\n", "except Exception:\n", "    pass\n", "\n", "try:\n", "    cf_proc.terminate()\n", "    print('✅ Cloudflare tunnel closed')\n", "except Exception:\n", "    pass\n", "\n", "try:\n", "    server_proc.terminate()\n", "    print('✅ Server stopped')\n", "except Exception:\n", "    pass" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Important notes\n", "\n", "### If the Colab connection drops\n", "- The server stops automatically\n", "- Re-run the cells starting from Cell 8\n", "- The tunnel URL will change; update the frontend with the new one\n", "\n", "### If a LLaMA error appears\n", "- Make sure you have access to the model: https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct\n", "- Make sure HF_TOKEN is valid\n", "\n", "### Comparison with the local machine\n", "| | Colab (FULL) | Local machine (LOCAL) |\n", "|--|-------------|----------------------|\n", "| LLaMA | ✅ | ❌ |\n", "| T1484 confidence | **94.76%** | 89.29% |\n", "| Final decision | T1484 ✅ | T1484 ✅ |\n", "\n", "### For the live demo\n", "1. Run cells 1-8 in advance (15 minutes before the demo)\n", "2. Copy the Cloudflare URL\n", "3. Update the frontend\n", "4. Open `https://xxxx.trycloudflare.com/index.html`\n" ] } ], "metadata": { "accelerator": "GPU", "colab": { "gpuType": "T4", "machine_shape": "hm", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "name": "python3" }, "language_info": { "name": "python" } }, "nbformat": 4, "nbformat_minor": 0 }