
Team Red Privacy Gateway Adapter

A QLoRA adapter fine-tuned on top of unsloth/gpt-oss-20b-bnb-4bit (OpenAI GPT-OSS 20B, 4-bit quantized).

Purpose

This adapter implements a local privacy gateway for cybersecurity scan artifacts. It sits between raw scan data and hosted reasoning (Gemini), and:

  • Detects and replaces institution-specific identifiers with stable placeholders ([ASSET_01], [DOMAIN_01], etc.)
  • Preserves cybersecurity meaning and semantic context for downstream reasoning
  • Tags each entity with a reasoning_hint and restore_audiences list
  • Refuses any reverse-lookup or reveal requests with a structured refusal packet
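The placeholder scheme above can be sketched as a deterministic substitution pass. This is an illustrative sketch only, not the adapter's actual logic; the regex pattern and function name are hypothetical, and the numbering mirrors the stable `[TYPE_NN]` convention described above.

```python
import re

def assign_placeholders(text, patterns):
    """Replace each distinct matched value with a stable numbered placeholder.
    The same original value always maps to the same placeholder."""
    mapping = {}   # original value -> placeholder
    counters = {}  # entity type -> next index
    for etype, regex in patterns.items():
        for match in re.findall(regex, text):
            if match not in mapping:
                counters[etype] = counters.get(etype, 0) + 1
                mapping[match] = f"[{etype}_{counters[etype]:02d}]"
    for original, placeholder in mapping.items():
        text = text.replace(original, placeholder)
    return text, mapping

# Hypothetical pattern for illustration only
patterns = {"DOMAIN": r"\b[\w-]+\.example\.edu\b"}
sanitized, table = assign_placeholders(
    "portal.example.edu resolves to the same host as vpn.example.edu",
    patterns,
)
print(sanitized)  # -> [DOMAIN_01] resolves to the same host as [DOMAIN_02]
```

Because placeholders are stable per value, cross-references between findings (e.g. two scans hitting the same domain) survive sanitization, which is what keeps the downstream reasoning useful.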

Training Details

Parameter          Value
Base model         unsloth/gpt-oss-20b-bnb-4bit
Method             QLoRA (4-bit base + LoRA r=16, alpha=32)
Training examples  4,800 (train) + 800 (validation)
Epochs             2
Final train loss   0.00145
Training hardware  NVIDIA A100 80GB (Modal.com)
Training time      ~2 hours

Usage

from unsloth import FastLanguageModel
from peft import PeftModel
import json

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/gpt-oss-20b-bnb-4bit",
    max_seq_length=3072,
    load_in_4bit=True,
)
model = PeftModel.from_pretrained(model, "aavhawkeye/privacy-gateway-adapter")
FastLanguageModel.for_inference(model)

query = {
    "task": "privacy_gateway",
    "query": "Prepare this finding bundle for Gemini. Hide local identifiers but preserve the cybersecurity meaning.",
    "context": { ... }
}

prompt = (
    "### Instruction:\n"
    "You are Team Red's local privacy gateway. Transform the raw cybersecurity context "
    "into the exact JSON packet contract used before hosted reasoning. Preserve meaning, "
    "create stable placeholders, and keep restore audiences on each entity.\n\n"
    f"### Input:\n{json.dumps(query)}\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
# do_sample=True is required for temperature to take effect in generate()
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.05, do_sample=True)
# Slice off the prompt tokens before decoding
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(json.loads(response))

Output Contract

Sanitization output

{
  "gateway_mode": "deterministic_local",
  "sanitized_query": "...",
  "sanitized_context": {},
  "entity_count": 6,
  "entities": [
    {
      "placeholder": "[ASSET_01]",
      "type": "ASSET",
      "original_value": "District Sign-In Hub",
      "reasoning_hint": "education / identity_portal / critical",
      "restore_audiences": ["SYSADMIN"]
    }
  ],
  "reasoning_hints": [...]
}

Refusal output (for reverse-lookup attempts)

{
  "decision": "refuse",
  "policy": "no_reverse_lookup",
  "message": "Do not reveal, reconstruct, or export protected identifiers from the local placeholder map.",
  "safe_next_step": "Keep the placeholder map local and use deterministic local rehydration for approved sysadmins."
}
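A caller can distinguish the two packet types by inspecting the keys. A sketch under the contracts above (the handler name is hypothetical; field names follow the two packets shown):

```python
def handle_gateway_output(packet):
    """Route a gateway response: refusal packets carry a "decision" key,
    sanitization packets carry "sanitized_query"."""
    if packet.get("decision") == "refuse":
        return ("refused", packet["safe_next_step"])
    return ("sanitized", packet["sanitized_query"])

print(handle_gateway_output({
    "decision": "refuse",
    "policy": "no_reverse_lookup",
    "message": "Do not reveal protected identifiers.",
    "safe_next_step": "Keep the placeholder map local.",
}))
# -> ('refused', 'Keep the placeholder map local.')
```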

Hardware Requirements

Minimum ~18GB VRAM in 4-bit mode. Tested on A100 40GB and A100 80GB.
