Team Red Privacy Gateway Adapter

A QLoRA adapter fine-tuned on top of unsloth/gpt-oss-20b-bnb-4bit (OpenAI GPT-OSS 20B, 4-bit quantized).

Purpose

This adapter implements a local privacy gateway for cybersecurity scan artifacts. It sits between raw scan data and a hosted reasoning model (Gemini), and it:

Detects and replaces institution-specific identifiers with stable placeholders ([ASSET_01], [DOMAIN_01], etc.)
Preserves cybersecurity meaning and semantic context for downstream reasoning
Tags each entity with a reasoning_hint and restore_audiences list
Refuses any reverse-lookup or reveal requests with a structured refusal packet
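
To make the idea of "stable placeholders" concrete, here is a minimal sketch of deterministic placeholder assignment. It is not how the adapter works internally (the adapter is a fine-tuned model); `PlaceholderMap`, its methods, and the domain `login.example.edu` are hypothetical names used only for illustration. The point it shows is that the same identifier maps to the same placeholder across calls.

```python
from collections import defaultdict


class PlaceholderMap:
    """Hypothetical helper: assigns stable placeholders such as [ASSET_01]."""

    def __init__(self):
        self._by_value = {}                # (type, value) -> placeholder
        self._counters = defaultdict(int)  # type -> last index handed out

    def placeholder_for(self, entity_type, value):
        key = (entity_type, value)
        if key not in self._by_value:
            self._counters[entity_type] += 1
            index = self._counters[entity_type]
            self._by_value[key] = f"[{entity_type}_{index:02d}]"
        return self._by_value[key]

    def sanitize(self, text, known_entities):
        """Replace each known (type, value) pair found in text."""
        # Longest values first so a short value never clobbers a longer one.
        for entity_type, value in sorted(known_entities, key=lambda e: -len(e[1])):
            if value in text:
                text = text.replace(value, self.placeholder_for(entity_type, value))
        return text


pmap = PlaceholderMap()
# Assumed example identifiers; the domain is made up for this sketch.
entities = [("ASSET", "District Sign-In Hub"), ("DOMAIN", "login.example.edu")]
first = pmap.sanitize("District Sign-In Hub at login.example.edu is exposed.", entities)
second = pmap.sanitize("Rescan login.example.edu tomorrow.", entities)
print(first)
print(second)
```

Because the map is kept in one object, a second document that mentions the same domain reuses `[DOMAIN_01]` rather than minting a new placeholder.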

Training Details

Parameter	Value
Base model	`unsloth/gpt-oss-20b-bnb-4bit`
Method	QLoRA (4-bit base + LoRA r=16, alpha=32)
Training examples	4,800 (train) + 800 (validation)
Epochs	2
Final train loss	0.00145
Training hardware	NVIDIA A100 80GB (Modal.com)
Training time	~2 hours

Usage

from unsloth import FastLanguageModel
from peft import PeftModel
import json

model, tokenizer = FastLanguageModel.from_pretrained(
    "unsloth/gpt-oss-20b-bnb-4bit",
    max_seq_length=3072,
    load_in_4bit=True,
)
model = PeftModel.from_pretrained(model, "aavhawkeye/privacy-gateway-adapter")
FastLanguageModel.for_inference(model)

query = {
    "task": "privacy_gateway",
    "query": "Prepare this finding bundle for Gemini. Hide local identifiers but preserve the cybersecurity meaning.",
    "context": { ... }  # placeholder: supply the raw scan context as a JSON-serializable dict
}

prompt = (
    "### Instruction:\n"
    "You are Team Red's local privacy gateway. Transform the raw cybersecurity context "
    "into the exact JSON packet contract used before hosted reasoning. Preserve meaning, "
    "create stable placeholders, and keep restore audiences on each entity.\n\n"
    f"### Input:\n{json.dumps(query)}\n\n"
    "### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
# temperature only takes effect when sampling is enabled
outputs = model.generate(**inputs, max_new_tokens=1024, do_sample=True, temperature=0.05)
# Slice off the prompt tokens so only the newly generated text is decoded
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(json.loads(response))
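
Calling `json.loads` directly on the decoded text fails if the model emits anything before or after the JSON packet. As a hedged sketch, the hypothetical helper `extract_first_json_object` below scans for the first parseable JSON object using the standard library's `json.JSONDecoder.raw_decode`. It assumes the packet is the first valid object in the response, which the source does not guarantee.

```python
import json


def extract_first_json_object(text):
    """Return the first JSON object found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at index `start`
            # and ignores whatever follows it.
            obj, _end = decoder.raw_decode(text, start)
            return obj
        except json.JSONDecodeError:
            start = text.find("{", start + 1)
    return None


# Assumed example of a response with stray text around the packet.
raw = 'Sure.\n{"decision": "refuse", "policy": "no_reverse_lookup"}\n### End'
packet = extract_first_json_object(raw)
print(packet)
```

If no object can be parsed, the helper returns `None`, so the caller can decide whether to retry generation or fail closed.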

Output Contract

Sanitization output

{
  "gateway_mode": "deterministic_local",
  "sanitized_query": "...",
  "sanitized_context": {},
  "entity_count": 6,
  "entities": [
    {
      "placeholder": "[ASSET_01]",
      "type": "ASSET",
      "original_value": "District Sign-In Hub",
      "reasoning_hint": "education / identity_portal / critical",
      "restore_audiences": ["SYSADMIN"]
    }
  ],
  "reasoning_hints": [...]
}
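
The `restore_audiences` field suggests that placeholders are restored locally and only for approved roles. The following is a minimal sketch of that idea, assuming the entity shape shown above. The function name `rehydrate` and the `"ANALYST"` audience are hypothetical; only `"SYSADMIN"` appears in the contract.

```python
def rehydrate(text, entities, audience):
    """Restore placeholders in text only for entities this audience may see."""
    for entity in entities:
        if audience in entity.get("restore_audiences", []):
            text = text.replace(entity["placeholder"], entity["original_value"])
    return text


# Entity copied from the sanitization output example.
packet_entities = [
    {
        "placeholder": "[ASSET_01]",
        "type": "ASSET",
        "original_value": "District Sign-In Hub",
        "reasoning_hint": "education / identity_portal / critical",
        "restore_audiences": ["SYSADMIN"],
    }
]

answer = "Patch [ASSET_01] first."
for_sysadmin = rehydrate(answer, packet_entities, "SYSADMIN")
for_analyst = rehydrate(answer, packet_entities, "ANALYST")  # hypothetical role
print(for_sysadmin)
print(for_analyst)
```

An audience that is not listed on an entity simply sees the placeholder unchanged, so the hosted model's answer stays usable without revealing the original value.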

Refusal output (for reverse-lookup attempts)

{
  "decision": "refuse",
  "policy": "no_reverse_lookup",
  "message": "Do not reveal, reconstruct, or export protected identifiers from the local placeholder map.",
  "safe_next_step": "Keep the placeholder map local and use deterministic local rehydration for approved sysadmins."
}
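
Since the adapter can return either packet shape, calling code needs to tell them apart before using the result. Below is a minimal sketch that checks the top-level keys shown in the two examples above. `classify_packet` and the two key sets are hypothetical names, and treating every listed key as required is an assumption.

```python
SANITIZATION_KEYS = {
    "gateway_mode", "sanitized_query", "sanitized_context",
    "entity_count", "entities", "reasoning_hints",
}
REFUSAL_KEYS = {"decision", "policy", "message", "safe_next_step"}


def classify_packet(packet):
    """Return 'refusal' or 'sanitization'; raise ValueError if keys are missing."""
    if packet.get("decision") == "refuse":
        kind, required = "refusal", REFUSAL_KEYS
    else:
        kind, required = "sanitization", SANITIZATION_KEYS
    missing = required - set(packet)
    if missing:
        raise ValueError(f"{kind} packet is missing keys: {sorted(missing)}")
    return kind


refusal = {
    "decision": "refuse",
    "policy": "no_reverse_lookup",
    "message": "Do not reveal, reconstruct, or export protected identifiers "
               "from the local placeholder map.",
    "safe_next_step": "Keep the placeholder map local and use deterministic "
                      "local rehydration for approved sysadmins.",
}
kind = classify_packet(refusal)
print(kind)
```

Raising on a malformed packet, rather than guessing, keeps a partial or truncated generation from being forwarded to the hosted model.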

Hardware Requirements

Requires at least ~18GB of VRAM in 4-bit mode. Tested on A100 40GB and A100 80GB.
