Team Red Privacy Gateway Adapter
A QLoRA adapter fine-tuned on top of unsloth/gpt-oss-20b-bnb-4bit (OpenAI GPT-OSS 20B, 4-bit quantized).
Purpose
This adapter implements a local privacy gateway for cybersecurity scan artifacts. It sits between raw scan data and hosted reasoning (Gemini), and:
- Detects and replaces institution-specific identifiers with stable placeholders (
[ASSET_01],[DOMAIN_01], etc.) - Preserves cybersecurity meaning and semantic context for downstream reasoning
- Tags each entity with a
reasoning_hintandrestore_audienceslist - Refuses any reverse-lookup or reveal requests with a structured refusal packet
Training Details
| Parameter | Value |
|---|---|
| Base model | unsloth/gpt-oss-20b-bnb-4bit |
| Method | QLoRA (4-bit base + LoRA r=16, alpha=32) |
| Training examples | 4,800 (train) + 800 (validation) |
| Epochs | 2 |
| Final train loss | 0.00145 |
| Training hardware | NVIDIA A100 80GB (Modal.com) |
| Training time | ~2 hours |
Usage
from unsloth import FastLanguageModel
from peft import PeftModel
import json
model, tokenizer = FastLanguageModel.from_pretrained(
"unsloth/gpt-oss-20b-bnb-4bit",
max_seq_length=3072,
load_in_4bit=True,
)
model = PeftModel.from_pretrained(model, "aavhawkeye/privacy-gateway-adapter")
FastLanguageModel.for_inference(model)
query = {
"task": "privacy_gateway",
"query": "Prepare this finding bundle for Gemini. Hide local identifiers but preserve the cybersecurity meaning.",
"context": { ... }
}
prompt = (
"### Instruction:\n"
"You are Team Red's local privacy gateway. Transform the raw cybersecurity context "
"into the exact JSON packet contract used before hosted reasoning. Preserve meaning, "
"create stable placeholders, and keep restore audiences on each entity.\n\n"
f"### Input:\n{json.dumps(query)}\n\n"
"### Response:\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=1024, temperature=0.05)
response = tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True)
print(json.loads(response))
Output Contract
Sanitization output
{
"gateway_mode": "deterministic_local",
"sanitized_query": "...",
"sanitized_context": {},
"entity_count": 6,
"entities": [
{
"placeholder": "[ASSET_01]",
"type": "ASSET",
"original_value": "District Sign-In Hub",
"reasoning_hint": "education / identity_portal / critical",
"restore_audiences": ["SYSADMIN"]
}
],
"reasoning_hints": [...]
}
Refusal output (for reverse-lookup attempts)
{
"decision": "refuse",
"policy": "no_reverse_lookup",
"message": "Do not reveal, reconstruct, or export protected identifiers from the local placeholder map.",
"safe_next_step": "Keep the placeholder map local and use deterministic local rehydration for approved sysadmins."
}
Hardware Requirements
Minimum ~18GB VRAM in 4-bit mode. Tested on A100 40GB and A100 80GB.
- Downloads last month
- 8