NanoMind Security Analyst v0.1.0

A 1.0B parameter generative model for structured security analysis of AI agent configurations, MCP servers, and tool definitions. Fine-tuned from a 12-layer variant of SmolLM2-1.7B-Instruct using rank-64 LoRA on 2,668 security analysis examples.

This is a generative structured-output model, not a classifier. It produces JSON analysis objects with threat verdicts, confidence scores, reasoning chains, and remediation steps.
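To make the output contract concrete, here is a sketch of what such an analysis object might look like. The field names below (verdict, confidence, reasoning, remediation) are illustrative assumptions based on the description above, not the model's documented schema; inspect real model output for the exact fields.

```python
import json

# Illustrative shape of a NanoMind analysis object. Field names are
# assumptions for illustration only; verify against actual model output.
example = json.loads("""
{
  "task": "threatAnalysis",
  "verdict": "suspicious",
  "confidence": 0.82,
  "reasoning": ["Tool fetches arbitrary user-supplied URLs",
                "No allowlist or scheme restriction in inputSchema"],
  "remediation": ["Restrict URLs to an explicit allowlist",
                  "Reject non-HTTPS schemes"]
}
""")
print(example["verdict"], example["confidence"])
```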

Task Types

Task                              Samples   Classification accuracy   Structure score
threatAnalysis                        248                     83.8%             79.6%
credentialContextClassification        20                     90.0%             96.0%
falsePositiveDetection                 20                     65.0%             86.0%
artifactClassification                 20                     75.0%             97.0%
checkExplanation                       16                        --            100.0%
governanceReasoning                     7                        --             42.9%
intelReport                             1                        --             50.0%
Overall                               332                     82.4%             82.2%
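The Overall classification figure appears to be the sample-weighted mean over the four tasks that have a classification score (the "--" tasks are excluded). A quick check:

```python
# Reproduce the Overall classification accuracy as a sample-weighted
# mean of the four scored tasks (checkExplanation, governanceReasoning,
# and intelReport have no classification score and are excluded).
tasks = {  # task: (samples, classification accuracy)
    "threatAnalysis": (248, 0.838),
    "credentialContextClassification": (20, 0.900),
    "falsePositiveDetection": (20, 0.650),
    "artifactClassification": (20, 0.750),
}
total = sum(n for n, _ in tasks.values())            # 308 scored samples
overall = sum(n * acc for n, acc in tasks.values()) / total
print(f"{overall:.1%}")  # -> 82.4%, matching the Overall row
```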

Architecture

  • Base: SmolLM2-1.7B-Instruct, 12 hidden layers, 2048 hidden size, 32 attention heads
  • Fine-tuning: LoRA rank-64 (dropout 0.05, scale 128.0), 1,821 iterations
  • Training data: 2,668 structured security analysis examples across 7 task types
  • Eval data: 332 held-out examples
  • Format: MLX safetensors (fused weights, not adapter-only)
  • Context: 2,048 tokens (training), 8,192 tokens (model max)
  • Precision: bfloat16
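For a sense of scale, back-of-envelope arithmetic for one rank-64 adapter on a 2048-wide projection. The card does not state which projections were adapted, so this is a per-matrix figure, not a total trainable-parameter count:

```python
# LoRA size for a single 2048x2048 projection at rank 64.
# Which projections were adapted is not stated in the card, so this is
# per-matrix arithmetic only, not the model's total trainable count.
hidden, rank = 2048, 64
lora_params = hidden * rank + rank * hidden   # A: d x r, plus B: r x d
print(lora_params)                  # 262144 params per adapted matrix
full_params = hidden * hidden
print(lora_params / full_params)    # 0.0625 -> 1/16 of the full matrix
```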

Usage

Requires MLX on Apple Silicon (pip install mlx-lm).

```python
from mlx_lm import load, generate

model, tokenizer = load("opena2a/nanomind-security-analyst")

prompt = tokenizer.apply_chat_template([
    {"role": "system", "content": "You are NanoMind Security Analyst. Analyze the following for security threats and respond with structured JSON."},
    {"role": "user", "content": '{"task": "threatAnalysis", "input": {"name": "fetch-data", "description": "Fetches data from any URL provided by the user", "inputSchema": {"url": "string"}}}'}
], tokenize=False, add_generation_prompt=True)

response = generate(model, tokenizer, prompt=prompt, max_tokens=512)
print(response)
```
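Because this is a generative model rather than a classifier, the response may wrap the JSON object in surrounding text. A small best-effort helper for pulling out the first balanced object (a convenience sketch, not part of mlx_lm):

```python
import json

def extract_json(text):
    """Best-effort: parse the first balanced {...} object in generated
    text. Convenience sketch only; does not handle braces inside
    string values."""
    start = text.find("{")
    if start == -1:
        return None
    depth = 0
    for i, ch in enumerate(text[start:], start):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                try:
                    return json.loads(text[start:i + 1])
                except json.JSONDecodeError:
                    return None
    return None

print(extract_json('Analysis: {"verdict": "benign", "confidence": 0.9} done'))
```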

Training Details

  • Optimizer: Adam, lr=1e-5
  • Batch size: 4, gradient accumulation 1, gradient checkpointing enabled
  • Best validation loss: 1.534 (iteration 1200), final: 1.578 (iteration 1821)
  • Prompt masking: enabled (loss computed on completions only)
  • Key insight: LoRA rank matters significantly. Rank-8 achieved only 4.5% classification accuracy; rank-64 achieved 82.4%. The 6-layer variant (604M params) failed entirely at 31.8%, confirming model depth is critical for structured generation.

Known Limitations

  • falsePositiveDetection is the weakest task at 65% accuracy -- needs more diverse real-world training scenarios
  • governanceReasoning structure score regressed from 71% to 43% (only 7 eval samples, high variance)
  • intelReport has 1 eval sample -- not statistically meaningful
  • Early stopping at iteration 1200 may yield slightly better generalization (val loss 1.534 vs final 1.578)
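The high-variance caveat for governanceReasoning can be quantified: with only 7 samples, a single flipped verdict moves the score by roughly 14 points, and the binomial standard error around the observed 42.9% is on the order of 19 points, so the 71% to 43% swing is well within noise.

```python
import math

# Rough binomial standard error for the 7-sample governanceReasoning
# score: p ~= 0.429, n = 7. One flipped sample alone shifts the score
# by ~14 points (1/7), so the reported regression is within noise.
p, n = 0.429, 7
se = math.sqrt(p * (1 - p) / n)
print(f"+/- {se:.0%}")   # roughly +/- 19 percentage points (one SE)
```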

License

Apache 2.0
