Multi-Layer Trust Verification Framework for SSCS

Thesis: "Design and evaluation of a multi-layer trust verification framework for software supply chain security using runtime behavior analysis, provenance graphs, and dependency-chain risk scoring."

Research Question

How can software supply chain security be improved by combining dynamic sandboxing, behavior-based classification, provenance analysis, dependency-chain modeling, and package reputation scoring, and how effectively do these methods distinguish trustworthy from untrustworthy packages before deployment?

Pipeline Overview

| Stage | Description | Output |
|-------|-------------|--------|
| 1 | Build benchmark corpus (benign + malicious PyPI) | sscs-benchmark-corpus |
| 2 | Dynamic sandbox analysis (eBPF/strace traces) | sscs-runtime-traces |
| 3 | Graph construction (dependency + provenance) | sscs-graph-features |
| 4 | Trust scoring (4-layer weighted) | sscs-trust-scores |
| 5 | Model comparison (Static/Behavior/Graph/Hybrid) | sscs-model-comparison |
| 6 | Package outputs (dataset + model + paper) | sscs-trust-verifier |

Key Results

Detection Performance

| Approach | F1 | Precision | Recall | FP Rate |
|----------|-----|-----------|--------|---------|
| Static (metadata) | 0.72 | 0.74 | 0.70 | 0.26 |
| Behavior (DySec-style) | 0.91 | 0.92 | 0.90 | 0.08 |
| Graph (dependency) | 0.78 | 0.80 | 0.76 | 0.20 |
| Hybrid (all layers) | 0.94 | 0.95 | 0.93 | 0.05 |
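The metrics above follow from standard confusion-matrix counts. As a sanity check, a sketch of the computation is shown below; the counts used in the usage note are illustrative (consistent with the hybrid row on a balanced 200-package sample), not taken from the source:

```python
def detection_metrics(tp, fp, fn, tn):
    """Compute precision, recall, F1, and false-positive rate from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    fp_rate = fp / (fp + tn)  # fraction of benign packages wrongly flagged
    return {
        "precision": round(precision, 2),
        "recall": round(recall, 2),
        "f1": round(f1, 2),
        "fp_rate": round(fp_rate, 2),
    }
```

For example, `detection_metrics(93, 5, 7, 95)` reproduces the hybrid row: precision 0.95, recall 0.93, F1 0.94, FP rate 0.05.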

Trust Score Distribution

| Decision | Benign | Malicious |
|----------|--------|-----------|
| APPROVE (>80) | 85% | 2% |
| MONITOR (61-80) | 10% | 5% |
| QUARANTINE (31-60) | 4% | 15% |
| BLOCK (0-30) | 1% | 78% |
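The four decision tiers map directly onto score thresholds. A minimal sketch of that mapping (the function name is hypothetical; the thresholds come from the table above):

```python
def deployment_decision(trust_score):
    """Map a 0-100 trust score to one of the four deployment tiers."""
    if trust_score > 80:
        return "APPROVE"      # high trust: deploy normally
    if trust_score > 60:
        return "MONITOR"      # deploy, but watch runtime behavior
    if trust_score > 30:
        return "QUARANTINE"   # hold for manual review
    return "BLOCK"            # reject before deployment
```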

Literature Foundation

Based on a literature review of software supply chain security research:

  • DySec (Mehedi et al., 2025) — RF + CombinedTraces → 95.99% F1 on 14,271 PyPI packages
    • 6 trace categories: Filetop, Install, Opensnoop, TCP, SystemCall, Pattern
    • 62 candidate features → 36 selected via Pearson correlation + IMS
  • Backstabber's Knife Collection (Ohm et al., 2020) — attack taxonomy with real-world malicious packages across npm, PyPI, Maven, Packagist, and RubyGems
  • ConfuGuard (2025) — metadata-based package-confusion detection across 6 registries, reducing false positives
  • Zahan (2023) — Software Supply Chain Risk Assessment Framework (SSRIAF)
  • OpenSSF Malicious Packages — OSV-format vulnerability reports for the community
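DySec-style feature selection prunes candidate features that are highly correlated with ones already kept. The sketch below shows only the Pearson-correlation half of that step (IMS is omitted); the 0.9 threshold and all names are assumptions, not DySec's actual parameters:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def drop_redundant(features, threshold=0.9):
    """Greedily keep a feature only if |r| <= threshold against every feature kept so far."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept
```

For example, with `{"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 1, 3, 2]}`, feature `b` is perfectly correlated with `a` and is dropped, leaving `["a", "c"]`.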

Identified Research Gaps

  1. Static-only limitation — most tools miss install-time/execution-time behavior
  2. Ecosystem fragmentation — research focuses on single ecosystems
  3. Trust estimation gap — no pre-deployment scoring combining behavior + provenance + reputation
  4. Dependency-chain reasoning gap — single packages examined, not transitive chains
  5. Runtime realism gap — synthetic examples instead of real ecosystem data
  6. False-positive problem — overly aggressive static defenses
  7. Operational decision gap — no guidance for block/quarantine/approve decisions
  8. Dataset gap — shortage of runtime behavior traces with ground truth

Running the Pipeline

```shell
# Install dependencies
pip install -r pipeline/requirements.txt

# Run each stage
python pipeline/stage1_corpus.py     # Build benchmark dataset
python pipeline/stage2_sandbox.py    # Dynamic analysis (ISOLATED ONLY!)
python pipeline/stage3_graphs.py     # Graph construction
python pipeline/stage4_trust.py      # Trust scoring
python pipeline/stage5_models.py     # Model comparison
python pipeline/stage6_package.py    # Package outputs
```

⚠️ SAFETY: Stage 2 installs and executes packages from PyPI. Only run in fully isolated containers/sandboxes. Never run on production systems.

Novelty Summary

This framework is novel because it:

  1. Combines four trust layers (metadata, behavior, provenance, reputation) into a single score
  2. Follows the DySec methodology for runtime behavior analysis using eBPF trace categories
  3. Incorporates dependency-chain and graph features for transitive risk assessment
  4. Provides actionable deployment decisions (block/quarantine/monitor/approve)
  5. Is designed for cross-ecosystem generalization (started with PyPI, extensible to npm/Maven)
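The four-layer combination in point 1 can be sketched as a weighted average over per-layer scores. Everything below is an assumption for illustration: the source does not state the actual weights, score ranges, or function names.

```python
# Hypothetical weights; the actual framework's weighting is not given in the source.
LAYER_WEIGHTS = {"metadata": 0.2, "behavior": 0.4, "provenance": 0.2, "reputation": 0.2}

def trust_score(layer_scores, weights=LAYER_WEIGHTS):
    """Combine per-layer scores (each assumed 0-100) into one weighted 0-100 trust score."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[layer] * layer_scores[layer] for layer in weights)
```

Giving the behavior layer the largest weight reflects the detection results above, where the behavior-based classifier outperforms the static and graph layers on its own.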

Suggested First Publication

A paper on the benchmark corpus and comparison of static, dynamic, graph-based, and hybrid detection methods, with the hybrid model as the main contribution.
