VynFi JE Fraud GNN β€” GraphSAGE edge classifier + GAE node anomaly scorer

Trained on the v5.9.0 Method-A accounting network from VynFi/vynfi-journal-entries-1m. Two complementary models in one bundle:

| Task | Model | Test AUC-ROC | Test AUC-PR | Notes |
|---|---|---|---|---|
| Edge fraud classification (supervised) | GraphSAGE → edge head | 0.9136 | 0.7949 | Beats the LR baseline by +0.13 AUC pts (LR is already strong because weekend and round-dollar features are highly discriminative). |
| Edge anomaly scoring (unsupervised) | Attribute-reconstruction GAE | 0.6540 | 0.1434 | Purely unsupervised: no is_fraud/is_anomaly labels seen at train time. Surfaces edges whose attributes are unusual given their structural neighborhood. |

Per-business-process breakdown (edge fraud classifier, test split)

| Process | AUC-ROC | AUC-PR | F1 | n |
|---|---|---|---|---|
| P2P | 0.9289 | 0.8146 | 0.8041 | 2,835 |
| O2C | 0.8965 | 0.7660 | 0.7423 | 3,155 |
| R2R | 0.9301 | 0.8113 | 0.8000 | 1,895 |
| H2R | 0.8859 | 0.7517 | 0.7523 | 914 |
| A2R | 0.9512 | 0.9273 | 0.9565 | 450 |

Architecture

Fraud classifier β€” EdgeFraudGNN:

  • 2-layer GraphSAGE encoder (mean aggregator) β†’ 64-dim node embeddings.
  • Edge head: MLP on concat(emb_src, emb_dst, edge_attr) β†’ sigmoid.
  • BCE loss with positive-class weight β‰ˆ 16.3 (5.79 % fraud rate).
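The positive-class weight follows directly from the class balance: negatives over positives at a 5.79 % fraud rate. A minimal numpy sketch of that computation together with a class-weighted BCE, purely illustrative and not the training code itself:

```python
import numpy as np

# Positive-class weight from the 5.79 % fraud rate: negatives / positives.
fraud_rate = 0.0579
pos_weight = (1 - fraud_rate) / fraud_rate  # ~16.3

def weighted_bce(p, y, w_pos):
    """Binary cross-entropy with an extra weight on the positive class."""
    eps = 1e-12
    return -np.mean(w_pos * y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

# Toy predictions/labels, just to exercise the loss.
p = np.array([0.9, 0.2, 0.05])
y = np.array([1.0, 0.0, 0.0])
loss = weighted_bce(p, y, pos_weight)
```

Up-weighting positives by the negative-to-positive ratio is the standard way to keep a ~6 % minority class from being ignored by the loss.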

Anomaly scorer β€” AttrGAE:

  • Same 2-layer GraphSAGE encoder.
  • MLP decoder predicts edge_attr from concat(z_src, z_dst).
  • MSE loss; per-edge reconstruction error ranks anomalous edges (high error = unusual attributes given structural context).
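The ranking step can be sketched in a few lines of numpy; the arrays below are toy stand-ins for the decoder's output, not the real model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: true 22-dim edge attributes and their reconstructions.
edge_attr = rng.normal(size=(5, 22))
recon = edge_attr + rng.normal(scale=0.1, size=(5, 22))
recon[3] += 2.0  # make edge 3 poorly reconstructed on purpose

# Per-edge MSE; high error = unusual attributes given structural context.
mse = ((recon - edge_attr) ** 2).mean(axis=1)
ranking = np.argsort(-mse)  # most anomalous edges first
```

The deliberately corrupted edge 3 tops the ranking, which is exactly the behavior the GAE relies on for anomaly scoring.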

Both models share the same node feature space (17 dims): account-type one-hot Β· structural flags Β· hierarchy level Β· log-aggregated in/out flows.

Edge features (22 dims): log-amount Β· is-round-dollar Β· per-level round flags Β· confidence Β· business-process one-hot Β· day-of-year sin/cos Β· week-of-year sin/cos Β· day-of-week sin/cos Β· is-weekend.
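The cyclical date features can be derived as in this minimal sketch; the exact periods used here (366 for day-of-year, 52 for week-of-year) are assumptions, not confirmed from the training code:

```python
import numpy as np
from datetime import date

def cyclical(value, period):
    """Encode a cyclic quantity as (sin, cos) so period boundaries are adjacent."""
    angle = 2 * np.pi * value / period
    return np.sin(angle), np.cos(angle)

d = date(2024, 3, 15)
doy_sin, doy_cos = cyclical(d.timetuple().tm_yday, 366)  # day-of-year
woy_sin, woy_cos = cyclical(d.isocalendar()[1], 52)      # week-of-year
dow_sin, dow_cos = cyclical(d.weekday(), 7)              # day-of-week
is_weekend = float(d.weekday() >= 5)                     # Sat/Sun flag
```

The sin/cos pair keeps Dec 31 and Jan 1 close in feature space, which a raw day-of-year integer would not.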

Quick start

```python
from huggingface_hub import snapshot_download
from scripts.ml.inference import load_bundle

local_dir = snapshot_download(repo_id="VynFi/je-fraud-gnn")
bundle = load_bundle(local_dir)

# Predict fraud probability for one or more edges
probs = bundle.predict_fraud(
    from_account=["1000", "5000"],
    to_account=["2000", "4000"],
    amount=[7432.89, 25000.00],            # second is a "round" amount
    business_process=["P2P", "O2C"],
    posting_date=["2024-03-15", "2024-08-10"],
)
print(probs)  # array([0.13, 0.99]) — round amount → strong fraud signal

# Per-edge anomaly score (high MSE = unusual attribute combination)
mse = bundle.anomaly_score_edges(
    from_account=["1000", "5000"],
    to_account=["2000", "4000"],
    amount=[7432.89, 25000.00],
    business_process=["P2P", "O2C"],
    posting_date=["2024-03-15", "2024-08-10"],
)
print(mse)
```

The scripts/ml/inference.py module is shipped in the engine repo.

Training data

Sourced from VynFi/vynfi-journal-entries-1m v5.9.0:

  • 499 GL accounts (after dedupe of 4 conflicting account_number rows in the COA)
  • 61,656 Method-A edges (one edge per 2-line journal entry)
  • 5.79 % fraud rate (3,571 / 61,656)
  • 6.52 % anomaly rate
  • Stratified 70/15/15 train/val/test split on is_fraud (seed = 20260509)
  • Generated under the v5.9.0 release tag (ChaCha8 PRNG, platform-stable)
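A stratified 70/15/15 split like the one described can be sketched in plain numpy; `stratified_split` is a hypothetical helper for illustration, not the project's actual splitting code:

```python
import numpy as np

def stratified_split(labels, fracs=(0.70, 0.15, 0.15), seed=20260509):
    """Shuffle indices within each class, then slice into train/val/test."""
    rng = np.random.default_rng(seed)
    splits = ([], [], [])
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = int(fracs[0] * len(idx))
        n_val = int(fracs[1] * len(idx))
        splits[0].append(idx[:n_train])
        splits[1].append(idx[n_train:n_train + n_val])
        splits[2].append(idx[n_train + n_val:])
    return tuple(np.concatenate(s) for s in splits)

# Toy labels mimicking a ~5.8 % fraud rate.
is_fraud = np.zeros(1000, dtype=int)
is_fraud[:58] = 1
train, val, test = stratified_split(is_fraud)
```

Stratifying on is_fraud keeps the minority-class rate nearly identical across the three splits, which matters at a ~6 % base rate.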

Why does the GraphSAGE encoder add only marginal lift over LR?

Honest answer: the synthetic fraud bias in DataSynth v5.x writes strong, local signals into edge attributes:

  • fraud_bias.weekend_bias=0.30 → 41 % of fraud edges land on Sat/Sun vs 0.5 % of non-fraud (77× lift)
  • fraud_bias.round_dollar_bias=0.40 → 55 % of fraud edges hit a $1K/$5K/$10K/$25K/$50K/$100K canonical level vs 0.14 % (378× lift)

A LogisticRegression baseline with day-of-week and round-dollar features already reaches AUC 0.912, leaving little room for the graph encoder to add value on the supervised task. The GraphSAGE encoder adds +0.13 AUC pts and +0.84 AUC-PR pts; the per-process breakdown is where it shines (A2R reaches 0.95 AUC).

Where the graph contribution does show up:

  • Unsupervised anomaly detection. The attribute-reconstruction GAE reaches AUC-ROC 0.654 on edge-level anomaly with no labels at train time β€” the structural prior is doing the work.
  • Top-K anomalous accounts. The GAE's per-node aggregated MSE (mean across incident test edges) ranks accounts by structural weirdness; precision@10 = 0.60 against the median anomaly-fraction threshold.
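precision@K itself is simple to compute. In this toy sketch, 6 of 20 synthetic accounts are labelled anomalous and all of them fall in the top 10 by score, giving 0.6; the scores and labels here are fabricated for illustration only:

```python
import numpy as np

def precision_at_k(scores, is_anomalous, k=10):
    """Fraction of the top-k scored accounts that are truly anomalous."""
    top_k = np.argsort(-scores)[:k]
    return is_anomalous[top_k].mean()

# 20 toy accounts with evenly spaced scores; the 6 highest are anomalous.
scores = np.linspace(0.0, 0.95, 20)
labels = (scores >= 0.7).astype(float)
p_at_10 = precision_at_k(scores, labels, k=10)  # 6 of the top 10 -> 0.6
```

precision@K is the right metric here precisely because the per-node AUC is weak: it only asks whether the top of the ranking is useful, not whether the whole ordering is.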

For deployment scenarios where you have crisp labels and fraud patterns are local to single transactions, an LR baseline may be competitive. For label-free settings or graph-context fraud (multi-hop laundering, ring transactions), the GNN signal is the differentiator.

Limitations

  • Trained on a single 1M-JE generator run. Generalisation to other v5.9.0 datasets has not been evaluated.
  • is_fraud labels come from DataSynth's fraud-bias mechanism β€” they reflect known bias signatures (weekend / round-dollar / off-hours / post-close), not the full universe of real-world fraud patterns.
  • Account vocabulary is fixed at the 499 nodes in the published COA. Inference on unseen account_number values raises ValueError.
  • Per-node anomaly AUC is close to random (0.48) β€” the per-edge signal is the load-bearing one. For ranking accounts, use precision@K instead of AUC.

Reproducibility

```shell
git clone https://github.com/mivertowski/SyntheticData.git
cd SyntheticData
pip install -r requirements-ml.txt
python -m scripts.ml.build_je_pyg_dataset --output data/ml/je_pyg_v1.pt --seed 20260509
python -m scripts.ml.train_je_fraud_gnn --epochs 60
python -m scripts.ml.train_je_anomaly_gae --epochs 80
python -m scripts.ml.package_for_hf
```

Citation

```bibtex
@misc{ivertowski2026datasynth,
  author       = {Ivertowski, Michael},
  title        = {{DataSynth}: Reference Knowledge Graphs for Enterprise
                  Audit Analytics through Synthetic Data Generation
                  with Provable Statistical Properties},
  year         = {2026},
  month        = {April},
  howpublished = {SSRN Working Paper},
  url          = {https://ssrn.com/abstract=6538639}
}
```

License

Apache-2.0.
