PsychGNN All-Tested-Pairs Model

Model summary

This repository contains a heterogeneous graph neural network trained on every tested SNP-disorder pair from the published psychiatric graph panel.

The model predicts whether a tested SNP-disorder pair is:

  • null, or
  • non-null (suggestive or genome-wide significant)

This is a variant-level research model. It is not a clinical model.

Intended task

The training target is all-tested-pairs cross-disorder association prediction:

  • take the fixed SNP panel used in the graph artifact
  • collect every tested SNP-disorder pair on that panel from the harmonized dataset
  • label pairs as positive if sig_tier ∈ {suggestive, gws}
  • label pairs as negative if sig_tier == null
  • train the graph model to distinguish non-null from null tested pairs across all 11 disorders
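The labeling rule above can be sketched in a few lines. This is a minimal illustration, assuming sig_tier is stored as the strings "suggestive", "gws", and "null"; the actual encoding in the harmonized dataset may differ, and the rows below are toy values, not real data.

```python
# Assumed tier names; the model card only states {suggestive, gws} vs null.
POSITIVE_TIERS = {"suggestive", "gws"}

def binary_label(sig_tier: str) -> int:
    """Return 1 for non-null tested pairs and 0 for null ones."""
    if sig_tier in POSITIVE_TIERS:
        return 1
    if sig_tier == "null":
        return 0
    raise ValueError(f"unexpected sig_tier: {sig_tier!r}")

# Toy rows (variant id, disorder, sig_tier); identifiers are hypothetical.
pairs = [
    ("rs_example_1", "Schizophrenia", "gws"),
    ("rs_example_1", "ADHD", "null"),
    ("rs_example_2", "MDD", "suggestive"),
]
labels = [binary_label(tier) for _, _, tier in pairs]
print(labels)  # [1, 0, 1]
```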

Held-out GWS edges are removed from message passing so the graph encoder does not see the target edge directly.
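One way to picture the held-out edge removal is as a filter on the SNP-disorder edge list before it is handed to the encoder. The sketch below is an assumption about the mechanics, not the training code for this release; the function name and the toy index pairs are hypothetical.

```python
def message_passing_edges(gws_edges, held_out_edges):
    """Drop held-out (snp_idx, disorder_idx) edges from the message-passing edge list."""
    held_out = set(held_out_edges)
    return [edge for edge in gws_edges if edge not in held_out]

# Toy (snp_idx, disorder_idx) pairs, not real data.
gws_edges = [(0, 10), (1, 10), (2, 3), (2, 10)]
held_out = [(1, 10), (2, 3)]  # e.g. GWS edges assigned to validation or test

mp_edges = message_passing_edges(gws_edges, held_out)
print(mp_edges)  # [(0, 10), (2, 10)]
```

The held-out pairs are still scored by the decoder; they are only hidden from the encoder's neighborhood aggregation.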

Data provenance

The checkpoint was trained on:

The harmonized dataset was derived from public OpenMed / PGC Hugging Face repositories, including:

Scope

The modeled disorder panel contains 11 disorder groups:

  • ADHD
  • Anxiety
  • Autism
  • Bipolar disorder
  • Borderline personality disorder
  • Eating disorders
  • Major depressive disorder
  • Obsessive-compulsive disorder
  • Post-traumatic stress disorder
  • Schizophrenia
  • Substance use

The evaluation in this release covers all 11 disorders.

Architecture

The encoder is a heterogeneous GraphSAGE-style architecture over:

  • SNP nodes
  • gene nodes
  • disorder nodes

This release uses:

  • hidden dimension: 128
  • layers: 2
  • dropout: 0.15

Decoder heads:

  • bilinear SNP-disorder link decoder for binary classification
  • auxiliary effect-size regression head for non-null edges

No disorder-disorder edges are used in the graph for this release.
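For intuition, a bilinear link decoder scores a pair by combining the SNP embedding and the disorder embedding through a learned matrix. The sketch below assumes the common form sigmoid(h_snp^T W h_dis) with no bias term; the exact parameterization in the checkpoint is not stated here, and the 2-dimensional toy embeddings stand in for the 128-dimensional hidden vectors.

```python
import math

def bilinear_score(h_snp, weight, h_dis):
    """Compute sigmoid(h_snp^T W h_dis) for one SNP-disorder pair."""
    # First apply W to the disorder embedding, then take the dot product with the SNP embedding.
    projected = [sum(w * d for w, d in zip(row, h_dis)) for row in weight]
    logit = sum(s * p for s, p in zip(h_snp, projected))
    return 1.0 / (1.0 + math.exp(-logit))

# Toy values only; W would be learned during training.
h_snp = [1.0, 0.0]
h_dis = [0.0, 1.0]
W = [[0.0, 2.0],
     [0.0, 0.0]]

score = bilinear_score(h_snp, W, h_dis)
print(round(score, 4))  # 0.8808
```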

Training configuration

Best hyperparameters:

  • hidden dimension: 128
  • layers: 2
  • dropout: 0.15
  • learning rate: 5e-4
  • weight decay: 1e-5

Checkpoint metadata:

  • SNP feature dimension: 7
  • gene feature dimension: 4
  • disorder feature dimension: 5

The positive class is defined as suggestive ∪ gws, so this model is trained as a binary tested-pair classifier rather than an ordinal multi-tier model.

Graph context

Graph metadata for this release:

  • variants: 18,979
  • genes: 1,205
  • disorders: 11
  • SNP-disorder edges: 22,687
  • SNP-gene edges: 65,634
  • disorder-disorder edges: 0
  • GWS threshold for graph construction: 5e-8
  • SNP-gene positional window: 100,000 bp
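The positional window can be read as a distance rule for creating SNP-gene edges. The sketch below assumes the 100,000 bp window is measured from the gene's start and end coordinates on the same chromosome; the graph artifact may define it differently (for example, from the transcription start site). All identifiers and coordinates are toy values.

```python
WINDOW_BP = 100_000  # window size listed in the graph metadata

def snp_gene_edges(snps, genes, window=WINDOW_BP):
    """Link each SNP to every gene whose span, padded by `window`, contains it."""
    edges = []
    for snp_id, chrom, pos in snps:
        for gene_id, gene_chrom, start, end in genes:
            if chrom == gene_chrom and start - window <= pos <= end + window:
                edges.append((snp_id, gene_id))
    return edges

# Hypothetical SNPs (id, chromosome, position) and genes (id, chromosome, start, end).
snps = [("snp_a", "1", 150_000), ("snp_b", "1", 500_000), ("snp_c", "2", 150_000)]
genes = [("gene_x", "1", 200_000, 300_000)]

edges = snp_gene_edges(snps, genes)
print(edges)  # [('snp_a', 'gene_x')]
```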

Evaluation

Primary all-tested-pairs benchmark:

  • test AUROC: 0.9817
  • test AP: 0.9276
  • macro AUROC: 0.9814
  • macro AP: 0.8647
  • effect-size Pearson r on non-null test edges: 0.9883
  • best validation AP: 0.9223

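Per-disorder results (BPD: borderline personality disorder; MDD: major depressive disorder; OCD: obsessive-compulsive disorder; PTSD: post-traumatic stress disorder):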

Disorder          AUROC   AP      Test pairs  Positive test pairs
ADHD              0.9860  0.9109  2188        239
Anxiety           0.9665  0.9250  2146        632
Autism            0.9978  0.9043  2473        53
Bipolar           0.9605  0.9309  2240        675
BPD               0.9830  0.6608  1921        49
Eating disorders  0.9985  0.8026  2483        4
MDD               0.9754  0.8904  2569        336
OCD               0.9427  0.6659  2148        80
PTSD              0.9998  0.8333  2491        2
Schizophrenia     0.9866  0.9957  2732        2086
Substance use     0.9982  0.9916  679         100

Eating disorders (4 positive test pairs) and PTSD (2) have very few positives, and BPD (49) and Autism (53) are also sparse, so their AP values should be interpreted cautiously.
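As a consistency check, the macro metrics can be recomputed from the per-disorder results. The sketch below assumes "macro" means the unweighted mean over the 11 disorders; under that assumption the per-disorder values reproduce the reported macro AUROC and macro AP to four decimals.

```python
# (AUROC, AP) per disorder, copied from the per-disorder results.
per_disorder = {
    "ADHD": (0.9860, 0.9109),
    "Anxiety": (0.9665, 0.9250),
    "Autism": (0.9978, 0.9043),
    "Bipolar": (0.9605, 0.9309),
    "BPD": (0.9830, 0.6608),
    "Eating disorders": (0.9985, 0.8026),
    "MDD": (0.9754, 0.8904),
    "OCD": (0.9427, 0.6659),
    "PTSD": (0.9998, 0.8333),
    "Schizophrenia": (0.9866, 0.9957),
    "Substance use": (0.9982, 0.9916),
}

n = len(per_disorder)
macro_auroc = sum(auroc for auroc, _ in per_disorder.values()) / n
macro_ap = sum(ap for _, ap in per_disorder.values()) / n
print(round(macro_auroc, 4), round(macro_ap, 4))  # 0.9814 0.8647
```

Because every disorder counts equally, the macro AP (0.8647) sits well below the pooled test AP (0.9276): low-AP disorders with few positives, such as BPD and OCD, weigh as much as Schizophrenia.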

Baseline comparison

Baseline             Test AUROC  Test AP
Disorder prevalence  0.8852      0.6022
Variant prevalence   0.3740      0.1616
Additive prior       0.8170      0.5460
Low-rank SVD         0.5099      0.2408

Class balance on the graph SNP panel

  • ADHD: 1,428 gws, 168 suggestive, 12,994 null
  • Anxiety: 2,729 gws, 1,486 suggestive, 10,097 null
  • Autism: 93 gws, 262 suggestive, 16,138 null
  • Bipolar: 3,091 gws, 1,412 suggestive, 10,437 null
  • BPD: 135 gws, 193 suggestive, 12,484 null
  • Eating disorders: 7 gws, 25 suggestive, 16,533 null
  • MDD: 1,019 gws, 1,225 suggestive, 14,888 null
  • OCD: 35 gws, 501 suggestive, 13,793 null
  • PTSD: 15 gws, 16,599 null
  • Schizophrenia: 13,474 gws, 436 suggestive, 4,307 null
  • Substance use: 661 gws, 11 suggestive, 3,862 null
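To make the imbalance concrete, the positive (non-null) fraction per disorder follows directly from these counts. The sketch below uses three of the rows above as examples; it only illustrates the arithmetic and is not part of the training pipeline.

```python
# (gws, suggestive, null) counts copied from the class-balance list.
counts = {
    "ADHD": (1_428, 168, 12_994),
    "Eating disorders": (7, 25, 16_533),
    "Schizophrenia": (13_474, 436, 4_307),
}

def positive_fraction(gws, suggestive, null):
    """Share of tested pairs labeled positive (suggestive or gws)."""
    positives = gws + suggestive
    return positives / (positives + null)

fractions = {name: positive_fraction(*c) for name, c in counts.items()}
for name, frac in fractions.items():
    print(f"{name}: {frac:.4f}")
# Schizophrenia is about 76% positive, while Eating disorders is about 0.2% positive.
```

This spread is one reason AP varies so much across disorders: the AP of a random ranking is roughly the positive fraction, so the bar differs by orders of magnitude between disorders.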

Inputs and outputs

Inputs

The checkpoint expects:

  • SNP feature matrix
  • gene feature matrix
  • disorder feature matrix
  • SNP-gene edge index
  • SNP-disorder edge index
  • variant and disorder mappings

These are provided by the associated public graph artifact.

Outputs

For a scored (variant, disorder) pair, the model produces:

  • a binary link score for non-null vs null
  • an auxiliary effect-size estimate for non-null edges

The binary link score is the primary output of this release.
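Since the link score is mainly useful for prioritization, a typical downstream step is to rank scored pairs. The helper below is a hypothetical convenience function, and the scores are made-up toy values rather than model outputs.

```python
def top_k_pairs(scored_pairs, k):
    """Return the k highest-scoring (variant, disorder, score) tuples."""
    return sorted(scored_pairs, key=lambda row: row[2], reverse=True)[:k]

# Toy link scores for illustration only.
scored = [
    ("rs_example_1", "ADHD", 0.12),
    ("rs_example_2", "MDD", 0.91),
    ("rs_example_3", "OCD", 0.47),
]

top = top_k_pairs(scored, k=2)
print(top)  # [('rs_example_2', 'MDD', 0.91), ('rs_example_3', 'OCD', 0.47)]
```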

How to use

Minimal checkpoint loading:

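import torch
from huggingface_hub import hf_hub_download

# Download the checkpoint from the Hugging Face Hub (cached locally after the first call).
ckpt_path = hf_hub_download(
    "lighteternal/psychgnn-all-tested-pairs-model",
    "model.pt",
    repo_type="model",
)
# weights_only=False unpickles arbitrary Python objects, so only load checkpoints you trust.
checkpoint = torch.load(ckpt_path, map_location="cpu", weights_only=False)

# Inspect the stored hyperparameters, feature dimensions, and reported macro AP.
print(checkpoint["hyperparams"])
print(checkpoint["feature_dims"])
print(checkpoint["report"]["all_tested_split"]["macro_ap"])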

To run inference, instantiate a heterogeneous GraphSAGE-style model matching the architecture above, load the checkpoint state dict, and score variant-disorder pairs using the graph artifact lighteternal/psychgnn-psychiatric-graph.

Files in this repository

  • model.pt
  • evaluation_report.json

Limitations

  • Positive class combines suggestive and genome-wide significant associations into one label.
  • Several disorders have very small positive test counts (for example, 2 for PTSD and 4 for eating disorders), even though all 11 are represented.
  • The graph is built from GWS SNP-disorder edges, so encoder context is narrower than the full harmonized dataset.
  • This release does not address zero-shot generalization to unseen disorders.
  • The auxiliary effect head is not a causal estimate.

Appropriate use

Reasonable uses:

  • broad cross-disorder variant scoring across the 11 modeled disorders
  • ranking tested variant-disorder pairs for follow-up
  • exploratory psychiatric genetics analysis
  • comparison against simpler non-graph baselines

Inappropriate uses:

  • patient-level prediction
  • clinical interpretation
  • screening or diagnosis
  • treatment selection