# Alzheimer's Disease & Neuroscience Dataset Access Guide
**Compiled: 2026-04-05**
---
## TABLE OF CONTENTS
1. [NACC](#1-nacc)
2. [OASIS-3](#2-oasis-3)
3. [Bio-Hermes-001](#3-bio-hermes-001)
4. [ADSP / NIAGADS](#4-adsp--niagads)
5. [HCP-Aging](#5-hcp-aging)
6. [PREVENT-AD](#6-prevent-ad)
7. [ANMerge (AddNeuroMed)](#7-anmerge-addneuromed)
8. [ROSMAP](#8-rosmap)
9. [Mayo Clinic Study of Aging (MCSA)](#9-mayo-clinic-study-of-aging-mcsa)
10. [AIBL](#10-aibl)
11. [DIAN](#11-dian)
12. [AD Workbench / AD Discovery Portal](#12-ad-workbench--ad-discovery-portal)
13. [GAAIN](#13-gaain)
14. [AMP-AD (via Synapse)](#14-amp-ad-via-synapse)
15. [ADNI](#15-adni---bonus-essential-dataset)
16. [Allen SEA-AD](#16-allen-sea-ad)
17. [UK Biobank](#17-uk-biobank)
18. [EEG/MEG Datasets](#18-eegmeg-datasets-for-ad)
19. [Kaggle Datasets](#19-kaggle-ad-datasets)
20. [Additional/Recent Datasets](#20-additional--recent-datasets-2025-2026)
---
## 1. NACC
**National Alzheimer's Coordinating Center**
| Field | Details |
|-------|---------|
| **Subjects** | 54,000+ from 39+ Alzheimer's Disease Research Centers |
| **Apply URL** | https://naccdata.org/requesting-data/nacc-data/ |
| **DUA PDF** | https://files.alz.washington.edu/documentation/nacc_data_use_agreement.pdf |
### Application Requirements
- Sign a Data Use Agreement (DUA) -- takes ~15 minutes
- Submit electronic data request describing your research project
- Must search for similar existing projects to ensure no significant overlap
- Must identify a unique research hypothesis
- Open to any researcher affiliated with a scientific/educational institution
### Approval Timeline
- NACC acknowledges request within **3 business days**
- Approved researchers receive data within **48 hours** (excluding weekends/holidays)
- One of the fastest turnarounds of any AD dataset
### What You Get
- **Quick Access File**: UDS (Uniform Data Set) standardized longitudinal data (CSV)
- **Imaging**: MRI (T1w, FLAIR, DTI, T2) and PET scans in DICOM and NIfTI (.zip archives)
- **Biomarkers**: CSF biomarkers, APOE genotypes, fluid biomarker data (via NCRAD)
- **Genomics**: Genetic/genomic data (via NIAGADS)
- **Disease modules**: FTLD, Lewy Body Dementia, Down Syndrome modules
- **Modalities**: Clinical, neuropathology, MRI/PET, biospecimen, digital, EHR/Claims
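
Because the Quick Access File is longitudinal, a common first step is to count visits per subject. The sketch below does this with the standard library. The column names `NACCID` and `NACCVNUM` are assumptions about the CSV layout, and the sample rows are invented, so check both against the NACC data dictionary before relying on it.

```python
import csv
import io
from collections import defaultdict


def visits_per_subject(csv_text, id_col="NACCID", visit_col="NACCVNUM"):
    """Map each subject ID to its sorted list of visit numbers.

    id_col and visit_col are assumed column names, not confirmed ones.
    """
    visits = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        visits[row[id_col]].append(int(row[visit_col]))
    return {subject: sorted(numbers) for subject, numbers in visits.items()}


# Invented sample standing in for the real Quick Access File.
sample = "NACCID,NACCVNUM\nNACC000001,1\nNACC000001,2\nNACC000002,1\n"
summary = visits_per_subject(sample)

# Subjects with more than one visit are usable for longitudinal analysis.
multi_visit = [subject for subject, numbers in summary.items() if len(numbers) > 1]
```

For a real file, pass the file contents instead of `sample`, for example `visits_per_subject(open(path, newline="").read())` with `path` pointing at your downloaded CSV.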
### Restrictions
- Data may be used only by the individuals named in the request
- Must acknowledge NACC in publications
- Must notify NACC before and during publication submission
- Cannot attempt to re-identify participants
---
## 2. OASIS-3
**Open Access Series of Imaging Studies - 3**
| Field | Details |
|-------|---------|
| **Subjects** | 1,378 participants (ages 42-95); 755 cognitively normal, 622 with cognitive decline |
| **Apply URL** | https://www.nitrc.org/projects/oasis3/ (main) |
| **Tau PET Access** | https://sites.wustl.edu/oasisbrains/home/oasis-3/request-tau-access/ |
| **Contact** | oasis-brains@nrg.wustl.edu |
### Application Requirements
- Register through NITRC (Neuroimaging Informatics Tools and Resources Clearinghouse)
- Click "REQUEST ACCESS TO DATASETS" on the OASIS website
- For Tau PET data: send a detailed research statement to oasis-brains@nrg.wustl.edu
- Must already have OASIS-3 main dataset approval before requesting Tau data
### Approval Timeline
- Not explicitly stated; typically **1-2 weeks** for NITRC registration
- Tau data requires separate approval
### What You Get
- **MRI**: 2,842 sessions -- T1w, T2w, FLAIR, ASL, SWI, time-of-flight, resting-state BOLD, DTI
- **PET**: 2,157+ scans -- PIB, AV45 (amyloid), FDG, AV1451 (Tau; 451 baseline + 85 longitudinal)
- **CT**: 1,472 sessions
- **FreeSurfer**: Volumetric segmentations for many MR sessions
- **Clinical**: Cognitive assessments, UDS forms
- **Formats**: NIfTI, BIDS-compatible
- **Size**: Multiple TB total
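
With multiple terabytes of NIfTI files, it can help to inspect image dimensions before loading any voxel data. The following is a minimal sketch that reads only the 348-byte NIfTI-1 header using the standard library. It assumes NIfTI-1 files (not NIfTI-2) and is not a replacement for a full reader such as nibabel. The function names are illustrative.

```python
import gzip
import struct
from pathlib import Path

NIFTI1_HEADER_SIZE = 348


def parse_nifti1_header(raw):
    """Extract shape, voxel size, and datatype code from a NIfTI-1 header."""
    if len(raw) < NIFTI1_HEADER_SIZE:
        raise ValueError("need at least 348 bytes of header")
    # The first int32 (sizeof_hdr) must be 348; use it to detect endianness.
    for endian in ("<", ">"):
        (sizeof_hdr,) = struct.unpack_from(endian + "i", raw, 0)
        if sizeof_hdr == NIFTI1_HEADER_SIZE:
            break
    else:
        raise ValueError("not a NIfTI-1 header")
    if raw[344:348] not in (b"n+1\x00", b"ni1\x00"):
        raise ValueError("missing NIfTI-1 magic string")
    dim = struct.unpack_from(endian + "8h", raw, 40)      # dim[0] = number of dims
    datatype, bitpix = struct.unpack_from(endian + "2h", raw, 70)
    pixdim = struct.unpack_from(endian + "8f", raw, 76)   # pixdim[1..] = spacing
    ndim = dim[0]
    if not 1 <= ndim <= 7:
        raise ValueError("invalid dimension count")
    return {
        "shape": tuple(dim[1:1 + ndim]),
        "voxel_size": tuple(pixdim[1:1 + ndim]),
        "datatype": datatype,
        "bitpix": bitpix,
    }


def read_nifti1_header(path):
    """Read the header from a .nii or .nii.gz file without loading the image."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rb") as handle:
        return parse_nifti1_header(handle.read(NIFTI1_HEADER_SIZE))
```

Calling `read_nifti1_header` on each downloaded scan gives a quick inventory of matrix sizes and voxel spacing before committing to heavier preprocessing.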
### Restrictions
- Cannot attempt to identify participants (including facial recognition/3D rendering)
- Publications using AV45 or AV1451 PET data must be submitted to **Avid Radiopharmaceuticals** for review **30 days** before publication/presentation
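- As of Dec 2025, Tau data is integrated into the main OASIS-3 project; confirm on the OASIS site whether the separate Tau request described under Application Requirements is still needed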
---
## 3. Bio-Hermes-001
**Global Alzheimer's Platform Foundation**
| Field | Details |
|-------|---------|
| **Scale** | 80,000+ blood and digital test results (a count of results, not of participants) |
| **Apply URL** | https://www.alzheimersdata.org/ad-workbench (via AD Discovery Portal) |
| **Study Page** | https://globalalzplatform.org/biohermesstudy/ |
### Application Requirements
- Create a free account on AD Workbench (alzheimersdata.org)
- Request access through the AD Discovery Portal
- Accept Terms of Use
- Available since August 1, 2025
### Approval Timeline
- Varies by dataset; generally **days to weeks** after account approval
### What You Get
- **Blood biomarkers**: Comprehensive plasma/serum biomarker panels
- **Digital cognitive tests**: Novel digital cognitive assessment data
- **Retinal imaging**: Retinal exam data
- **Speech analysis**: Voice/speech biomarker data
- **PET imaging**: Traditional amyloid/tau PET
- **Diverse cohort**: Described as one of the most diverse AD datasets available
- **Breadth**: Described as one of the most comprehensive biomarker datasets in Alzheimer's research
### Restrictions
- Must use data for Alzheimer's/dementia research purposes
- Must accept AD Workbench Terms of Use
- Free access, no cost
---
## 4. ADSP / NIAGADS
**Alzheimer's Disease Sequencing Project / NIA Genetics of AD Data Storage Site**
| Field | Details |
|-------|---------|
| **Subjects** | 110,270+ (58,507 whole genomes + 20,503 whole exomes + more) |
| **Apply URL** | https://dss.niagads.org/ |
| **Application Instructions** | https://dss.niagads.org/documentation/data-application-and-submission/application-instructions/ |
| **ADSP Umbrella Dataset** | https://dss.niagads.org/datasets/ng00067/ |
### Application Requirements
1. Current **IRB approval** and protocol for the proposed project (must have 6+ months remaining)
2. **NIA Genomic Data Sharing Plan** (signed by PI and Institutional Signing Official)
3. **NIAGADS Data Distribution Agreement**
4. **Derived/Secondary Data Return Plan** describing what data you will return
5. Research use statement (technical and non-technical)
### Approval Timeline
- Applications reviewed by Data Access Committee
- Typically **4-8 weeks** (depends on completeness of application)
- Approval valid for **1 year** (renewable)
- IRB must have 6+ months validity at time of review
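
The six-month IRB rule is easy to check before submitting. The sketch below assumes "6+ months" means six calendar months counted from the expected review date. NIAGADS may count it differently, so treat the result as a pre-check only. Both function names are illustrative.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month
    (for example Aug 31 + 6 months -> Feb 28).
    """
    index = start.month - 1 + months
    year = start.year + index // 12
    month = index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def irb_window_ok(irb_expiry, review_date, min_months=6):
    """True if the IRB approval is still valid min_months after review_date."""
    return irb_expiry >= add_months(review_date, min_months)


# Example: review expected on 2026-04-05.
ok = irb_window_ok(date(2026, 10, 5), date(2026, 4, 5))
too_short = irb_window_ok(date(2026, 10, 4), date(2026, 4, 5))
```

Since review typically takes 4-8 weeks, it is safer to pass the latest plausible review date rather than the submission date.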
### What You Get
- **Whole Genome Sequences**: 58,507 in CRAMs, gVCFs
- **Whole Exome Sequences**: 20,503 in CRAMs, gVCFs
- **Quality-controlled VCFs**: Project-level variant calls
- **Harmonized phenotypes**: Standardized clinical data
- **Formats**: CRAM, gVCF, VCF, CSV
### Restrictions
- Must return derived/secondary data to NIAGADS upon publication or DAR expiration
- Data only for approved research use statement
- Local IRB approval required
- Cannot share with unauthorized users
---
## 5. HCP-Aging
**Human Connectome Project - Aging / AABC**
| Field | Details |
|-------|---------|
| **Subjects** | 1,396+ adults (ages 36-100+), 2,878 sessions |
| **Apply URL** | https://nda.nih.gov/ (via NIMH Data Archive permissions dashboard) |
| **Data Use Terms** | https://www.humanconnectome.org/study/hcp-lifespan-aging/data-use-terms |
| **Instructions** | https://www.humanconnectome.org/study/hcp-lifespan-aging/article/instructions-accessing-hcp-aging-data-releases-nda |
### Application Requirements
1. Create NDA (NIMH Data Archive) account
2. Submit **Data Use Certification (DUC)** via NDA Permissions Dashboard
3. Describe how data will be accessed, managed, and eventually deleted
4. Institutional sign-off required
5. Annual renewal with progress report
### Approval Timeline
- **2-4 weeks** for DUC approval (varies)
- Contact: NDAhelp@mail.nih.gov for status
### What You Get
- **Lifespan 2.0 Release**: 725 HCP-A participants + AABC Release 2 (1,396 participants)
- **Structural MRI**: T1w, T2w, high-res hippocampal T2
- **Functional MRI**: Resting state fMRI, task fMRI
- **Diffusion MRI**: DTI
- **ASL**: Arterial spin labeling perfusion
- **Phenotypic data**: Demographics, behavioral assessments
- **Total size**: 22+ TB
- **Formats**: NIfTI, CIFTI, CSV
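
At 22+ TB, download time is worth estimating before requesting the full release. This is a back-of-the-envelope sketch. It assumes decimal terabytes, a sustained link speed, and an `efficiency` factor for protocol overhead and contention, all of which are illustrative parameters rather than NDA figures.

```python
def transfer_hours(size_tb, link_gbps, efficiency=1.0):
    """Estimate hours needed to move size_tb terabytes over a link_gbps link."""
    size_bits = size_tb * 1e12 * 8                      # decimal TB -> bits
    seconds = size_bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


ideal = transfer_hours(22, 1.0)           # about 49 hours on a saturated 1 Gbit/s link
realistic = transfer_hours(22, 1.0, 0.5)  # twice as long at 50% effective throughput
```

If the estimate runs to several days, consider requesting only the modalities you need, since the release is organized by imaging type.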
### Restrictions
- Must adhere to consent-based data use limitations
- Cannot attempt to re-identify participants
- Must describe data management/deletion plan
- Annual renewal required
---
## 6. PREVENT-AD
**Pre-symptomatic Evaluation of Novel Treatments for AD**
| Field | Details |
|-------|---------|
| **Subjects** | 349 cognitively healthy at-risk participants (mean age 63) |
| **Open Data Portal** | https://openpreventad.loris.ca |
| **Registered Data Portal** | https://registeredpreventad.loris.ca |
| **Publication** | https://doi.org/10.1016/j.nicl.2021.102733 |
### Application Requirements
- **Open imaging data**: Freely accessible at openpreventad.loris.ca -- just register
- **Sensitive data** (CSF, genetics, cognition): Apply at registeredpreventad.loris.ca
- Must agree to standard good data use practices
- Must meet ethics requirements and keep data secure
- Findable through Canadian Open Neuroscience Platform (CONP)
### Approval Timeline
- **Open data**: Immediate after registration
- **Registered data**: Days to weeks for qualified researcher approval
### What You Get
- **Imaging**: Up to 5 years longitudinal MRI data (structural, functional)
- **Biomarkers**: Cerebrospinal fluid biochemistry
- **Genetics**: Genetic information
- **Cognitive**: Neurocognitive assessments
- **Neurosensory**: Sensory capacity measurements
- **Medical**: Clinical/medical information
- **Formats**: BIDS-compatible NIfTI, CSV
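
Because the imaging is BIDS-compatible, subject and session labels can be recovered from file names alone, which is handy for finding participants with repeat scans. The sketch below parses BIDS-style `key-value` entities with the standard library. The file paths are invented examples, not actual PREVENT-AD names, and the parser ignores parts of the BIDS specification that it does not need.

```python
import re
from collections import defaultdict

_ENTITY = re.compile(r"([A-Za-z0-9]+)-([A-Za-z0-9]+)")


def parse_bids_name(filename):
    """Split a BIDS-style file name into entities, suffix, and extension."""
    name = filename.rsplit("/", 1)[-1]
    stem, _, ext = name.partition(".")
    parts = stem.split("_")
    # The last underscore-separated chunk is the suffix (e.g. T1w, bold).
    suffix = parts.pop() if "-" not in parts[-1] else None
    entities = {}
    for part in parts:
        match = _ENTITY.fullmatch(part)
        if match is None:
            raise ValueError(f"not a key-value entity: {part!r}")
        entities[match.group(1)] = match.group(2)
    if "sub" not in entities:
        raise ValueError(f"no subject label in {filename!r}")
    return {"entities": entities, "suffix": suffix,
            "extension": "." + ext if ext else ""}


def sessions_by_subject(filenames):
    """Map each subject label to the sorted session labels seen for it."""
    sessions = defaultdict(set)
    for filename in filenames:
        entities = parse_bids_name(filename)["entities"]
        sessions[entities["sub"]].add(entities.get("ses", ""))
    return {sub: sorted(labels) for sub, labels in sessions.items()}


# Invented paths for illustration only.
files = [
    "sub-0001/ses-01/anat/sub-0001_ses-01_run-1_T1w.nii.gz",
    "sub-0001/ses-02/anat/sub-0001_ses-02_run-1_T1w.nii.gz",
    "sub-0002/ses-01/func/sub-0002_ses-01_task-rest_bold.nii.gz",
]
longitudinal = sessions_by_subject(files)
```

Subjects whose session list has more than one entry are candidates for longitudinal analysis.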
### Restrictions
- Must use for neuroscience research as stipulated in consent forms
- Cannot attempt to re-identify participants
- Must acknowledge PREVENT-AD in publications
---
## 7. ANMerge (AddNeuroMed)
| Field | Details |
|-------|---------|
| **Subjects** | 1,702 participants |
| **Access URL** | https://doi.org/10.7303/syn22252881 (via Synapse) |
| **Publication** | https://doi.org/10.3233/JAD-200948 |
### Application Requirements
- Create a free Synapse account (accounts.synapse.org)
- Accept data use conditions on Synapse
- No complex approval process -- this is an open-access dataset
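
Once the account exists and the conditions are accepted, the download can be scripted. The sketch below assumes the `synapseclient` package is installed and that you have created a personal access token in your Synapse account. `download_from_synapse` is an illustrative helper, not part of the ANMerge documentation, and it has not been run against this project.

```python
import re


def synapse_id_from_doi(doi_url):
    """Extract the Synapse ID (e.g. 'syn22252881') from a Synapse DOI URL."""
    match = re.search(r"syn\d+", doi_url)
    if match is None:
        raise ValueError(f"no Synapse ID found in {doi_url!r}")
    return match.group(0)


def download_from_synapse(syn_id, auth_token, dest_dir):
    """Recursively download a Synapse project or folder into dest_dir."""
    # Imported lazily so synapse_id_from_doi works without the package.
    import synapseclient
    import synapseutils

    syn = synapseclient.Synapse()
    syn.login(authToken=auth_token)
    return synapseutils.syncFromSynapse(syn, syn_id, path=dest_dir)


anmerge_id = synapse_id_from_doi("https://doi.org/10.7303/syn22252881")
```

Keep the token out of source control, for example by reading it from an environment variable before calling `download_from_synapse(anmerge_id, token, "anmerge_data")`.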
### Approval Timeline
- **Immediate to days** -- relatively straightforward once Synapse account is set up
### What You Get
- **Clinical assessments**: Longitudinal observational cohort data
- **MRI**: Magnetic resonance imaging
- **Genotyping**: Genetic variants
- **Transcriptomics**: Gene expression profiling (whole-blood RNA)
- **Proteomics**: Blood plasma proteomics
- **Formats**: CSV, processed data tables
- Fully interoperable between modalities with rigorous data curation
### Restrictions
- Must cite the dataset and primary publication
- Standard Synapse data use conditions apply
---
## 8. ROSMAP
**Religious Orders Study / Memory and Aging Project**
| Field | Details |
|-------|---------|
| **Subjects** | 3,600+ participants (longitudinal since early 1990s) |
| **AD Knowledge Portal** | https://adknowledgeportal.synapse.org/Explore/Studies/DetailsPage/StudyDetails?Study=syn3219045 |
| **RADC Hub** | https://www.radc.rush.edu/ |
| **Molecular Networks** | https://www.radc.rush.edu/molecular_networks/datasets.html |
### Application Requirements
- **Omics data on Synapse**: Register for free Synapse account; some datasets require a signed Data Use Certificate (DUC)
- **Clinical/demographic data**: Request through Rush Alzheimer's Disease Center (RADC) Research Resource Sharing Hub
- **Additional phenotypes**: Separate request through RADC
### Approval Timeline
- **Synapse open data**: Days (account registration)
- **DUC-protected data**: 2-4 weeks for approval
- **RADC data**: Variable, typically weeks
### What You Get
- **Genomics**: Whole-genome sequencing
- **Transcriptomics**: RNA-seq, single-nucleus RNA-seq
- **Epigenomics**: DNA methylation, histone modifications, chromatin accessibility
- **Proteomics**: Mass spectrometry-based proteomics
- **Metabolomics**: Metabolite profiling
- **Clinical**: Longitudinal clinical assessments (since 1990s)
- **Neuropathology**: Post-mortem brain tissue analyses
- **Formats**: FASTQ, BAM, VCF, CSV, H5AD
### Restrictions
- Must acknowledge data source in publications
- DUC-protected datasets (esp. Mayo/Broad samples from deceased individuals) have additional consent requirements
- Must use for research purposes consistent with informed consent
---
## 9. Mayo Clinic Study of Aging (MCSA)
| Field | Details |
|-------|---------|
| **Subjects** | 5,925 unique participants (clinical); 1,802 (imaging); ages 30-90 |
| **GAAIN Access** | https://www.gaaindata.org/partner/MCSA |
| **LONI IDA** | https://ida.loni.usc.edu/collaboration/access/appApply.jsp?project=MCSA |
| **Synapse** | https://adknowledgeportal.synapse.org/Explore/Studies/DetailsPage?Study=syn22024536 |
### Application Requirements
- Apply via LONI IDA or GAAIN
- Agree to Data Use Agreement
- Describe research project and collaborators
- Must be a qualified academic or industry researcher
### Approval Timeline
- **2-4 weeks** for LONI IDA review
- GAAIN requests reviewed individually
### What You Get
- **Clinical data**: Longitudinal data, 1-12 visits at ~15-month intervals
- **MRI**: T1w, T2w-FLAIR, diffusion MRI from 1,802 participants
- **Future releases**: De-faced amyloid PET images planned
- **Molecular data**: Whole-genome genotype + gene expression from 2,655 individuals (842M+ datapoints)
- **Interactive tool**: Multiomic Atlas of AD Brain Endophenotypes (free web app)
- **Formats**: NIfTI, CSV
### Restrictions
- Standard DUA restrictions apply
- Cannot re-identify participants
- Must acknowledge Mayo Clinic in publications
---
## 10. AIBL
**Australian Imaging, Biomarkers and Lifestyle**
| Field | Details |
|-------|---------|
| **Subjects** | 1,112 inception cohort (expanded to 2,359+) |
| **Apply URL** | https://ida.loni.usc.edu/collaboration/access/appApply.jsp?project=AIBL |
| **Joint AIBL+ADNI** | https://ida.loni.usc.edu/collaboration/access/appApply.jsp?project=AIBL&project=ADNI |
| **Website** | https://aibl.org.au/ |
### Application Requirements
- Apply via LONI IDA online form
- Carefully read and agree to the AIBL project Terms of Use
- Describe your research and list collaborators
- Can apply jointly for AIBL + ADNI data
### Approval Timeline
- **2-4 weeks** (similar to ADNI process via LONI)
### What You Get
- **Neuroimaging**: Amyloid PET (PiB, flutemetamol), FDG PET, structural MRI
- **Blood biomarkers**: Amyloid-beta, tau, inflammatory markers
- **CSF biomarkers**: Cerebrospinal fluid analytes
- **Cognitive assessments**: Battery of neuropsychological tests
- **Genetics**: APOE genotyping and broader genetic data
- **Lifestyle**: Diet, exercise, sleep, social engagement data
- **Longitudinal**: Assessments every 18 months
- **Formats**: DICOM/NIfTI (imaging), CSV (clinical)
### Restrictions
- Must comply with AIBL Terms of Use
- Acknowledge AIBL in publications
- Cannot re-identify participants
---
## 11. DIAN
**Dominantly Inherited Alzheimer Network**
| Field | Details |
|-------|---------|
| **Subjects** | 533 individuals across 206 families (autosomal dominant AD) |
| **Data Request Form** | https://dian.wustl.edu/dian-observational-data-request-form/ |
| **Website** | https://dian.wustl.edu |
| **Investigator Resources** | https://dian.wustl.edu/for-investigators/ |
### Application Requirements
- Submit DIAN Observational Data Request Form
- Research proposal reviewed by DIAN Obs Resource Committee
- Must be a qualified researcher
- Must accept and comply with DIAN data sharing/publication policies
- Strict publication policy: all publications using DIAN data must follow DIAN guidelines
### Approval Timeline
- **Weeks to months** -- committee review required
- More restrictive than most datasets
### What You Get
- **MRI**: Structural and functional brain imaging
- **PET**: Amyloid and tau PET scans
- **Clinical**: Longitudinal cognitive/clinical assessments
- **Biofluid**: CSF and blood biomarkers
- **Genetics**: Deep genetic phenotyping (PSEN1, PSEN2, APP mutations)
- **15+ years** of longitudinal data on autosomal dominant AD
- **Unique value**: Only large-scale dataset on dominantly inherited (genetic) AD
### Restrictions
- Strict publication and authorship policies
- Violations can result in being barred from future data/biospecimen requests
- Potential institutional involvement or legal action for policy deviations
- Must comply with DIAN-TU Data and Biospecimen Sharing Policy
---
## 12. AD Workbench / AD Discovery Portal
| Field | Details |
|-------|---------|
| **Datasets** | 100+ novel datasets across multiple modalities |
| **Portal URL** | https://discover.alzheimersdata.org |
| **Workbench URL** | https://www.alzheimersdata.org/ad-workbench |
| **How-To Guide** | https://www.alzheimersdata.org/how-to-use-the-ad-workbench |
### Application Requirements
- Create a free user account on AD Workbench
- Account creation is automatic; data permissions require review
- Use FAIR Search for data discovery
- Request workspace for analysis
- Accept Terms of Use
### Approval Timeline
- **Account creation**: Immediate
- **Data access permissions**: Variable per dataset (days to weeks)
### What You Get
- **100+ datasets**: Imaging, omics, clinical, multi-modal
- **Cloud workspaces**: Secure, private analysis environments
- **Tools**: Data visualization, curation, combination tools
- **Bio-Hermes-001**: Available through this portal
- **Free**: No cost for any tool or data access
### Restrictions
- Must be approved by ADDI
- Data use per individual dataset terms
- Research purposes only
---
## 13. GAAIN
**Global Alzheimer's Association Interactive Network**
| Field | Details |
|-------|---------|
| **Subjects** | ~500,000 from nearly 50 institutions worldwide |
| **Portal URL** | https://www.gaaindata.org |
| **Website** | https://gaain.org |
### Application Requirements
- Register on gaaindata.org
- Free access for researchers worldwide
- Use the Interrogator tool for cohort discovery and analysis
### Approval Timeline
- **Immediate to days** for most federated queries
- Individual partner datasets may have their own access requirements
### What You Get
- **Federated data platform**: Query across multiple cohorts simultaneously
- **Clinical data**: Cognitive scores, demographics, diagnoses
- **Imaging data**: Various imaging modalities from partner studies
- **Genomics**: Genetic data from contributing cohorts
- **Analytics tools**: Built-in analytics and visualization
- **Partner datasets**: MCSA, ADNI, and many international cohorts
### Restrictions
- Individual partner datasets retain their own data use policies
- Cannot download all raw data -- federated query model
- Must acknowledge GAAIN and contributing studies
---
## 14. AMP-AD (via Synapse)
**Accelerating Medicines Partnership - Alzheimer's Disease**
| Field | Details |
|-------|---------|
| **Portal URL** | https://adknowledgeportal.synapse.org |
| **Data Access Instructions** | https://adknowledgeportal.synapse.org/DataAccess/Instructions |
| **Account Registration** | https://accounts.synapse.org/ |
### Application Requirements
- Register for free Synapse account
- Browse public content freely
- Downloading data requires a Synapse login
- Some datasets require signed Data Use Certificate (DUC)
- DUC datasets (esp. from Mayo Clinic/Broad Institute with deceased donor samples) have additional review
### Approval Timeline
- **Open data**: Immediate after registration
- **DUC-protected data**: 1-4 weeks
### What You Get
- **Multi-omics**: Genomics, transcriptomics, epigenomics, proteomics, metabolomics
- **Studies include**: ROSMAP, MayoRNAseq, MSBB (Mount Sinai Brain Bank), many more
- **Clinical**: Longitudinal clinical and neuropathological data
- **Tools**: Analysis pipelines, pre-computed results
- **Formats**: FASTQ, BAM, VCF, CSV, H5AD, AnnData
### Restrictions
- Data use conditions per informed consent of each study
- Must acknowledge AMP-AD and NIA
- Some datasets have embargo periods for new data
---
## 15. ADNI - BONUS ESSENTIAL DATASET
**Alzheimer's Disease Neuroimaging Initiative**
| Field | Details |
|-------|---------|
| **Subjects** | 2,000+ across ADNI-1, ADNI-2, ADNI-GO, ADNI-3, ADNI-4 |
| **Apply URL** | https://ida.loni.usc.edu/collaboration/access/appApply.jsp |
| **Website** | https://adni.loni.usc.edu/ |
### Application Requirements
- Review and agree to ADNI Data Use Agreement
- Application reviewed by Data Sharing and Publications Committee (DPC)
- Must be affiliated with a scientific or educational institution
- Describe proposed research or data use
### Approval Timeline
- **~2 weeks** for DPC review
### What You Get
- **MRI**: Structural, functional, DTI
- **PET**: Amyloid (AV45/PiB), Tau (AV1451), FDG
- **Biomarkers**: CSF (amyloid-beta, tau, p-tau), blood biomarkers
- **Genomics**: GWAS, WGS, WES
- **Clinical**: Longitudinal cognitive/clinical assessments
- **Formats**: DICOM, NIfTI, CSV
### Restrictions (IMPORTANT - 2025 UPDATE)
- **New AI restriction**: The ADNI DUA now **explicitly forbids** sending the data to external AI tools
- AI tools may be used only inside your own university/company environment (no external release allowed)
- Cannot share data with others
- Must acknowledge ADNI in all publications
- Must follow ADNI publication policies
---
## 16. Allen SEA-AD
**Seattle Alzheimer's Disease Brain Cell Atlas**
| Field | Details |
|-------|---------|
| **Subjects** | 84 donors spanning AD pathology spectrum |
| **Open Data** | https://portal.brain-map.org/explore/seattle-alzheimers-disease/seattle-alzheimers-disease-brain-cell-atlas-download |
| **AWS Registry** | https://registry.opendata.aws/allen-sea-ad-atlas/ |
| **Controlled Data** | Via AD Knowledge Portal (Synapse) |
### Application Requirements
- **Open/processed data**: No application needed -- freely downloadable
- **Raw sequencing data**: Apply through Synapse AD Knowledge Portal
### Approval Timeline
- **Open data**: Immediate
- **Controlled raw data**: 1-4 weeks via Synapse
### What You Get
- **snRNA-seq**: Single-nucleus RNA sequencing
- **snATAC-seq**: Single-nucleus chromatin accessibility
- **Multiome**: Combined RNA + ATAC from same nuclei
- **Neuropathology**: Quantitative pathology data
- **Spatial transcriptomics**: MERFISH data
- **Formats**: H5AD, AnnData, CSV, FASTQ (raw)
- **Already in your project**: You have SEA-AD metadata in `/data/allen_sea_ad/`
### Restrictions
- Must cite per Allen Institute Citation Policy
- Cite both primary publication and specific dataset
---
## 17. UK Biobank
| Field | Details |
|-------|---------|
| **Subjects** | 500,000+ (26,000+ with brain MRI) |
| **Apply URL** | https://www.ukbiobank.ac.uk/enable-your-research/register |
| **Website** | https://www.ukbiobank.ac.uk |
### Application Requirements
- Register as a researcher
- Submit research application describing study
- Institutional affiliation required
- Application fee: ~2,000-5,000 GBP (varies by data type)
- Ethics approval may be required
### Approval Timeline
- **Several weeks to months** for full approval
### What You Get
- **Brain MRI**: 26,000+ participants with structural/functional imaging
- **4,000+ imaging-derived phenotypes**: Pre-computed brain measures
- **Genetics**: Genome-wide genotyping, exome sequencing, WGS
- **Clinical**: GP records, hospital admissions, cognitive tests
- **Lifestyle**: Diet, exercise, socioeconomic data
- **Longitudinal**: Repeat visits over 15+ years
- **Formats**: Various (bulk data downloads)
### Restrictions
- Application fee required
- Strict data security requirements
- Must return results/findings
- UK-based ethics oversight
- Not AD-specific but massive AD-relevant subset
---
## 18. EEG/MEG Datasets for AD
### 18a. OpenNeuro EEG AD Dataset (ds004504)
| Field | Details |
|-------|---------|
| **Subjects** | 88 (36 AD, 23 FTD, 29 healthy) |
| **URL** | https://openneuro.org/datasets/ds004504 |
| **Access** | Free, immediate download |
| **Format** | BIDS-compliant EEG (EEGLAB `.set`) |
| **License** | CC0 (public domain) |
| **Content** | Resting state EEG (eyes closed), raw + preprocessed |
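
After downloading, a quick sanity check is to confirm that the group counts match the 36/23/29 split listed above. BIDS datasets ship a tab-separated `participants.tsv`. The sketch below assumes it has a `Group` column with one code per diagnostic group. The codes `A`, `F`, and `C` in the sample are assumptions, so verify them against the dataset's own description.

```python
import csv
import io
from collections import Counter


def group_counts(tsv_text, group_col="Group"):
    """Count participants per diagnostic group in a BIDS participants.tsv."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row[group_col] for row in reader)


# Invented rows standing in for the real participants.tsv.
sample = (
    "participant_id\tGroup\n"
    "sub-001\tA\n"
    "sub-002\tA\n"
    "sub-003\tF\n"
    "sub-004\tC\n"
)
counts = group_counts(sample)
```

On the real file, the three counts should sum to 88. A mismatch usually means an incomplete download.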
### 18b. Complementary Photic Stimulation EEG Dataset (2025)
| Field | Details |
|-------|---------|
| **Subjects** | Same 88 participants as ds004504 |
| **Content** | Eyes-open photic stimulation recordings |
| **Format** | BIDS-compliant |
| **Published** | April 2025 |
### 18c. PEARL-Neuro Database
| Field | Details |
|-------|---------|
| **Subjects** | 192 middle-aged (50-63) at-risk participants |
| **URL** | https://openneuro.org/datasets/ds004796 |
| **Content** | EEG + fMRI + APOE/PICALM genetics + psychometric tests + blood tests |
| **Access** | Open access via OpenNeuro |
| **Format** | BIDS-compliant |
| **Unique value** | Multi-modal (EEG + fMRI) with genetic risk factors |
### 18d. LEAD Corpus (Research Reference)
| Field | Details |
|-------|---------|
| **Subjects** | 813 across 9 combined datasets (330 public, 483 private) |
| **Publication** | https://arxiv.org/html/2502.01678v1 |
| **Note** | World's largest EEG-AD corpus; public subset downloadable, private portion restricted |
---
## 19. Kaggle AD Datasets
### 19a. Alzheimer MRI 4-Class Dataset
| Field | Details |
|-------|---------|
| **URL** | https://www.kaggle.com/datasets/sachinkumar413/alzheimer-mri-dataset |
| **Classes** | Non-Demented, Very Mild, Mild, Moderate Demented |
| **Size** | ~6,400 images |
| **Format** | JPG/PNG |
| **Access** | Free, immediate download |
### 19b. Augmented Alzheimer MRI Dataset
| Field | Details |
|-------|---------|
| **URL** | https://www.kaggle.com/datasets/uraninjo/augmented-alzheimer-mri-dataset |
| **Content** | Augmented MRI slices for classification |
| **Access** | Free |
### 19c. Alzheimer's Disease Clinical Dataset
| Field | Details |
|-------|---------|
| **URL** | https://www.kaggle.com/datasets/rabieelkharoua/alzheimers-disease-dataset |
| **Content** | Clinical features, demographics, lifestyle, cognitive scores |
| **Access** | Free |
### 19d. OASIS-derived Kaggle Dataset
| Field | Details |
|-------|---------|
| **URL** | https://www.kaggle.com/datasets/jboysen/mri-and-alzheimers |
| **Content** | OASIS cross-sectional and longitudinal MRI data |
| **Access** | Free |
**Note**: Kaggle datasets are great for prototyping and model development but are NOT suitable for clinical validation or publications requiring primary data.
---
## 20. Additional & Recent Datasets (2025-2026)
### 20a. OASIS-4 (NEW)
- Latest release in the OASIS series
- MR, clinical, cognitive, and biomarker data for individuals with memory complaints
- Access: Same as OASIS-3 via sites.wustl.edu/oasisbrains
### 20b. CLARiTI (via NACC - NEW)
- URL: https://clariti.naccdata.org/for-researchers/access-data
- New collaborative data sharing initiative through NACC
- Focused on clinical trials integration
### 20c. Allen Brain Cell Atlas (Broader)
- URL: https://portal.brain-map.org/atlases-and-data/bkp/abc-atlas
- Broader brain cell atlas including AD-relevant cell types
- Python API: https://alleninstitute.github.io/abc_atlas_access/intro.html
### 20d. ADNI-4 (Latest Phase)
- URL: https://adni.loni.usc.edu/
- Newest ADNI phase with updated protocols
- Blood-based biomarker focus
- NOTE: New AI restrictions in DUA
### 20e. Bio-Hermes-002 (Upcoming)
- GAP Foundation + Alamar Biosciences collaboration announced January 2026
- Next-generation biomarker study building on Bio-Hermes-001
- Watch: https://globalalzplatform.org/
---
## SYNTHETIC / AUGMENTED AD DATA APPROACHES
Use these when real data is insufficient, or for pre-training:
| Approach | Description | Reference |
|----------|-------------|-----------|
| **CycleGAN MRI augmentation** | Generate synthetic MRI scans; reported 95% F1 (vs 89% without augmentation) | Frontiers in Medicine 2025 |
| **SMOTE for tabular data** | Oversample minority AD classes in clinical datasets | Multiple papers 2025 |
| **Diffusion model MRI generation** | Generate 2D slice projections of 3D MRI scans | ScienceDirect 2026 |
| **3D CNN with data augmentation** | Standard augmentation (flip, rotate, scale) on 3D MRI | arxiv 2505.04097 |
| **AdaBoost synthetic generation** | Boost training data diversity for clinical features | PMC 2025 |
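
To make the SMOTE row concrete, here is a minimal standard-library sketch of the core idea: each synthetic sample is a random point on the line between a minority sample and one of its nearest minority neighbours. It is an illustration rather than the method of any cited paper. The function name and parameters are illustrative, and production work would normally use a maintained implementation.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate n_synthetic samples by interpolating between minority samples.

    minority: list of equal-length numeric feature vectors (one class only).
    k: number of nearest neighbours to choose from for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base sample (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy 2-feature minority class for illustration.
minority_class = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote(minority_class, n_synthetic=10, k=2)
```

Apply oversampling to the training split only. Generating synthetic samples before the train/test split leaks information into evaluation.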
---
## PRIORITY APPLICATION ORDER (Recommended)
Based on ease of access, data richness, and relevance to multi-modal AD research:
### Tier 1 -- Apply Immediately (Fast Approval, High Value)
1. **NACC** -- 48hr turnaround, 54K subjects, multi-modal
2. **AD Workbench / Bio-Hermes-001** -- Free account, 80K+ results, diverse cohort
3. **ANMerge** -- Open on Synapse, 1,702 subjects, multi-modal
4. **Allen SEA-AD** -- Open download, single-cell multi-omics (you already have metadata)
5. **OpenNeuro EEG datasets** -- Immediate free download, BIDS format
6. **Kaggle datasets** -- Immediate, good for prototyping
### Tier 2 -- Apply This Week (1-4 Week Approval)
7. **OASIS-3** -- NITRC registration, rich imaging data
8. **AMP-AD / Synapse** -- Free account, massive multi-omics
9. **ROSMAP** -- Via Synapse + RADC, deep longitudinal omics
10. **AIBL** -- Via LONI IDA, good imaging + lifestyle data
11. **ADNI** -- ~2 week review, gold standard (watch AI restrictions)
12. **MCSA** -- Via GAAIN/LONI, large clinical + imaging release
### Tier 3 -- Apply When Ready (Longer Approval, More Requirements)
13. **HCP-Aging** -- NDA DUC required, 22+ TB connectome data
14. **ADSP/NIAGADS** -- IRB required, massive genomics
15. **DIAN** -- Committee review, unique genetic AD data
16. **UK Biobank** -- Fee required, months for approval, massive scale
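17. **PREVENT-AD registered** -- For pre-symptomatic biomarkers; needs qualified-researcher approval (the open imaging data is available immediately)
18. **GAAIN** -- Federated queries across ~500K subjects; registration is fast, but partner datasets keep their own access requirements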
---
## QUICK REFERENCE: ALL APPLICATION URLS
| Dataset | Application URL |
|---------|----------------|
| NACC | https://naccdata.org/requesting-data/nacc-data/ |
| OASIS-3 | https://www.nitrc.org/projects/oasis3/ |
| Bio-Hermes-001 | https://www.alzheimersdata.org/ad-workbench |
| ADSP/NIAGADS | https://dss.niagads.org/ |
| HCP-Aging | https://nda.nih.gov/ |
| PREVENT-AD (open) | https://openpreventad.loris.ca |
| PREVENT-AD (registered) | https://registeredpreventad.loris.ca |
| ANMerge | https://doi.org/10.7303/syn22252881 |
| ROSMAP | https://adknowledgeportal.synapse.org |
| MCSA | https://ida.loni.usc.edu/collaboration/access/appApply.jsp?project=MCSA |
| AIBL | https://ida.loni.usc.edu/collaboration/access/appApply.jsp?project=AIBL |
| DIAN | https://dian.wustl.edu/dian-observational-data-request-form/ |
| AD Workbench | https://discover.alzheimersdata.org |
| GAAIN | https://www.gaaindata.org |
| AMP-AD | https://adknowledgeportal.synapse.org |
| ADNI | https://ida.loni.usc.edu/collaboration/access/appApply.jsp |
| Allen SEA-AD | https://portal.brain-map.org/explore/seattle-alzheimers-disease |
| UK Biobank | https://www.ukbiobank.ac.uk/enable-your-research/register |
| OpenNeuro EEG | https://openneuro.org/datasets/ds004504 |
| PEARL-Neuro | https://openneuro.org/datasets/ds004796 |
---
## ESTIMATED TOTAL DATA AVAILABLE
| Category | Approximate Scale |
|----------|-------------------|
| **Total subjects across all datasets** | ~700,000+ (rough estimate; cohorts overlap and are not deduplicated) |
| **Genomics subjects** | ~150,000+ (ADSP, ROSMAP, UK Biobank) |
| **Neuroimaging subjects** | ~50,000+ (ADNI, OASIS, HCP, NACC, AIBL, UK Biobank) |
| **Clinical/cognitive subjects** | ~600,000+ (NACC, UK Biobank, GAAIN) |
| **Single-cell omics** | ~84 donors, millions of cells (SEA-AD) |
| **EEG subjects** | ~1,000+ (OpenNeuro, PEARL-Neuro, LEAD corpus) |
| **Blood biomarkers** | ~80,000+ results (Bio-Hermes-001) |
| **Pre-symptomatic/at-risk** | ~2,000+ (PREVENT-AD, DIAN, HCP-Aging) |
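
The smaller rows in this table can be checked by adding up the per-dataset counts given earlier in this guide. The sketch below does that arithmetic. It also shows that a naive sum for the clinical row exceeds the table's figure, so the larger rows should be read as rough estimates that allow for overlap between cohorts. How the table's estimates were derived is not stated, so that reading is an assumption.

```python
# Subject counts as listed in the dataset sections of this guide.
eeg = {"ds004504": 88, "PEARL-Neuro": 192, "LEAD corpus": 813}
at_risk = {"PREVENT-AD": 349, "DIAN": 533, "HCP-Aging/AABC": 1_396}
clinical = {"NACC": 54_000, "UK Biobank": 500_000, "GAAIN": 500_000}

eeg_total = sum(eeg.values())            # 1,093 -> matches "~1,000+"
at_risk_total = sum(at_risk.values())    # 2,278 -> matches "~2,000+"
clinical_naive = sum(clinical.values())  # 1,054,000 -> above "~600,000+"
```

Plain sums also double count participants who appear in more than one resource, for example cohorts that are federated through GAAIN and also distributed directly.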