---
license: bsd-3-clause
language:
- en
tags:
- pytorch
- materials-science
- crystallography
- x-ray-diffraction
- pxrd
- convnext
- arxiv:2603.23367
datasets:
- materials-project
metrics:
- accuracy
- mae
pipeline_tag: other
---

# AlphaDiffract – Open Weights

[arXiv](https://arxiv.org/abs/2603.23367) | [GitHub](https://github.com/AdvancedPhotonSource/AlphaDiffract)

**Automated crystallographic analysis of powder X-ray diffraction data.**

AlphaDiffract is a multi-task 1D ConvNeXt model that takes a powder X-ray diffraction (PXRD) pattern and simultaneously predicts:

| Output | Description |
|---|---|
| **Crystal system** | 7-class classification (Triclinic → Cubic) |
| **Space group** | 230-class classification |
| **Lattice parameters** | 6 values: a, b, c (Å), α, β, γ (°) |

This release contains a **single model** trained exclusively on
[Materials Project](https://next-gen.materialsproject.org/) structures
(publicly available data). It is *not* the 10-model ensemble reported in
the paper – see [Performance](#performance) for details.

## Quick Start

```bash
pip install torch safetensors numpy huggingface-hub
```

```python
import sys
import torch
import numpy as np
from huggingface_hub import snapshot_download

# Download model files
model_dir = snapshot_download("linked-liszt/OpenAlphaDiffract")

# Load model
sys.path.insert(0, model_dir)
from model import AlphaDiffract
model = AlphaDiffract.from_pretrained(model_dir, device="cpu")

# 8192-point intensity pattern: floor negatives at zero, rescale to [0, 100]
pattern = np.load("my_pattern.npy").astype(np.float32)
pattern = np.clip(pattern, 0.0, None)
pattern = (pattern - pattern.min()) / (pattern.max() - pattern.min() + 1e-10) * 100.0
x = torch.from_numpy(pattern).unsqueeze(0)

with torch.no_grad():
    out = model(x)

cs_probs = torch.softmax(out["cs_logits"], dim=-1)
sg_probs = torch.softmax(out["sg_logits"], dim=-1)
lp = out["lp"]  # [a, b, c, alpha, beta, gamma]

print("Crystal system:", AlphaDiffract.CRYSTAL_SYSTEMS[cs_probs.argmax().item()])
print("Space group:   #", sg_probs.argmax().item() + 1)
print("Lattice params:", lp[0].tolist())
```

See `example_inference.py` for a complete runnable example.

## Files

| File | Description |
|---|---|
| `model.safetensors` | Model weights (safetensors format, ~35 MB) |
| `model.py` | Standalone model definition (pure PyTorch, no Lightning) |
| `config.json` | Architecture and training hyperparameters |
| `maxsub.json` | Space-group subgroup graph (230×230, used as a registered buffer) |
| `example_inference.py` | End-to-end inference example |
| `LICENSE` | BSD 3-Clause |


## Input Format

- **Length:** 8192 equally-spaced intensity values
- **2θ range:** 5–20° (monochromatic, 20 keV)
- **Preprocessing:** floor negatives at zero, then rescale to [0, 100]
- **Shape:** `(batch, 8192)` or `(batch, 1, 8192)`
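
The preprocessing steps above can be sketched as a small helper. `preprocess_pattern` is a hypothetical name, not part of the released code; the linear resampling step is an assumption for patterns collected on a different grid.

```python
import numpy as np

def preprocess_pattern(intensities, n_points: int = 8192) -> np.ndarray:
    """Resample to 8192 points, floor negatives at zero, rescale to [0, 100].

    Hypothetical helper illustrating the documented input format.
    """
    y = np.asarray(intensities, dtype=np.float32)
    # Resample onto equally spaced points if the input grid differs
    if y.size != n_points:
        x_old = np.linspace(0.0, 1.0, y.size)
        x_new = np.linspace(0.0, 1.0, n_points)
        y = np.interp(x_new, x_old, y).astype(np.float32)
    # Floor negative intensities (background subtraction can undershoot)
    y = np.clip(y, 0.0, None)
    # Rescale to [0, 100]; the epsilon guards all-zero patterns
    y = (y - y.min()) / (y.max() - y.min() + 1e-10) * 100.0
    return y
```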

## Architecture

1D ConvNeXt backbone adapted from [Liu et al. (2022)](https://arxiv.org/abs/2201.03545):

```
Input (8192) → [ConvNeXt Block × 3 with AvgPool] → Flatten (560-d)
  ├─ CS head:  MLP 560→2300→1150→7    (crystal system)
  ├─ SG head:  MLP 560→2300→1150→230  (space group)
  └─ LP head:  MLP 560→512→256→6      (lattice parameters, sigmoid-bounded)
```

- **Parameters:** 8,734,989
- **Activation:** GELU
- **Stochastic depth:** 0.3
- **Head dropout:** 0.5
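
For readers unfamiliar with ConvNeXt, a generic 1D adaptation of the Liu et al. block looks roughly like this: a depthwise convolution, LayerNorm, then an inverted-bottleneck MLP with GELU and a residual connection. This is an illustrative sketch with assumed kernel size and expansion ratio; the exact block (including stochastic depth) lives in `model.py`.

```python
import torch
import torch.nn as nn

class ConvNeXtBlock1d(nn.Module):
    """Illustrative 1D ConvNeXt block; see `model.py` for the released version."""

    def __init__(self, dim: int, expansion: int = 4):
        super().__init__()
        # Depthwise conv mixes along the pattern axis, one filter per channel
        self.dwconv = nn.Conv1d(dim, dim, kernel_size=7, padding=3, groups=dim)
        self.norm = nn.LayerNorm(dim)
        # Inverted bottleneck: expand, GELU, project back
        self.pwconv1 = nn.Linear(dim, expansion * dim)
        self.act = nn.GELU()
        self.pwconv2 = nn.Linear(expansion * dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, length)
        residual = x
        x = self.dwconv(x)
        x = x.transpose(1, 2)  # (batch, length, channels) for LayerNorm/Linear
        x = self.pwconv2(self.act(self.pwconv1(self.norm(x))))
        x = x.transpose(1, 2)
        return residual + x
```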

## Performance

This is a **single model** trained on Materials Project data only (no ICSD).
Metrics on the best validation checkpoint (epoch 11):

| Metric | Simulated Val | RRUFF (experimental) |
|---|---|---|
| Crystal system accuracy | 74.88% | 60.35% |
| Space group accuracy | 57.31% | 38.28% |
| Lattice parameter MAE | 2.71 | – |

The paper reports higher numbers from a 10-model ensemble trained on
Materials Project + ICSD combined data. This open-weights release covers
only publicly available training data.

## Training Details

| | |
|---|---|
| **Data** | ~146k Materials Project structures, 100 GSAS-II simulations each |
| **Augmentation** | Poisson + Gaussian noise, rescaled to [0, 100] |
| **Optimizer** | AdamW (lr=2e-4, weight_decay=0.01) |
| **Scheduler** | CyclicLR (triangular2, 6-epoch half-cycles) |
| **Loss** | CE (crystal system) + CE + GEMD (space group) + MSE (lattice params) |
| **Hardware** | 1× NVIDIA H100, float32 |
| **Batch size** | 64 |
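
The loss row above can be read as a plain sum over the three heads. Below is a minimal sketch of that combination, assuming equal weighting (the released training code may weight the terms differently). The GEMD term, which uses the `maxsub.json` subgroup graph, is not reproduced here and is left as a caller-supplied callable.

```python
import torch
import torch.nn.functional as F

def multitask_loss(out, cs_target, sg_target, lp_target, gemd_fn=None):
    """Illustrative multi-task loss; equal weights are an assumption."""
    loss = F.cross_entropy(out["cs_logits"], cs_target)          # crystal system CE
    loss = loss + F.cross_entropy(out["sg_logits"], sg_target)   # space group CE
    if gemd_fn is not None:
        # Graph earth-mover's distance over the space-group subgroup graph
        loss = loss + gemd_fn(out["sg_logits"], sg_target)
    loss = loss + F.mse_loss(out["lp"], lp_target)               # lattice params MSE
    return loss
```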

## Citation

```bibtex
@article{andrejevic2026alphadiffract,
  title   = {AlphaDiffract: Automated Crystallographic Analysis of Powder X-ray Diffraction Data},
  author  = {Andrejevic, Nina and Du, Ming and Sharma, Hemant and Horwath, James P. and Luo, Aileen and Yin, Xiangyu and Prince, Michael and Toby, Brian H. and Cherukara, Mathew J.},
  year    = {2026},
  eprint  = {2603.23367},
  archivePrefix = {arXiv},
  primaryClass  = {cond-mat.mtrl-sci},
  doi     = {10.48550/arXiv.2603.23367},
  url     = {https://arxiv.org/abs/2603.23367}
}
```

## License

BSD 3-Clause – Copyright 2026 UChicago Argonne, LLC.

## Links

- [arXiv: 2603.23367](https://arxiv.org/abs/2603.23367)
- [GitHub: OpenAlphaDiffract](https://github.com/AdvancedPhotonSource/OpenAlphaDiffract)
- [GitHub: AlphaDiffract](https://github.com/AdvancedPhotonSource/AlphaDiffract)