# nanocatalyst (depth=8, 25.2M params)

Minimal JAX/Flax transformer for catalyst structure generation with single-parameter depth scaling.
## Model Details

| Property | Value |
|---|---|
| Architecture | Transformer (RMSNorm, RoPE, QK-norm, ReLU², logit softcapping, residual scalars) |
| Parameters | 25.2M |
| Depth | 8 (n_embd=512, n_layer=8, n_head=8) |
| Vocab size | 186 (WordLevel, 2-digit pair encoding) |
| Training data | 174K OC20 structures |
| Training time | 97 min on TPU v6e-8 |
| Framework | JAX / Flax |
## Results (CuPt3 + OH, T=0.8, top_k=40, 100 samples)
| Metric | Result |
|---|---|
| Parseable | 96/100 |
| Element Match | 96/100 |
| Generation Validity | 96/100 (96.0%) |
| Uniqueness | 96/96 (100.0%) |
| Novelty | 96/96 (100.0%) |
| Min Distance (≥ 0.5Å) | 83/96 (86.5%) |
## Usage

```python
from catalyst.hub import download_checkpoint
from catalyst.config import CatalystConfig
from catalyst.generate import generate_samples

# Download checkpoint
ckpt_path = download_checkpoint("everythingchalna/nanocatalyst")
config = CatalystConfig.load(ckpt_path / "config.json")

# Load params and generate (see README for full example)
```
## Training

Trained on 174K structures from the OC20 S2EF dataset on a TPU v6e-8 (Google TRC program): 20 epochs, a WSD (warmup-stable-decay) learning-rate schedule, and the AdamW optimizer. Final val_loss = 0.9518.
## Files

- `config.json`: Model configuration
- `params/`: Orbax checkpoint (model parameters)
- `tokenizer.json`: HuggingFace WordLevel tokenizer
- `tokenizer_stats.json`: Tokenizer coverage statistics
## Citation

```bibtex
@software{nanocatalyst,
  title   = {nanocatalyst},
  url     = {https://github.com/everythingchalna/nanocatalyst},
  license = {MIT}
}
```
## Acknowledgments
Training compute provided by the Google TPU Research Cloud (TRC) program.