+---
+tags:
+  - astronomy
+  - time-series
+  - light-curves
+  - variable-stars
+  - onnx
+library_name: onnx
+license: cc-by-4.0
+---
+# AstroM3 (photo encoder)
+## Paper
+Rizhko, M. & Bloom, J. S. (2024). *AstroM³: A self-supervised multimodal model for astronomy*. arXiv:2411.08842.
+```bibtex
+@article{rizhko2024astrom3,
+  author = {Rizhko, Mariia and Bloom, Joshua S.},
+  title = {{AstroM³}: A self-supervised multimodal model for astronomy},
+  journal = {arXiv preprint arXiv:2411.08842},
+  year = {2024}
+}
+```
+## Original code
+<https://github.com/MeriDK/AstroM3> (git submodule at `models/astrom3/code/`)
+## License
+- **Code** (this repository): MIT — see [LICENSE](LICENSE).
+- **Model weights** (`AstroMLCore/AstroM3-CLIP-photo`): Creative Commons Attribution 4.0 (CC BY 4.0).
+## Model overview
+AstroM3 is a self-supervised multimodal contrastive model for variable-star classification that jointly trains photometry (light-curve), spectra, and metadata encoders using a CLIP-style objective. This integration exports the **photo-only encoder** from the pretrained CLIP checkpoint (`AstroMLCore/AstroM3-CLIP-photo`) as an ONNX embedding model.
+The photo encoder is an [Informer](https://ojs.aaai.org/index.php/AAAI/article/view/17325/17132) transformer (ProbSparse attention, 8 layers, d_model=128) trained on ASAS-SN V-band variable-star light curves. For ONNX export, the ProbSparse attention layers are replaced with standard scaled dot-product attention, which is fully ONNX-exportable. ProbSparse is a sparse approximation of full attention, so the exported embeddings should be close to those of the original PyTorch model but are not guaranteed to match them exactly.
+## Inputs
+| Tensor | Shape | Description |
+|--------|-------|-------------|
+| `x_enc` | `[batch, 200, 9]` | Padded photometry features (9 channels per timestep — see preprocessing) |
+| `mask` | `[batch, 200]` | `1` for valid timesteps, `0` for padding |
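A minimal sketch of feeding these two tensors to the exported model with ONNX Runtime. The helper names `make_feed` and `embed` are illustrative and not part of this repository. The tensor names and shapes come from the table above, the `float32` dtype for both inputs follows the preprocessing recipe in this card, and `model_path` is wherever you saved `astrom3.onnx`.

```python
import numpy as np

SEQ_LEN, N_FEATURES = 200, 9  # fixed input shape from the Inputs table


def make_feed(x_enc, mask):
    """Validate shapes and cast both inputs to float32 (hypothetical helper)."""
    x_enc = np.ascontiguousarray(x_enc, dtype=np.float32)
    mask = np.ascontiguousarray(mask, dtype=np.float32)
    if x_enc.ndim != 3 or x_enc.shape[1:] != (SEQ_LEN, N_FEATURES):
        raise ValueError(f"x_enc must be [batch, 200, 9], got {x_enc.shape}")
    if mask.shape != x_enc.shape[:2]:
        raise ValueError(f"mask must be [batch, 200], got {mask.shape}")
    return {"x_enc": x_enc, "mask": mask}


def embed(model_path, x_enc, mask):
    """Run the ONNX model and return all of its outputs in graph order."""
    import onnxruntime as ort  # imported lazily so make_feed works without it

    session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    # Passing None as the output list asks ONNX Runtime for every graph output.
    return session.run(None, make_feed(x_enc, mask))
```

Validating shapes before calling the session gives a readable error instead of an ONNX Runtime shape-mismatch message.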
+## Outputs (ONNX)
+Single file `astrom3.onnx` with two named outputs:
+| Output | Shape | Aggregation |
+|--------|-------|-------------|
+| `mean` | `[batch, 128]` | Masked mean pool of encoder outputs |
+| `sequence` | `[batch, 200, 128]` | Full per-timestep encoder outputs (unmasked) |
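For readers who want to pool `sequence` themselves, here is a sketch of one standard definition of masked mean pooling. It is an assumption, not verified against the exported graph, that the `mean` output is computed exactly this way. `masked_mean` is an illustrative name.

```python
import numpy as np


def masked_mean(sequence, mask):
    """Average per-timestep vectors over valid positions only.

    sequence: [batch, T, D] encoder outputs; mask: [batch, T] with 1 = valid.
    """
    sequence = np.asarray(sequence, dtype=np.float32)
    weights = np.asarray(mask, dtype=np.float32)[..., None]   # [batch, T, 1]
    total = (sequence * weights).sum(axis=1)                  # [batch, D]
    # Clip the count so a fully padded row yields zeros instead of NaN.
    count = np.clip(weights.sum(axis=1), 1.0, None)           # [batch, 1]
    return total / count
```

Because `sequence` is returned unmasked, any custom pooling should apply `mask` explicitly, as done here.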
+## Preprocessing steps
+The 9 input channels per timestep are built by `preprocess_lc()` in the
+upstream dataset (`AstroMLCore/AstroM3Dataset`):
+| Index | Feature | How obtained |
+|-------|---------|--------------|
+| 0 | `time` (HJD scaled to [0, 1]) | per-observation |
+| 1 | `flux` = `(flux − mean) / MAD` | per-observation |
+| 2 | `flux_err` = `flux_err / MAD` | per-observation |
+| 3 | `amplitude` | **ASAS-SN catalog scalar, replicated to every timestep** |
+| 4 | `period` | **ASAS-SN catalog scalar, replicated** |
+| 5 | `lksl_statistic` (Lafler-Kinman string length) | **ASAS-SN catalog scalar, replicated** |
+| 6 | `rfr_score` (Random Forest Regressor R² for phase-folded LC) | **ASAS-SN catalog scalar, replicated** |
+| 7 | `log10(MAD_flux)` | global scalar computed from LC, replicated |
+| 8 | `delta_t` = `(max_HJD − min_HJD) / 365` | global scalar computed from LC, replicated |
+Features 3–6 come directly from the ASAS-SN V-band variable-star catalog
+(Jayasinghe et al. 2019) and are **not recomputed** from the light curve by
+this codebase. Users applying this model to non-ASAS-SN data must provide
+equivalent values (e.g. run a Lomb-Scargle period finder and compute
+peak-to-peak amplitude themselves).
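As a rough illustration of computing equivalents yourself, the sketch below finds a period with a classical Lomb-Scargle periodogram written in plain NumPy and takes a simple max-minus-min amplitude. The function names, the frequency grid, and the search bounds are assumptions chosen for illustration. The ASAS-SN catalog may define and scale `period` and `amplitude` differently, so treat the results as approximate stand-ins.

```python
import numpy as np


def lomb_scargle_period(t, y, min_period, max_period, n_freq=5000):
    """Return the period at the peak of a classical Lomb-Scargle periodogram."""
    t = np.asarray(t, dtype=np.float64)
    t = t - t.min()                      # keep phases small for large HJD values
    y = np.asarray(y, dtype=np.float64)
    y = y - y.mean()
    freqs = np.linspace(1.0 / max_period, 1.0 / min_period, n_freq)
    omega = 2.0 * np.pi * freqs[:, None]                      # (n_freq, 1)
    # The offset tau makes the sine and cosine terms orthogonal at each frequency.
    tau = np.arctan2(np.sin(2 * omega * t).sum(axis=1),
                     np.cos(2 * omega * t).sum(axis=1)) / (2.0 * omega[:, 0])
    phase = omega * (t[None, :] - tau[:, None])               # (n_freq, n_obs)
    c, s = np.cos(phase), np.sin(phase)
    power = 0.5 * ((c @ y) ** 2 / (c ** 2).sum(axis=1)
                   + (s @ y) ** 2 / (s ** 2).sum(axis=1))
    return 1.0 / freqs[np.argmax(power)]


def peak_to_peak(y):
    """Peak-to-peak amplitude as max minus min (sensitive to outliers)."""
    y = np.asarray(y, dtype=np.float64)
    return y.max() - y.min()
```

The returned period is in the same time unit as `t` (days if `t` is HJD). A maintained implementation such as a dedicated time-series library is preferable for production use; this version only shows the idea.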
+Preprocessing recipe for a single light curve:
+1. Deduplicate and sort observations by HJD.
+2. Compute `mean` and `MAD` of the flux column; normalize flux and flux_err.
+3. Scale HJD to [0, 1] over the span of the light curve.
+4. Compute `log10(MAD_flux)` and `delta_t = (max_HJD − min_HJD) / 365`.
+5. Obtain `amplitude`, `period`, `lksl_statistic`, `rfr_score` from the
+   ASAS-SN catalog (or compute equivalents).
+6. Tile the 6 global scalars across all timesteps; concatenate with columns
+   0–2 to produce an `(N, 9)` array.
+7. Pad or center-crop to 200 timesteps; set `mask = 0` for padded positions.
+8. Use `float32` for all tensors.
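The recipe above can be sketched in NumPy as follows. This is not the upstream `preprocess_lc()`; it is an illustrative reimplementation under several assumptions: `MAD` is taken as the unscaled median absolute deviation about the median, padding is zeros appended at the end, and the function name `build_inputs` is made up for this example.

```python
import numpy as np

SEQ_LEN = 200  # fixed sequence length expected by the encoder


def build_inputs(hjd, flux, flux_err, amplitude, period, lksl_statistic, rfr_score):
    """Return (x_enc, mask) with shapes (1, 200, 9) and (1, 200), both float32."""
    hjd = np.asarray(hjd, dtype=np.float64)
    flux = np.asarray(flux, dtype=np.float64)
    flux_err = np.asarray(flux_err, dtype=np.float64)

    # Step 1: np.unique sorts by HJD and keeps the first row of each duplicate.
    hjd, keep = np.unique(hjd, return_index=True)
    flux, flux_err = flux[keep], flux_err[keep]

    # Step 2: normalize flux and its error (MAD definition is an assumption).
    mad = np.median(np.abs(flux - np.median(flux)))
    span = hjd.max() - hjd.min()
    if mad == 0 or span == 0:
        raise ValueError("need non-constant flux and at least two distinct times")
    flux_n = (flux - flux.mean()) / mad
    err_n = flux_err / mad

    # Steps 3-4: scaled time and the two scalars derived from the light curve.
    t = (hjd - hjd.min()) / span
    scalars = np.array([amplitude, period, lksl_statistic, rfr_score,
                        np.log10(mad), span / 365.0])

    # Steps 5-6: tile the six scalars and join them with columns 0-2.
    feats = np.column_stack([t, flux_n, err_n, np.tile(scalars, (len(t), 1))])

    # Step 7: center-crop long curves, zero-pad short ones, and build the mask.
    n = len(feats)
    if n > SEQ_LEN:
        start = (n - SEQ_LEN) // 2
        feats, n = feats[start:start + SEQ_LEN], SEQ_LEN
    x = np.zeros((SEQ_LEN, 9), dtype=np.float32)   # Step 8: float32 throughout
    x[:n] = feats
    mask = np.zeros(SEQ_LEN, dtype=np.float32)
    mask[:n] = 1.0
    return x[None], mask[None]
```

Time is scaled before cropping, following the order of the recipe, so a cropped curve does not necessarily start at 0 or end at 1.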
+## Weights
+Source: <https://huggingface.co/AstroMLCore/AstroM3-CLIP-photo>
+The `model.safetensors` file is a standalone Informer checkpoint (classification head present but unused; loaded with `strict=False`).
+Dataset: ASAS-SN V-band variable-star light curves (`AstroMLCore/AstroM3Processed`).
+## Applying the model without ASAS-SN catalog features
+Features 3–6 require the ASAS-SN catalog. For users applying the model to
+other surveys, we measured how much the mean embedding changes when a
+catalog feature is replaced by a substitute value. Results are reported
+below for `rfr_score`, which was studied in detail.
+### rfr_score substitution
+`rfr_score` is the R² of a Random Forest Regressor fit to the phase-folded
+light curve; it quantifies period quality
+(Jayasinghe et al. 2019, MNRAS 486 1907, §5; arXiv:1809.07329).
+In the ASAS-SN test set it ranges from −3.5 to 1.18 (median ≈ 0.39).
+Setting all timesteps to the constant **0.392** (the empirical optimum,
+equal to the dataset median) minimises mean cosine distance from the
+true-feature embeddings:
+| Metric | Value |
+|--------|-------|
+| Overall mean cosine distance | 0.049 ± 0.091 |
+| Macro-average per class | 0.049 ± 0.058 |
+Per-class breakdown (5 samples per class from the ASAS-SN test split):
+| Class | Mean dist | Std | True rfr mean |
+|-------|-----------|-----|---------------|
+| EW    | 0.005 | 0.005 | −0.07 |
+| SR    | 0.004 | 0.003 | +0.50 |
+| EA    | 0.060 | 0.032 | +0.95 |
+| RRAB  | 0.020 | 0.011 | +0.83 |
+| EB    | 0.016 | 0.011 | +0.90 |
+| ROT   | 0.002 | 0.002 | +0.85 |
+| RRC   | 0.147 | 0.115 | −0.79 |
+| HADS  | 0.016 | 0.011 | +0.59 |
+| M     | 0.050 | 0.020 | +0.18 |
+| DSCT  | 0.170 | 0.182 | −0.86 |
+The two classes with the most negative true rfr means (RRC, DSCT), which
+lie furthest from 0.39, are most affected.
+Using an out-of-range value (e.g. ±100) produces cosine distances of
+roughly 0.93–0.97, so any substitute should stay within the range seen
+in training.
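A small sketch of the substitution and of the distance metric used in the tables above. The constant 0.392 and the channel index 6 come from this card; the function names are illustrative, and writing the constant only at valid (unmasked) timesteps is an assumption about how padding should be treated.

```python
import numpy as np

RFR_CHANNEL = 6        # index of rfr_score in the 9-channel input
RFR_FALLBACK = 0.392   # constant substitute reported in this card


def substitute_rfr(x_enc, mask, value=RFR_FALLBACK):
    """Return a copy of x_enc with rfr_score set to `value` at valid timesteps."""
    out = np.array(x_enc, dtype=np.float32, copy=True)
    valid = np.asarray(mask) > 0
    # Padded positions keep their original value (assumed to be padding zeros).
    out[..., RFR_CHANNEL] = np.where(valid, value, out[..., RFR_CHANNEL])
    return out


def cosine_distance(a, b):
    """1 - cosine similarity along the last axis."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    num = (a * b).sum(axis=-1)
    den = np.linalg.norm(a, axis=-1) * np.linalg.norm(b, axis=-1)
    return 1.0 - num / den
```

To repeat the sensitivity check on your own data, embed the same light curve with the true and the substituted `rfr_score` and compare the two `mean` outputs with `cosine_distance`.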