---
license: mit
tags:
- astronomy
- time-series
- light-curves
- onnx
library_name: onnx
---

# Astromer 2

## Paper

Donoso-Oliva, C., Becker, I., Protopapas, P., Cabrera-Vives, G., Cádiz-Leyton, M., & Moreno-Cartagena, D. (2026). *Generalizing across astronomical surveys: Few-shot light curve classification with Astromer 2*. Astronomy & Astrophysics (in press).

```bibtex
@article{astromer2,
  author  = {Donoso-Oliva, C. and Becker, I. and Protopapas, P. and
             Cabrera-Vives, G. and C{\'a}diz-Leyton, M. and Moreno-Cartagena, D.},
  title   = {Generalizing across astronomical surveys: Few-shot light curve
             classification with {Astromer} 2},
  journal = {Astronomy \& Astrophysics},
  year    = {2026},
  note    = {In press},
}
```

## Original code

<https://github.com/astromer-science/main-code> (git submodule at `models/astromer2/code/`)

## License

MIT – see [LICENSE](LICENSE).

## Model overview

Astromer 2 is a BERT-inspired transformer encoder pretrained on 1.5 million MACHO light curves via masked magnitude prediction. The encoder processes irregularly sampled photometric time series (time, magnitude) using MJD-aware positional encoding and a trainable mask token. It produces per-timestep contextual embeddings that can be aggregated into a fixed-size representation for downstream tasks such as few-shot classification.

Default configuration: 6 attention blocks, 4 heads, head dimension 64 (d_model = 256), sequence length 200, embedding dimension 256.

## Input data format

The model was pretrained on MACHO survey photometry. MACHO light curves consist of triples `(mjd, mag, err)` where:
- `mjd` – Modified Julian Date of each observation (~48800–51700 for MACHO)
- `mag` – MACHO instrumental magnitude (typically negative values, e.g. −10 to −3 in the MACHO system)
- `err` – photometric error; some observations carry large negative sentinel values (e.g. −3000, −9000) indicating bad data. **These are passed through the pipeline as-is, without filtering** (see the example below).
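
For illustration, a light curve in this format is just an `[n_obs, 3]` NumPy array; the values below are invented for the example and are not real MACHO measurements:

```python
import numpy as np

# Columns: mjd, mag, err. One light curve with four observations.
# The last row carries a bad-data sentinel in the error column,
# which the pipeline passes through unfiltered.
lc = np.array([
    [49001.23, -5.41, 0.021],
    [49003.71, -5.38, 0.019],
    [49010.05, -5.44, 0.025],
    [49012.88, -5.40, -9000.0],
])
```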

## Preprocessing steps

All steps are implemented in `code/src/data/loaders.py` (`get_loader`) and `code/src/data/preprocessing.py`.

### Step 1 – Windowing

If the light curve has more than 200 observations, take the first 200 (non-random, sequential window). If it has fewer than 200, use all observations and pad in step 3.
Source: `src/data/preprocessing.py:to_windows` with `sampling=False`.
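
In code, this step is a simple head slice. A minimal sketch assuming an `[n_obs, 3]` NumPy array `lc` (an illustration inferred from the description, not the original `to_windows` implementation):

```python
window = lc[:200]    # keep at most the first 200 rows of the [n_obs, 3] array
n_obs = len(window)  # remembered for mask construction in step 3
```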

### Step 2 – Zero-mean normalization

Subtract the per-light-curve column mean from **all three columns** (time, magnitude, error):

```python
x_norm = x - x.mean(axis=0)  # x is an [n_obs, 3] NumPy array (time, mag, err)
```

After this step, `times` and `input` (magnitudes) are centred around zero. The error column is also normalised but is discarded before the encoder (see step 4).

Source: `src/data/preprocessing.py:standardize`.

### Step 3 – Padding and mask construction

Right-pad the normalised sequence to exactly 200 time steps with zeros. Construct `mask_in`:

```
mask_in[i] = 0 for i < n_obs   (real observation – visible to encoder)
mask_in[i] = 1 for i >= n_obs  (padding – hidden from encoder)
```

> **Note on mask convention:** the internal pipeline uses `mask_in=0` for visible positions and `mask_in=1` for padding/hidden positions. This is the opposite of the ONNX interface (see below).
Source: `src/data/masking.py:mask_sample`, padding block at the end.
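
A minimal NumPy sketch of padding and mask construction under the internal convention, reusing `window` and `n_obs` from the step 1 sketch (an illustration inferred from the description, not the original `mask_sample` code):

```python
import numpy as np

MAX_LEN = 200

# Right-pad the normalised [n_obs, 3] window to exactly 200 time steps.
padded = np.zeros((MAX_LEN, 3), dtype=np.float32)
padded[:n_obs] = window - window.mean(axis=0)  # steps 1-2 applied

# Internal convention: 0 = real observation, 1 = padding.
mask_in = np.ones((MAX_LEN, 1), dtype=np.float32)
mask_in[:n_obs] = 0.0
```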

### Step 4 – Format encoder inputs

Extract the encoder inputs from the normalised, padded array:

| Tensor | Source | Shape |
|--------|--------|-------|
| `input` | normalised magnitude column | `[batch, 200, 1]` |
| `times` | normalised time column | `[batch, 200, 1]` |
| `mask_in` | constructed in step 3 | `[batch, 200, 1]` |

The normalised error column is **not** fed to the encoder. Errors appear only in the pretraining reconstruction loss.
Source: `src/data/loaders.py:format_inp_astromer` (`aversion='base'`).
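
Putting steps 1–4 together for the ONNX models described below, a hypothetical helper (the function name and single-light-curve batching are illustrative, not part of the original pipeline; note that it already emits the ONNX mask convention of the next section):

```python
import numpy as np

def prepare_inputs(lc: np.ndarray, max_len: int = 200):
    """One [n_obs, 3] light curve -> (input, times, mask_in) with batch size 1."""
    window = lc[:max_len]                   # step 1: sequential window
    window = window - window.mean(axis=0)   # step 2: zero-mean normalization
    n_obs = len(window)

    padded = np.zeros((max_len, 3), dtype=np.float32)
    padded[:n_obs] = window                 # step 3: right-pad with zeros

    mag = padded[None, :, 1:2]              # step 4: [1, 200, 1] magnitudes
    times = padded[None, :, 0:1]            # [1, 200, 1] times
    mask_in = np.zeros((1, max_len, 1), dtype=np.float32)
    mask_in[:, :n_obs] = 1.0                # ONNX convention: 1 = valid
    return mag, times, mask_in
```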

## Inputs (ONNX)

The exported ONNX models use a **user-friendly mask convention** that is the inverse of the internal pipeline's:

| Tensor | Shape | Description |
|--------|-------|-------------|
| `input` | `[batch, 200, 1]` | Zero-mean normalised magnitudes (step 2 above) |
| `times` | `[batch, 200, 1]` | Zero-mean normalised times (step 2 above) |
| `mask_in` | `[batch, 200, 1]` | **1 = valid observation, 0 = padding** |

The ONNX wrapper inverts `mask_in` internally before passing it to the encoder, so consumers can use the intuitive convention.
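
A minimal inference sketch with `onnxruntime`, reusing the hypothetical `prepare_inputs` helper from step 4; the tensor names match the table above, but verify them with `session.get_inputs()` if in doubt:

```python
import onnxruntime as ort

session = ort.InferenceSession("astromer2_mean.onnx")

# lc is an [n_obs, 3] array of (mjd, mag, err) observations.
mag, times, mask = prepare_inputs(lc)
(embedding,) = session.run(None, {"input": mag, "times": times, "mask_in": mask})
print(embedding.shape)  # expected: (1, 256)
```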

## Outputs (ONNX)

| File | Output shape | Aggregation |
|------|--------------|-------------|
| `astromer2_mean.onnx` | `[batch, 256]` | Masked mean pooling: `sum(z * mask_in) / sum(mask_in)` |
| `astromer2_max.onnx` | `[batch, 256]` | Masked max pooling over valid timesteps |
| `astromer2_full.onnx` | `[batch, 200, 256]` | Full per-timestep sequence; consumer aggregates |

ONNX opset: 13.
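
With `astromer2_full.onnx`, aggregation is up to the consumer. A masked mean equivalent to `astromer2_mean.onnx`, assuming `z` is the `[batch, 200, 256]` output and `mask_in` the `[batch, 200, 1]` input tensor:

```python
# Average embeddings over valid timesteps only; the mask broadcasts
# across the 256 embedding channels.
pooled = (z * mask_in).sum(axis=1) / mask_in.sum(axis=1)  # [batch, 256]
```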

## Weights

- Source: [Zenodo record 18207945](https://zenodo.org/records/18207945)
- Training dataset: MACHO (1.5 million light curves, V and R bands)
- Checkpoint: `astromer_v2/macho/`