Astromer 1

Paper

Donoso-Oliva, C., Becker, I., Protopapas, P., Cabrera-Vives, G., Förster, F., & Estévez, P. A. (2023). ASTROMER: A transformer-based embedding for the representation of light curves. Astronomy & Astrophysics, 670, A54.

@article{astromer1,
  author  = {Donoso-Oliva, C. and Becker, I. and Protopapas, P. and
             Cabrera-Vives, G. and F{\"o}rster, F. and Est{\'e}vez, P. A.},
  title   = {{ASTROMER}: A transformer-based embedding for the representation
             of light curves},
  journal = {Astronomy \& Astrophysics},
  volume  = {670},
  pages   = {A54},
  year    = {2023},
  doi     = {10.1051/0004-6361/202243928},
}

Original code

https://github.com/astromer-science/main-code (Astromer v1 tag)

License

MIT — see LICENSE.

Model overview

Astromer 1 is a transformer encoder pretrained on MACHO R-band light curves via masked magnitude prediction. It maps irregularly-sampled photometric time series to per-timestep contextual embeddings using an MJD-aware sinusoidal positional encoding. The architecture uses 2 transformer layers, 4 attention heads, and a head dimension of 64, producing 256-dimensional embeddings.
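The MJD-aware positional encoding evaluates the usual transformer sinusoids at the (mean-centered) observation times rather than at integer positions, so irregular sampling gaps are reflected in the encoding. A minimal numpy sketch, assuming the standard 1/10000^(2i/d) frequency schedule with d_model = 256 (function name and defaults are ours, not from the upstream code):

```python
import numpy as np

def time_positional_encoding(times, d_model=256):
    """Sinusoidal encoding evaluated at continuous times (days), not integer indices.

    times: [seq_len] array of mean-centered observation times.
    Returns: [seq_len, d_model] float32 array.
    """
    # Standard transformer frequency schedule: 1 / 10000^(2i / d_model).
    i = np.arange(d_model // 2)
    freqs = 1.0 / (10000.0 ** (2 * i / d_model))     # [d_model/2]
    angles = times[:, None] * freqs[None, :]         # [seq_len, d_model/2]
    pe = np.empty((times.shape[0], d_model), dtype=np.float32)
    pe[:, 0::2] = np.sin(angles)                     # even channels: sine
    pe[:, 1::2] = np.cos(angles)                     # odd channels: cosine
    return pe

t = np.array([0.0, 1.3, 5.7], dtype=np.float32)     # irregular, mean-centered times
pe = time_positional_encoding(t)
```

Because the argument is a continuous time rather than an index, two observations one day apart get the same relative encoding regardless of how many rows separate them.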

Inputs

All tensors are float32. Both magnitudes and times are zero-mean normalized before being passed to the model (subtract the per-window mean from each).

| Tensor  | Shape           | Description                                |
|---------|-----------------|--------------------------------------------|
| input   | [batch, 200, 1] | mag − mean(mag) over the window            |
| times   | [batch, 200, 1] | time − mean(time) over the window          |
| mask_in | [batch, 200, 1] | 1 = valid observation, 0 = padded position |

Outputs (ONNX)

| File                | Output shape     | Aggregation                              |
|---------------------|------------------|------------------------------------------|
| astromer1_mean.onnx | [batch, 256]     | Masked mean pooling over valid timesteps |
| astromer1_max.onnx  | [batch, 256]     | Masked max pooling over valid timesteps  |
| astromer1_full.onnx | [batch, 200, 256]| Full per-timestep sequence               |
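The mean and max variants are equivalent to pooling the full per-timestep output over valid positions only. A numpy sketch of that aggregation (random values stand in for the encoder output; names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.standard_normal((2, 200, 256)).astype(np.float32)  # stand-in for the full output
mask = np.zeros((2, 200, 1), dtype=np.float32)
mask[0, :50] = 1.0    # first curve: 50 valid observations, rest padding
mask[1, :200] = 1.0   # second curve: full window

# Masked mean: sum over valid timesteps divided by the count of valid timesteps.
mean_emb = (h * mask).sum(axis=1) / mask.sum(axis=1)       # [2, 256]

# Masked max: force padded positions to -inf so they never win the max.
max_emb = np.where(mask > 0, h, -np.inf).max(axis=1)       # [2, 256]
```

Padded rows therefore never contribute to either embedding, which is why the mask must mark them with 0.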

ONNX opset: 13.
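End-to-end inference with onnxruntime might look like the following sketch. The feed names (input, times, mask_in) are taken from the table above, but the actual graph input names should be verified with sess.get_inputs(); the session block is guarded so the example degrades gracefully when the model file is absent:

```python
import os
import numpy as np

# Assemble the three feeds described in the Inputs table.
feeds = {
    "input":   np.zeros((1, 200, 1), dtype=np.float32),  # mean-centered magnitudes
    "times":   np.zeros((1, 200, 1), dtype=np.float32),  # mean-centered times
    "mask_in": np.zeros((1, 200, 1), dtype=np.float32),  # 1 = valid, 0 = padded
}
feeds["mask_in"][0, :120] = 1.0  # e.g. a curve with 120 real observations

if os.path.exists("astromer1_mean.onnx"):
    import onnxruntime as ort
    sess = ort.InferenceSession("astromer1_mean.onnx")
    (embedding,) = sess.run(None, feeds)  # [1, 256] per the Outputs table
```

Swapping in astromer1_full.onnx would instead return a [1, 200, 256] per-timestep sequence.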

Preprocessing steps

Photometric errors are not used at inference — only time and magnitude are needed. The upstream code internally expects a 3-column [time, mag, err] array, but the error column is dead code in the encoder (extracted but never used). Pass dummy zeros if running the pipeline directly.

  1. Collect observation times (in days — need not be absolute MJD) and magnitudes.
  2. Truncate each light curve to at most 200 observations (take the first 200 if longer).
  3. Zero-mean normalize both columns over the window: time -= time.mean(), mag -= mag.mean()
  4. Pad shorter light curves to exactly 200 positions: append zeros to both input and times.
  5. Build the mask: set mask_in = 1 for real observations, mask_in = 0 for padded positions.
  6. Reshape each tensor to [batch, 200, 1] (add trailing dimension).

The sequence length is fixed at 200 by the pretrained weights.
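The steps above can be sketched as a single numpy function (the helper name and signature are ours, not from the upstream code):

```python
import numpy as np

def preprocess(time, mag, max_len=200):
    """Turn one raw light curve into the three model inputs.

    time, mag: 1-D sequences of equal length (time in days, need not be absolute MJD).
    Returns (input, times, mask_in), each shaped [1, max_len, 1], float32.
    """
    time = np.asarray(time, dtype=np.float32)[:max_len]   # step 2: truncate to first 200
    mag = np.asarray(mag, dtype=np.float32)[:max_len]
    time = time - time.mean()                             # step 3: zero-mean both columns
    mag = mag - mag.mean()

    n = len(mag)
    pad = max_len - n
    mag = np.pad(mag, (0, pad))                           # step 4: pad with zeros
    time = np.pad(time, (0, pad))
    mask = np.concatenate([np.ones(n), np.zeros(pad)])    # step 5: 1 = real, 0 = padded

    # Step 6: add batch and trailing feature dimensions -> [1, max_len, 1].
    to_tensor = lambda a: a.astype(np.float32).reshape(1, max_len, 1)
    return to_tensor(mag), to_tensor(time), to_tensor(mask)

x, t, m = preprocess([50001.0, 50003.5, 50010.2], [19.2, 19.5, 19.1])
```

Note that the mean is computed after truncation, so it is the mean of the window actually fed to the model, not of the whole light curve.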

Weights

Source: Zenodo record 18207945
Training dataset: MACHO R-band light curves
Checkpoint: pt_macho_v1_2021.zip

The test-data parquet file was generated with these MACHO weights using truncation to the first 200 observations.
