---
tags:
- ml-intern
---
# Symbolic Regression for Wind Speed Forecasting (EQL)

This repository contains a **TensorFlow 2.x** reproduction of the paper:

> **Symbolic regression for scientific discovery: an application to wind speed forecasting**  
> Ismail Alaoui Abdellaoui, Siamak Mehrkanoon  
> arXiv:2102.10570

## What was reproduced

- Full **Equation Learner (EQL)** architecture with the original activation set:
  - `Constant`, `Identity`, `Square`, `Sin`, `Sigmoid`, `Product`
- **Two-phase training**:
  1. Phase 1: sparsity-inducing smooth `L_{0.5}` regularization combined with a rescaled MSE loss
  2. Phase 2: masked fine-tuning (freeze the weights driven to zero, re-optimize the remaining ones)
- Denmark hourly weather dataset (5 cities, 4 features, 4 lags, 6-hour-ahead prediction)
- Pretty-printing of discovered analytical formulas via SymPy
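
The two training phases listed above can be illustrated with a small, dependency-free sketch. It is not taken from `reproduce_eql.py`: the piecewise form of the smoothed penalty is one commonly used in the EQL literature, and the names `smooth_l05`, `build_mask`, `masked_sgd_step`, the smoothing width `a`, and the pruning threshold are all assumptions made for illustration.

```python
import math

def smooth_l05(w, a=0.01):
    """Smoothed L0.5 penalty for one weight (assumed form, smoothing width `a`)."""
    if abs(w) >= a:
        return math.sqrt(abs(w))
    # Polynomial patch near zero so the gradient stays bounded at w = 0.
    inner = -w**4 / (8 * a**3) + 3 * w**2 / (4 * a) + 3 * a / 8
    return math.sqrt(inner)

def penalty(weights, a=0.01):
    """Phase 1: sum of the smoothed penalty over all weights."""
    return sum(smooth_l05(w, a) for w in weights)

def build_mask(weights, threshold=0.01):
    """Between phases: 1.0 keeps a weight, 0.0 freezes it at zero (threshold is assumed)."""
    return [1.0 if abs(w) >= threshold else 0.0 for w in weights]

def masked_sgd_step(weights, grads, mask, lr=0.1):
    """Phase 2: plain gradient step, then re-apply the mask so pruned weights stay zero."""
    return [(w - lr * g) * m for w, g, m in zip(weights, grads, mask)]

weights = [0.5, 0.001, -0.2, 0.0]
mask = build_mask(weights)
pruned = [w * m for w, m in zip(weights, mask)]
updated = masked_sgd_step(pruned, [1.0, 1.0, 1.0, 1.0], mask)
```

In this sketch the mask is applied after every update, which is one simple way to keep pruned connections at exactly zero during fine-tuning.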

## Repository structure

| File | Description |
|------|-------------|
| `reproduce_eql.py` | Core library: network, functions, pretty-print, regularization, utils |
| `run_full_experiment.py` | End-to-end training script with paper hyperparameters |
| `prepare_denmark_data.py` | Generates `.mat` inputs from raw weather CSV |
| `requirements.txt` | Dependencies |

## Dataset

The Denmark weather data (hourly, 1980–2018) is available as a Hugging Face dataset:

🔗 https://huggingface.co/datasets/Mengqinxue/eql-wind-speed-denmark
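
To make the "4 lags, 6-hour-ahead" setup concrete, here is a minimal sketch of lagged windowing on a single hourly series. The function name `make_windows` and the exact alignment of lags and target are assumptions; `prepare_denmark_data.py` combines several cities and features and may lay out its inputs differently.

```python
def make_windows(series, n_lags=4, steps_ahead=6):
    """Pair each run of `n_lags` consecutive values with the value
    `steps_ahead` steps after the last lag (assumed alignment)."""
    inputs, targets = [], []
    last_start = len(series) - n_lags - steps_ahead
    for start in range(last_start + 1):
        window = series[start:start + n_lags]
        target_index = start + n_lags - 1 + steps_ahead
        inputs.append(window)
        targets.append(series[target_index])
    return inputs, targets

# Toy hourly series: the value at hour t is simply t.
hourly = list(range(12))
X, y = make_windows(hourly)
```

With 12 hourly values this yields three samples, the first using hours 0 to 3 to predict hour 9.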

## Quick start

```bash
pip install -r requirements.txt
python run_full_experiment.py --city Roskilde --steps_ahead 6 --feature wind_speed
```

Supported cities: `Esbjerg`, `Odense`, `Roskilde`.

## Results

The script produces:
- `ExperimentsSR/Experiment*/` — training logs, plots, weight histograms
- `summary_experiment*.txt` — final MAE, MSE, extracted formula
- `.hdf5` weight checkpoints for Phase 1 and Phase 2
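
The extracted formula is readable because a sparse network has few non-zero weights left. The toy sketch below shows the idea for a single unit, using three of the unary functions from the activation set; the helper names and the string templates are hypothetical and are not the repository's pretty-printing code, which relies on SymPy.

```python
import math

# Unary units mapped to (function, string template); a subset of the activation set.
UNARY = {
    "Identity": (lambda z: z, "{z}"),
    "Square": (lambda z: z * z, "({z})**2"),
    "Sin": (math.sin, "sin({z})"),
}

def linear_expr(weights, names):
    """Render sum_i w_i * x_i as text, skipping weights that are exactly zero."""
    terms = [f"{w:g}*{n}" for w, n in zip(weights, names) if w != 0.0]
    return " + ".join(terms) if terms else "0"

def unit_expr(unit, weights, names):
    """Formula string for one unit applied to a sparse linear combination."""
    _, template = UNARY[unit]
    return template.format(z=linear_expr(weights, names))

def unit_value(unit, weights, xs):
    """Numeric value of the same unit, for checking the string against the network."""
    fn, _ = UNARY[unit]
    return fn(sum(w * x for w, x in zip(weights, xs)))

formula = unit_expr("Sin", [0.5, 0.0, -2.0], ["x1", "x2", "x3"])
value = unit_value("Square", [1.0, 0.0, 1.0], [2.0, 5.0, 1.0])
```

Strings of this shape can be handed to `sympy.sympify` for simplification, which is roughly the role SymPy plays in the pipeline described above.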

## Citation

```bibtex
@article{abdellaoui2021symbolic,
  title={Symbolic regression for scientific discovery: an application to wind speed forecasting},
  author={Abdellaoui, Ismail Alaoui and Mehrkanoon, Siamak},
  journal={arXiv preprint arXiv:2102.10570},
  year={2021}
}
```

<!-- ml-intern-provenance -->
## Generated by ML Intern

This model repository was generated by [ML Intern](https://github.com/huggingface/ml-intern), an agent for machine learning research and development on the Hugging Face Hub.

- Try ML Intern: https://smolagents-ml-intern.hf.space
- Source code: https://github.com/huggingface/ml-intern

## Usage

This repository contains TensorFlow 2.x scripts rather than a `transformers` checkpoint, so the `AutoModel` classes do not apply. To fetch the files, use `huggingface_hub`, then run the scripts as shown in Quick start:

```python
from huggingface_hub import snapshot_download

repo_id = 'Mengqinxue/eql-wind-speed-forecasting'
# Download the repository files into the local Hugging Face cache.
local_dir = snapshot_download(repo_id=repo_id)
print(local_dir)
```