Humanlearning committed on
Commit 0e95d4f · 1 Parent(s): f3080d1

feat: integrate Trackio for experiment tracking, add GRPO training support, and deploy web-based monitoring tools

.agents/skills/openenv-cli/SKILL.md ADDED
@@ -0,0 +1,18 @@
1
+ ---
2
+ name: openenv-cli
3
+ description: "OpenEnv CLI (`openenv`) for scaffolding, validating, building, and pushing OpenEnv environments."
4
+ ---
5
+
6
+ Install: `pip install openenv-core`
7
+
8
+ The OpenEnv CLI command `openenv` is available.
9
+ Use `openenv --help` to view available commands.
10
+
11
+ Generated with `openenv-core v0.2.3`. Run `openenv skills add --force` to regenerate.
12
+
13
+ ## Tips
14
+
15
+ - Start with `openenv init <env_name>` to scaffold a new environment
16
+ - Validate projects with `openenv validate`
17
+ - Build and deploy with `openenv build` and `openenv push`
18
+ - Use `openenv <command> --help` for command-specific options
.agents/skills/trackio/SKILL.md ADDED
@@ -0,0 +1,124 @@
1
+ ---
2
+ name: hugging-face-trackio
3
+ description: Track and visualize ML training experiments with Trackio. Use when logging metrics during training (Python API), firing alerts for training diagnostics, or retrieving/analyzing logged metrics (CLI). Supports real-time dashboard visualization, alerts with webhooks, HF Space syncing, and JSON output for automation.
4
+ ---
5
+
6
+ # Trackio - Experiment Tracking for ML Training
7
+
8
+ Trackio is an experiment tracking library for logging and visualizing ML training metrics. It stores data locally by default and can sync to a Hugging Face Space for a persistent, real-time monitoring dashboard.
9
+
10
+ ## Interfaces
11
+
12
+ | Task | Interface | Reference |
13
+ |------|-----------|-----------|
14
+ | **Logging metrics** during training | Python API | [logging_metrics.md](logging_metrics.md) |
15
+ | **Firing alerts** for training diagnostics | Python API | [alerts.md](alerts.md) |
16
+ | **Retrieving metrics & alerts** after/during training | CLI | [retrieving_metrics.md](retrieving_metrics.md) |
17
+ | **Inspecting storage schema and running direct SQL** | CLI | [storage_schema.md](storage_schema.md) |
18
+
19
+ ## When to Use Each
20
+
21
+ ### Python API → Logging
22
+
23
+ Use `import trackio` in your training scripts to log metrics:
24
+
25
+ - Initialize tracking with `trackio.init()`
26
+ - Log metrics with `trackio.log()` or use TRL's `report_to="trackio"`
27
+ - Finalize with `trackio.finish()`
28
+
29
+ **Key concept**: For remote/cloud training, pass `space_id` — metrics sync to a Space dashboard so they persist after the instance terminates.
30
+
31
+ → See [logging_metrics.md](logging_metrics.md) for setup, TRL integration, and configuration options.
32
+
33
+ ### Python API → Alerts
34
+
35
+ Insert `trackio.alert()` calls in training code to flag important events — like inserting print statements for debugging, but structured and queryable:
36
+
37
+ - `trackio.alert(title="...", level=trackio.AlertLevel.WARN)` — fire an alert
38
+ - Three severity levels: `INFO`, `WARN`, `ERROR`
39
+ - Alerts are printed to terminal, stored in the database, shown in the dashboard, and optionally sent to webhooks (Slack/Discord)
40
+
41
+ **Key concept for LLM agents**: Alerts are the primary mechanism for autonomous experiment iteration. An agent should insert alerts into training code for diagnostic conditions (loss spikes, NaN gradients, low accuracy, training stalls). Since alerts are printed to the terminal, an agent that is watching the training script's output will see them automatically. For background or detached runs, the agent can poll via CLI instead.
42
+
43
+ → See [alerts.md](alerts.md) for the full alerts API, webhook setup, and autonomous agent workflows.
44
+
45
+ ### CLI → Retrieving
46
+
47
+ Use the `trackio` command to query logged metrics and alerts:
48
+
49
+ - `trackio list projects/runs/metrics` — discover what's available
50
+ - `trackio get project/run/metric` — retrieve summaries and values
51
+ - `trackio query project --project <name> --sql "SELECT ..."` — run catch-all read-only SQL
52
+ - `trackio list alerts --project <name> --json` — retrieve alerts
53
+ - `trackio show` — launch the dashboard
54
+ - `trackio sync` — sync to HF Space
55
+
56
+ **Key concept**: Add `--json` for programmatic output suitable for automation and LLM agents.
57
+
58
+ **Remote Spaces**: Add `--space <space_id_or_url>` to any `list`/`get`/`query` command to query a remote HF Space instead of local data. Use `--hf-token` for private Spaces.
59
+
60
+ → See [retrieving_metrics.md](retrieving_metrics.md) for all commands, workflows, and JSON output formats.
61
+ → See [storage_schema.md](storage_schema.md) for SQLite tables, parquet layout, and direct query examples.
62
+
63
+ ## Minimal Logging Setup
64
+
65
+ ```python
66
+ import trackio
67
+
68
+ trackio.init(project="my-project", space_id="username/trackio")
69
+ trackio.log({"loss": 0.1, "accuracy": 0.9})
70
+ trackio.log({"loss": 0.09, "accuracy": 0.91})
71
+ trackio.finish()
72
+ ```
73
+
74
+ ## Minimal Retrieval
75
+
76
+ ```bash
77
+ trackio list projects --json
78
+ trackio get metric --project my-project --run my-run --metric loss --json
79
+ trackio query project --project my-project --sql "SELECT name FROM sqlite_master WHERE type = 'table'" --json
80
+
81
+ # Query a remote Space
82
+ trackio list projects --space username/my-space --json
83
+ ```
84
+
85
+ ## Autonomous ML Experiment Workflow
86
+
87
+ When running experiments autonomously as an LLM agent, the recommended workflow is:
88
+
89
+ 1. **Set up training with alerts** — insert `trackio.alert()` calls for diagnostic conditions
90
+ 2. **Launch training** — run the script in the background
91
+ 3. **Poll for alerts** — use `trackio list alerts --project <name> --json --since <timestamp>` to check for new alerts
92
+ 4. **Read metrics** — use `trackio get metric ...` to inspect specific values
93
+ 5. **Iterate** — based on alerts and metrics, stop the run, adjust hyperparameters, and launch a new run
94
+
95
+ ```python
+ import trackio
+
+ # num_steps and train_step() are placeholders for your own training loop.
+ trackio.init(project="my-project", config={"lr": 1e-4})
+
+ for step in range(num_steps):
+     loss = train_step()
+     trackio.log({"loss": loss, "step": step})
+
+     # Loss is still high well into training: likely divergence.
+     if step > 100 and loss > 5.0:
+         trackio.alert(
+             title="Loss divergence",
+             text=f"Loss {loss:.4f} still high after {step} steps",
+             level=trackio.AlertLevel.ERROR,
+         )
+     # Loss is suspiciously close to zero.
+     if step > 0 and abs(loss) < 1e-8:
+         trackio.alert(
+             title="Vanishing loss",
+             text="Loss near zero: possible gradient collapse",
+             level=trackio.AlertLevel.WARN,
+         )
+
+ trackio.finish()
+ ```
119
+
120
+ Then poll from a separate terminal/process:
121
+
122
+ ```bash
123
+ trackio list alerts --project my-project --json --since "2025-01-01T00:00:00"
124
+ ```
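As a rough sketch of the polling step, the helper below assembles the `trackio list alerts` command shown above. The name `build_alerts_command` is hypothetical, and only flags that already appear in this document are used. Running the command (for example with `subprocess.run`) is deliberately left out.

```python
from typing import List, Optional


def build_alerts_command(project: str, since: Optional[str] = None) -> List[str]:
    """Assemble argv for `trackio list alerts` (hypothetical helper)."""
    cmd = ["trackio", "list", "alerts", "--project", project, "--json"]
    if since is not None:
        # Only ask for alerts newer than the given ISO timestamp.
        cmd += ["--since", since]
    return cmd


command = build_alerts_command("my-project", since="2025-01-01T00:00:00")
print(" ".join(command))
```

Building the arguments as a list rather than a single string avoids shell quoting problems when project names contain spaces.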
.agents/skills/trackio/alerts.md ADDED
@@ -0,0 +1,199 @@
1
+ # Trackio Alerts
2
+
3
+ Alerts let you flag important training events directly from code. They are the primary mechanism for LLM agents to diagnose runs and iterate autonomously on ML experiments.
4
+
5
+ Alerts are printed to the terminal, stored in the database, displayed in the dashboard, and optionally sent to webhooks (Slack/Discord).
6
+
7
+ <img width="2972" height="1694" alt="image" src="https://github.com/user-attachments/assets/02d938f8-51a9-4706-85c4-d95b7645bcf4" />
8
+
9
+
10
+ ## Core API
11
+
12
+ ### trackio.alert()
13
+
14
+ ```python
15
+ trackio.alert(
16
+ title="Loss divergence", # Short title (required)
17
+ text="Loss 5.2 still high after 200 steps", # Detailed description (optional)
18
+ level=trackio.AlertLevel.WARN, # INFO, WARN, or ERROR (default: WARN)
19
+ webhook_url="https://hooks.slack.com/...", # Per-alert webhook override (optional)
20
+ )
21
+ ```
22
+
23
+ ### Alert Levels
24
+
25
+ | Level | Usage |
26
+ |-------|-------|
27
+ | `trackio.AlertLevel.INFO` | Informational milestones (checkpoints saved, eval completed) |
28
+ | `trackio.AlertLevel.WARN` | Potential issues (loss plateau, low accuracy, high gradient norm) |
29
+ | `trackio.AlertLevel.ERROR` | Critical failures (NaN loss, divergence, OOM) |
30
+
31
+ ### Webhook Support
32
+
33
+ Set a global webhook URL via `trackio.init()` or the `TRACKIO_WEBHOOK_URL` environment variable. Alerts are auto-formatted for Slack and Discord URLs.
34
+
35
+ ```python
36
+ trackio.init(
37
+ project="my-project",
38
+ webhook_url="https://hooks.slack.com/services/...",
39
+ webhook_min_level=trackio.AlertLevel.WARN, # Only send WARN+ to webhook
40
+ )
41
+ ```
42
+
43
+ Per-alert override:
44
+
45
+ ```python
46
+ trackio.alert(
47
+ title="Critical failure",
48
+ level=trackio.AlertLevel.ERROR,
49
+ webhook_url="https://hooks.slack.com/services/...", # Overrides global URL
50
+ )
51
+ ```
52
+
53
+ Environment variables:
54
+ - `TRACKIO_WEBHOOK_URL` — global webhook URL
55
+ - `TRACKIO_WEBHOOK_MIN_LEVEL` — minimum level for webhook delivery (`info`, `warn`, `error`)
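To make the minimum-level filter concrete, here is a small sketch of the comparison it implies, assuming the levels are ordered `info` < `warn` < `error`. The `should_send` helper and `LEVEL_ORDER` table are hypothetical illustrations, not Trackio's implementation.

```python
# Assumed severity ordering: info < warn < error.
LEVEL_ORDER = {"info": 0, "warn": 1, "error": 2}


def should_send(level: str, min_level: str = "warn") -> bool:
    """Return True if an alert at `level` meets the webhook threshold."""
    return LEVEL_ORDER[level.lower()] >= LEVEL_ORDER[min_level.lower()]


for level in ("info", "warn", "error"):
    print(level, should_send(level, min_level="warn"))
```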
56
+
57
+ ## Retrieving Alerts (CLI)
58
+
59
+ ```bash
60
+ # List all alerts for a project
61
+ trackio list alerts --project my-project --json
62
+
63
+ # Filter by run or level
64
+ trackio list alerts --project my-project --run my-run --level error --json
65
+
66
+ # Poll for new alerts since a timestamp (efficient for agents)
67
+ trackio list alerts --project my-project --json --since "2025-06-01T12:00:00"
68
+ ```
69
+
70
+ ### JSON Output Structure
71
+
72
+ ```json
73
+ {
74
+ "project": "my-project",
75
+ "run": null,
76
+ "level": null,
77
+ "since": "2025-06-01T12:00:00",
78
+ "alerts": [
79
+ {
80
+ "run": "run-name",
81
+ "title": "Loss divergence",
82
+ "text": "Loss 5.2 still high after 200 steps",
83
+ "level": "warn",
84
+ "step": 200,
85
+ "timestamp": "2025-06-01T12:05:30"
86
+ }
87
+ ]
88
+ }
89
+ ```
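A consumer of this JSON might look roughly like the sketch below, which groups alerts by level and derives the next `--since` value. The sample payload is copied from the structure above; the helper names `group_by_level` and `latest_timestamp` are hypothetical.

```python
import json
from collections import defaultdict

# Sample payload with the same shape as the structure documented above.
payload = json.loads("""
{
  "project": "my-project",
  "run": null,
  "level": null,
  "since": "2025-06-01T12:00:00",
  "alerts": [
    {
      "run": "run-name",
      "title": "Loss divergence",
      "text": "Loss 5.2 still high after 200 steps",
      "level": "warn",
      "step": 200,
      "timestamp": "2025-06-01T12:05:30"
    }
  ]
}
""")


def group_by_level(data: dict) -> dict:
    """Bucket alerts by their `level` field."""
    grouped = defaultdict(list)
    for alert in data["alerts"]:
        grouped[alert["level"]].append(alert)
    return dict(grouped)


def latest_timestamp(data: dict, fallback: str) -> str:
    """Newest alert timestamp, or `fallback` when there are no alerts.

    Plain string comparison is enough here because the timestamps share
    one fixed-width ISO format.
    """
    return max((a["timestamp"] for a in data["alerts"]), default=fallback)


print(group_by_level(payload).keys())
print(latest_timestamp(payload, payload["since"]))
```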
90
+
91
+ ## Autonomous Agent Workflow
92
+
93
+ The recommended pattern for an LLM agent running ML experiments:
94
+
95
+ ### 1. Insert Alerts Into Training Code
96
+
97
+ Add diagnostic `trackio.alert()` calls for conditions the agent should react to:
98
+
99
+ ```python
+ import math
+
+ import trackio
+
+ # lr, bs, num_steps, and train_step() are placeholders for your own setup.
+ trackio.init(project="hyperparam-sweep", config={"lr": lr, "batch_size": bs})
+
+ losses = []
+ for step in range(num_steps):
+     loss = train_step()
+     trackio.log({"loss": loss, "step": step})
+     losses.append(loss)
+
+     if step > 200 and loss > 5.0:
+         trackio.alert(
+             title="Loss divergence",
+             text=f"Loss {loss:.4f} still above 5.0 after {step} steps; learning rate may be too high",
+             level=trackio.AlertLevel.ERROR,
+         )
+
+     # Absolute change in loss over the last 100 steps.
+     loss_delta = abs(losses[-1] - losses[-101]) if len(losses) > 100 else float("inf")
+     if step > 500 and loss_delta < 0.001:
+         trackio.alert(
+             title="Training stall",
+             text=f"Loss barely changed over last 100 steps (delta={loss_delta:.6f})",
+             level=trackio.AlertLevel.WARN,
+         )
+
+     if math.isnan(loss):
+         trackio.alert(
+             title="NaN loss",
+             text="Loss became NaN: training is broken",
+             level=trackio.AlertLevel.ERROR,
+         )
+         break
+
+ trackio.finish()
+ ```
132
+
133
+ ### 2. Monitor Alerts
134
+
135
+ Alerts are automatically printed to the terminal when fired. If the agent is watching the training script's output (e.g. running in the foreground or tailing logs), it will see alerts immediately — no polling needed.
136
+
137
+ For background or detached runs, poll for alerts via CLI:
138
+
139
+ ```bash
140
+ # Poll for alerts (run periodically)
141
+ trackio list alerts --project hyperparam-sweep --json --since "2025-06-01T00:00:00"
142
+ ```
143
+
144
+ ### 3. Inspect Metrics Around the Alert
145
+
146
+ When an alert fires, use `trackio get snapshot` to see all metrics at that point:
147
+
148
+ ```bash
149
+ # Alert fired at step 200 — get all metrics in a ±5 step window
150
+ trackio get snapshot --project hyperparam-sweep --run run-1 --around 200 --window 5 --json
151
+
152
+ # Or inspect a single metric in a ±10 step window around the alert's step
153
+ trackio get metric --project hyperparam-sweep --run run-1 --metric loss --around 200 --window 10 --json
154
+ ```
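An agent could derive the snapshot command directly from an alert record, along the lines of this sketch. It assumes the alert is a dict with the `run` and `step` fields shown in the JSON output structure; `snapshot_command` is a hypothetical helper name.

```python
from typing import List


def snapshot_command(project: str, alert: dict, window: int = 5) -> List[str]:
    """Build argv for `trackio get snapshot` centered on the alert's step."""
    return [
        "trackio", "get", "snapshot",
        "--project", project,
        "--run", alert["run"],
        "--around", str(alert["step"]),
        "--window", str(window),
        "--json",
    ]


alert = {"run": "run-1", "step": 200, "level": "error", "title": "Loss divergence"}
print(" ".join(snapshot_command("hyperparam-sweep", alert)))
```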
155
+
156
+ ### 4. React and Iterate
157
+
158
+ Based on alerts:
159
+ - **ERROR alerts** → stop the run, adjust hyperparameters, relaunch
160
+ - **WARN alerts** → inspect metrics with `trackio get snapshot ...`, decide whether to intervene
161
+ - **INFO alerts** → note progress, continue monitoring
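The triage rules above can be sketched as a small decision function. This is only an illustration: the action names and the `decide` helper are hypothetical, and the alerts are assumed to be dicts with a lowercase `level` field as in the CLI's JSON output.

```python
from typing import List


def decide(alerts: List[dict]) -> str:
    """Map a batch of alerts to the most urgent action."""
    levels = {alert["level"] for alert in alerts}
    if "error" in levels:
        return "stop_and_relaunch"   # stop the run, adjust hyperparameters
    if "warn" in levels:
        return "inspect_metrics"     # e.g. with `trackio get snapshot ...`
    return "continue"                # INFO only, or no alerts at all


print(decide([{"level": "info"}, {"level": "warn"}]))
```

Checking for the most severe level first means a single ERROR alert wins even when the same poll also returns warnings.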
162
+
163
+ ### 5. Compare Across Runs
164
+
165
+ ```bash
166
+ # Check metrics from previous runs
167
+ trackio get run --project hyperparam-sweep --run run-1 --json
168
+ trackio get metric --project hyperparam-sweep --run run-1 --metric loss --json
169
+
170
+ # Launch new run with adjusted config
171
+ python train.py --lr 5e-5
172
+ ```
173
+
174
+ ## Using Alerts with Transformers / TRL
175
+
176
+ When using `report_to="trackio"`, you don't control the training loop directly. Use a `TrainerCallback` to fire alerts:
177
+
178
+ ```python
+ import trackio
+ from transformers import TrainerCallback
+ from trl import SFTConfig, SFTTrainer
+
+ class AlertCallback(TrainerCallback):
+     def on_log(self, args, state, control, logs=None, **kwargs):
+         # Only fire alerts when the run reports to Trackio.
+         if "trackio" not in args.report_to:
+             return
+         if logs and "loss" in logs:
+             if logs["loss"] > 5.0 and state.global_step > 100:
+                 trackio.alert(
+                     title="High loss",
+                     text=f"Loss {logs['loss']:.4f} at step {state.global_step}",
+                     level=trackio.AlertLevel.ERROR,
+                 )
+
+ trainer = SFTTrainer(
+     model=model,  # placeholder: your model or model ID
+     args=SFTConfig(output_dir="./out", report_to="trackio"),
+     callbacks=[AlertCallback()],
+     # ... other trainer arguments
+ )
+ ```
.agents/skills/trackio/logging_metrics.md ADDED
@@ -0,0 +1,206 @@
1
+ # Logging Metrics with Trackio
2
+
3
+ **Trackio** is a lightweight, free experiment tracking library from Hugging Face. It provides a wandb-compatible API for logging metrics with local-first design.
4
+
5
+ - **GitHub**: [gradio-app/trackio](https://github.com/gradio-app/trackio)
6
+ - **Docs**: [huggingface.co/docs/trackio](https://huggingface.co/docs/trackio/index)
7
+
8
+ ## Installation
9
+
10
+ ```bash
11
+ pip install trackio
12
+ # or
13
+ uv pip install trackio
14
+ ```
15
+
16
+ ## Core API
17
+
18
+ ### Basic Usage
19
+
20
+ ```python
+ import trackio
+
+ # Initialize a run
+ trackio.init(
+     project="my-project",
+     config={"learning_rate": 0.001, "epochs": 10}
+ )
+
+ # Log metrics during training (train_epoch() is a placeholder for your own code)
+ for epoch in range(10):
+     loss = train_epoch()
+     trackio.log({"loss": loss, "epoch": epoch})
+
+ # Finalize the run
+ trackio.finish()
+ ```
37
+
38
+ ### Key Functions
39
+
40
+ | Function | Purpose |
41
+ |----------|---------|
42
+ | `trackio.init(...)` | Start a new tracking run |
43
+ | `trackio.log(dict)` | Log metrics (called repeatedly during training) |
44
+ | `trackio.finish()` | Finalize run and ensure all metrics are saved |
45
+ | `trackio.show()` | Launch the local dashboard |
46
+ | `trackio.sync(...)` | Sync local project to HF Space |
47
+
48
+ ## trackio.init() Parameters
49
+
50
+ ```python
51
+ trackio.init(
52
+ project="my-project", # Project name (groups runs together)
53
+ name="run-name", # Optional: name for this specific run
54
+ config={...}, # Hyperparameters and config to log
55
+ space_id="username/trackio", # Optional: sync to HF Space for remote dashboard
56
+ group="experiment-group", # Optional: group related runs
57
+ )
58
+ ```
59
+
60
+ ## Local vs Remote Dashboard
61
+
62
+ ### Local (Default)
63
+
64
+ By default, trackio stores metrics in a local SQLite database and runs the dashboard locally:
65
+
66
+ ```python
67
+ trackio.init(project="my-project")
68
+ # ... training ...
69
+ trackio.finish()
70
+
71
+ # Launch local dashboard
72
+ trackio.show()
73
+ ```
74
+
75
+ Or from terminal:
76
+ ```bash
77
+ trackio show --project my-project
78
+ ```
79
+
80
+ ### Remote (HF Space)
81
+
82
+ Pass `space_id` to sync metrics to a Hugging Face Space for persistent, shareable dashboards:
83
+
84
+ ```python
85
+ trackio.init(
86
+ project="my-project",
87
+ space_id="username/trackio" # Auto-creates Space if it doesn't exist
88
+ )
89
+ ```
90
+
91
+ ⚠️ **For remote training** (cloud GPUs, HF Jobs, etc.): Always use `space_id` since local storage is lost when the instance terminates.
92
+
93
+ ### Sync Local to Remote
94
+
95
+ Sync existing local projects to a Space:
96
+
97
+ ```python
98
+ trackio.sync(project="my-project", space_id="username/my-experiments")
99
+ ```
100
+
101
+ ## wandb Compatibility
102
+
103
+ Trackio's core API (`init`, `log`, `finish`) mirrors wandb's, so it can act as a drop-in replacement for basic logging:
104
+
105
+ ```python
106
+ import trackio as wandb
107
+
108
+ wandb.init(project="my-project")
109
+ wandb.log({"loss": 0.5})
110
+ wandb.finish()
111
+ ```
112
+
113
+ ## TRL Integration
114
+
115
+ When using TRL trainers, set `report_to="trackio"` for automatic metric logging:
116
+
117
+ ```python
118
+ from trl import SFTConfig, SFTTrainer
119
+ import trackio
120
+
121
+ trackio.init(
122
+ project="sft-training",
123
+ space_id="username/trackio",
124
+ config={"model": "Qwen/Qwen2.5-0.5B", "dataset": "trl-lib/Capybara"}
125
+ )
126
+
127
+ config = SFTConfig(
128
+ output_dir="./output",
129
+ report_to="trackio", # Automatic metric logging
130
+ # ... other config
131
+ )
132
+
133
+ trainer = SFTTrainer(model=model, args=config)  # ... plus other trainer arguments
134
+ trainer.train()
135
+ trackio.finish()
136
+ ```
137
+
138
+ ## What Gets Logged
139
+
140
+ With TRL/Transformers integration, trackio automatically captures:
141
+ - Training loss
142
+ - Learning rate
143
+ - Eval metrics
144
+ - Training throughput
145
+
146
+ For manual logging, log any numeric metrics:
147
+
148
+ ```python
149
+ trackio.log({
150
+ "train_loss": 0.5,
151
+ "train_accuracy": 0.85,
152
+ "val_loss": 0.4,
153
+ "val_accuracy": 0.88,
154
+ "epoch": 1
155
+ })
156
+ ```
157
+
158
+ ## Grouping Runs
159
+
160
+ Use `group` to organize related experiments in the dashboard sidebar:
161
+
162
+ ```python
163
+ # Group by experiment type
164
+ trackio.init(project="my-project", name="baseline-v1", group="baseline")
165
+ trackio.init(project="my-project", name="augmented-v1", group="augmented")
166
+
167
+ # Group by hyperparameter
168
+ trackio.init(project="hyperparam-sweep", name="lr-0.001", group="lr_0.001")
169
+ trackio.init(project="hyperparam-sweep", name="lr-0.01", group="lr_0.01")
170
+ ```
171
+
172
+ ## Configuration Best Practices
173
+
174
+ Keep config minimal — only log what's useful for comparing runs:
175
+
176
+ ```python
177
+ trackio.init(
178
+ project="qwen-sft-capybara",
179
+ name="baseline-lr2e5",
180
+ config={
181
+ "model": "Qwen/Qwen2.5-0.5B",
182
+ "dataset": "trl-lib/Capybara",
183
+ "learning_rate": 2e-5,
184
+ "num_epochs": 3,
185
+ "batch_size": 8,
186
+ }
187
+ )
188
+ ```
189
+
190
+ ## Embedding Dashboards
191
+
192
+ Embed Space dashboards in websites with query parameters:
193
+
194
+ ```html
195
+ <iframe
196
+ src="https://username-trackio.hf.space/?project=my-project&metrics=train_loss,val_loss&sidebar=hidden"
197
+ style="width:1600px; height:500px; border:0;">
198
+ </iframe>
199
+ ```
200
+
201
+ Query parameters:
202
+ - `project`: Filter to specific project
203
+ - `metrics`: Comma-separated metric names to show
204
+ - `sidebar`: `hidden` or `collapsed`
205
+ - `smoothing`: 0-20 (smoothing slider value)
206
+ - `xmin`, `xmax`: X-axis limits
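For generating embed URLs programmatically, a minimal sketch using only the standard library is shown below. The `dashboard_url` helper is hypothetical; the base URL and parameter values are the ones from the iframe example above.

```python
from urllib.parse import urlencode


def dashboard_url(base_url: str, **params: str) -> str:
    """Append query parameters to a Space URL, keeping commas readable."""
    # safe="," leaves the comma-separated metrics list unescaped.
    return f"{base_url}/?{urlencode(params, safe=',')}"


url = dashboard_url(
    "https://username-trackio.hf.space",
    project="my-project",
    metrics="train_loss,val_loss",
    sidebar="hidden",
)
print(url)
```

When the URL is placed in an HTML attribute, the `&` separators should additionally be HTML-escaped.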
.agents/skills/trackio/retrieving_metrics.md ADDED
@@ -0,0 +1,298 @@
1
+ # Retrieving Metrics with Trackio CLI
2
+
3
+ The `trackio` CLI provides direct terminal access to query Trackio experiment tracking data without needing to start the MCP server. Commands work against local data by default, or against a remote HF Space when `--space` is provided.
4
+
5
+ ## Quick Command Reference
6
+
7
+ | Task | Command |
8
+ |------|---------|
9
+ | List projects | `trackio list projects` |
10
+ | List runs | `trackio list runs --project <name>` |
11
+ | List metrics | `trackio list metrics --project <name> --run <name>` |
12
+ | List system metrics | `trackio list system-metrics --project <name> --run <name>` |
13
+ | List alerts | `trackio list alerts --project <name> [--run <name>] [--level <level>] [--since <timestamp>]` |
14
+ | Get project summary | `trackio get project --project <name>` |
15
+ | Get run summary | `trackio get run --project <name> --run <name>` |
16
+ | Get metric values | `trackio get metric --project <name> --run <name> --metric <name>` |
17
+ | Get metric at step | `trackio get metric ... --metric <name> --step <N>` |
18
+ | Get metric around step | `trackio get metric ... --metric <name> --around <N> --window <W>` |
19
+ | Get all metrics snapshot | `trackio get snapshot --project <name> --run <name> --step <N>` |
20
+ | Get system metrics | `trackio get system-metric --project <name> --run <name>` |
21
+ | Run direct SQL | `trackio query project --project <name> --sql "SELECT ..."` |
22
+ | Query remote Space | `trackio list projects --space <space_id_or_url>` |
23
+ | Show dashboard | `trackio show [--project <name>]` |
24
+ | Sync to Space | `trackio sync --project <name> --space-id <space_id>` |
25
+
26
+ ## Core Commands
27
+
28
+ ### List Commands
29
+
30
+ ```bash
31
+ trackio list projects # List all projects
32
+ trackio list projects --json # JSON output
33
+
34
+ trackio list runs --project <name> # List runs in project
35
+ trackio list runs --project <name> --json # JSON output
36
+
37
+ trackio list metrics --project <name> --run <name> # List metrics for run
38
+ trackio list metrics --project <name> --run <name> --json
39
+
40
+ trackio list system-metrics --project <name> --run <name> # List system metrics
41
+ trackio list system-metrics --project <name> --run <name> --json
42
+
43
+ trackio list alerts --project <name> # List alerts
44
+ trackio list alerts --project <name> --run <name> --json # Filter by run
45
+ trackio list alerts --project <name> --level error --json # Filter by level
46
+ trackio list alerts --project <name> --json --since <ts> # Poll since timestamp
47
+ ```
48
+
49
+ ### Get Commands
50
+
51
+ ```bash
52
+ trackio get project --project <name> # Project summary
53
+ trackio get project --project <name> --json # JSON output
54
+
55
+ trackio get run --project <name> --run <name> # Run summary
56
+ trackio get run --project <name> --run <name> --json
57
+
58
+ trackio get metric --project <name> --run <name> --metric <name> # Metric values
59
+ trackio get metric --project <name> --run <name> --metric <name> --json
60
+ trackio get metric ... --metric <name> --step 200 # At exact step
61
+ trackio get metric ... --metric <name> --around 200 --window 10 # ±10 steps
62
+ trackio get metric ... --metric <name> --at-time <ts> --window 60 # ±60 seconds
63
+
64
+ trackio get snapshot --project <name> --run <name> --step 200 --json # All metrics at step
65
+ trackio get snapshot --project <name> --run <name> --around 200 --window 5 --json # Window
66
+ trackio get snapshot --project <name> --run <name> --at-time <ts> --window 60 --json
67
+
68
+ trackio get system-metric --project <name> --run <name> # All system metrics
69
+ trackio get system-metric --project <name> --run <name> --metric <name> # Specific metric
70
+ trackio get system-metric --project <name> --run <name> --json
71
+ ```
72
+
73
+ ### Query Command
74
+
75
+ ```bash
76
+ trackio query project --project <name> --sql "SELECT name FROM sqlite_master WHERE type = 'table'"
77
+ trackio query project --project <name> --sql "PRAGMA table_info(metrics)" --json
78
+ trackio query project --project <name> --sql "SELECT run_name, MAX(step) AS last_step FROM metrics GROUP BY run_name"
79
+ ```
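To see what the aggregate query above returns, the sketch below runs it with Python's built-in `sqlite3` module against an in-memory stand-in table. The two-column `metrics` table here is an assumption made purely for illustration; the real Trackio table has more columns (see `storage_schema.md`). An `ORDER BY` is added only to make the output deterministic.

```python
import sqlite3

# In-memory stand-in for a project database (illustrative schema only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metrics (run_name TEXT, step INTEGER)")
conn.executemany(
    "INSERT INTO metrics VALUES (?, ?)",
    [("run-1", 0), ("run-1", 1), ("run-1", 2), ("run-2", 5)],
)

# Same aggregate as in the CLI example: last logged step per run.
rows = conn.execute(
    "SELECT run_name, MAX(step) AS last_step "
    "FROM metrics GROUP BY run_name ORDER BY run_name"
).fetchall()
conn.close()

for run_name, last_step in rows:
    print(run_name, last_step)
```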
80
+
81
+ ### Remote Space Queries
82
+
83
+ All `list`, `get`, and `query` commands support querying a remote HF Space with `--space`:
84
+
85
+ ```bash
86
+ trackio list projects --space user/my-space # Space ID
87
+ trackio list projects --space https://user-my-space.hf.space # Space URL
88
+ trackio get metric --project <name> --run <name> --metric loss --space user/my-space
89
+ trackio query project --project <name> --sql "SELECT COUNT(*) AS num_alerts FROM alerts" --space user/my-space
90
+ trackio list projects --space user/private-space --hf-token hf_xxx # Private Space
91
+ ```
92
+
93
+ ### Dashboard Commands
94
+
95
+ ```bash
96
+ trackio show # Launch dashboard
97
+ trackio show --project <name> # Load specific project
98
+ trackio show --theme <theme> # Custom theme
99
+ trackio show --mcp-server # Enable MCP server
100
+ trackio show --color-palette "#FF0000,#00FF00" # Custom colors
101
+ ```
102
+
103
+ ### Sync Commands
104
+
105
+ ```bash
106
+ trackio sync --project <name> --space-id <space_id> # Sync to HF Space
107
+ trackio sync --project <name> --space-id <space_id> --private # Private space
108
+ trackio sync --project <name> --space-id <space_id> --force # Overwrite
109
+ ```
110
+
111
+ ## Output Formats
112
+
113
+ All `list`, `get`, and `query` commands support two output formats:
114
+
115
+ - **Human-readable** (default): Formatted text for terminal viewing
116
+ - **JSON** (with `--json` flag): Structured JSON for programmatic use
117
+
118
+ ## Common Patterns
119
+
120
+ ### Discover Projects and Runs
121
+
122
+ ```bash
123
+ # List all available projects
124
+ trackio list projects
125
+
126
+ # List runs in a project
127
+ trackio list runs --project my-project
128
+
129
+ # Get project overview
130
+ trackio get project --project my-project --json
131
+ ```
132
+
133
+ ### Inspect Run Details
134
+
135
+ ```bash
136
+ # Get run summary with all metrics
137
+ trackio get run --project my-project --run my-run --json
138
+
139
+ # List available metrics
140
+ trackio list metrics --project my-project --run my-run
141
+
142
+ # Get specific metric values
143
+ trackio get metric --project my-project --run my-run --metric loss --json
144
+ ```
145
+
146
+ ### Query System Metrics
147
+
148
+ ```bash
149
+ # List system metrics (GPU, etc.)
150
+ trackio list system-metrics --project my-project --run my-run
151
+
152
+ # Get all system metric data
153
+ trackio get system-metric --project my-project --run my-run --json
154
+
155
+ # Get specific system metric
156
+ trackio get system-metric --project my-project --run my-run --metric gpu_utilization --json
157
+ ```
158
+
159
+ ### Automation Scripts
160
+
161
+ ```bash
162
+ # Extract latest metric value
163
+ LATEST_LOSS=$(trackio get metric --project my-project --run my-run --metric loss --json | jq -r '.values[-1].value')
164
+
165
+ # Export run summary to file
166
+ trackio get run --project my-project --run my-run --json > run_summary.json
167
+
168
+ # Filter runs with jq
169
+ trackio list runs --project my-project --json | jq '.runs[] | select(startswith("train"))'
170
+
171
+ # Run a direct SQL aggregate
172
+ trackio query project --project my-project --sql "SELECT run_name, MAX(step) AS last_step FROM metrics GROUP BY run_name" --json
173
+ ```
174
+
175
+ ### LLM Agent Workflow
176
+
177
+ ```bash
178
+ # 1. Discover available projects
179
+ trackio list projects --json
180
+
181
+ # 2. Explore project structure
182
+ trackio get project --project my-project --json
183
+
184
+ # 3. Inspect specific run
185
+ trackio get run --project my-project --run my-run --json
186
+
187
+ # 4. Query metric values
188
+ trackio get metric --project my-project --run my-run --metric accuracy --json
189
+
190
+ # 5. Poll for alerts (use --since for efficient incremental polling)
191
+ trackio list alerts --project my-project --json --since "2025-06-01T00:00:00"
192
+
193
+ # 6. When an alert fires at step N, get all metrics around that point
194
+ trackio get snapshot --project my-project --run my-run --around 200 --window 5 --json
195
+
196
+ # 7. Fall back to direct SQL for one-off inspection
197
+ trackio query project --project my-project --sql "SELECT timestamp, run_name, level, title FROM alerts ORDER BY timestamp DESC LIMIT 20" --json
198
+ ```
199
+
200
+ ## Error Handling
201
+
202
+ Commands validate inputs and return clear errors:
203
+
204
+ - Missing project: `Error: Project '<name>' not found.`
205
+ - Missing run: `Error: Run '<name>' not found in project '<project>'.`
206
+ - Missing metric: `Error: Metric '<name>' not found in run '<run>' of project '<project>'.`
207
+
208
+ All errors exit with non-zero status code and write to stderr.
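Because failures are signaled through the exit status and stderr, a wrapper script can treat any non-zero return code as an error. The sketch below shows one way to do that; `run_cli` is a hypothetical helper, and a short Python one-liner stands in for a failing `trackio` command so the example runs without Trackio installed.

```python
import subprocess
import sys
from typing import List


def run_cli(argv: List[str]) -> str:
    """Run a command; return stdout, or raise with the stderr message."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(result.stderr.strip())
    return result.stdout


# Stand-in for a failing CLI call: writes an error to stderr and exits 1.
script = "import sys; sys.stderr.write(\"Error: Project 'demo' not found.\"); sys.exit(1)"

try:
    run_cli([sys.executable, "-c", script])
    message = ""
except RuntimeError as exc:
    message = str(exc)

print(message)
```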
209
+
210
+ ## Key Options
211
+
212
+ - `--project`: Project name (required for most commands)
213
+ - `--run`: Run name (required for run-specific commands)
214
+ - `--metric`: Metric name (required for metric-specific commands)
215
+ - `--sql`: Read-only SQL query (for `trackio query`)
216
+ - `--json`: Output in JSON format instead of human-readable
217
+ - `--space`: HF Space ID (e.g. `user/space`) or Space URL to query remotely (for `list`/`get`/`query` commands)
218
+ - `--hf-token`: HF token for accessing private Spaces (for `list`/`get`/`query` commands with `--space`)
219
+ - `--step`: Exact step filter (for `get metric`, `get snapshot`)
220
+ - `--around`: Center step for window filter (for `get metric`, `get snapshot`)
221
+ - `--at-time`: Center ISO timestamp for window filter (for `get metric`, `get snapshot`)
222
+ - `--window`: Window size: ±steps for `--around`, ±seconds for `--at-time` (default: 10)
223
+ - `--level`: Alert level filter (`info`, `warn`, `error`) (for `list alerts`)
224
+ - `--since`: ISO timestamp to filter alerts after (for `list alerts`)
225
+ - `--theme`: Dashboard theme (for `show` command)
226
+ - `--mcp-server`: Enable MCP server mode (for `show` command)
227
+ - `--color-palette`: Comma-separated hex colors (for `show` command)
228
+ - `--private`: Create private Space (for `sync` command)
229
+ - `--force`: Overwrite existing database (for `sync` command)
230
+
231
+ ## JSON Output Structure
232
+
233
+ ### List Projects
234
+ ```json
235
+ {"projects": ["project1", "project2"]}
236
+ ```
237
+
238
+ ### List Runs
239
+ ```json
240
+ {"project": "my-project", "runs": ["run1", "run2"]}
241
+ ```
242
+
243
+ ### Project Summary
244
+ ```json
245
+ {
246
+ "project": "my-project",
247
+ "num_runs": 3,
248
+ "runs": ["run1", "run2", "run3"],
249
+ "last_activity": 100
250
+ }
251
+ ```
252
+
253
+ ### Run Summary
254
+ ```json
255
+ {
256
+ "project": "my-project",
257
+ "run": "my-run",
258
+ "num_logs": 50,
259
+ "metrics": ["loss", "accuracy"],
260
+ "config": {"learning_rate": 0.001},
261
+ "last_step": 49
262
+ }
263
+ ```
264
+
265
+ ### Metric Values
266
+ ```json
267
+ {
268
+ "project": "my-project",
269
+ "run": "my-run",
270
+ "metric": "loss",
271
+ "values": [
272
+ {"step": 0, "timestamp": "2024-01-01T00:00:00", "value": 0.5},
273
+ {"step": 1, "timestamp": "2024-01-01T00:01:00", "value": 0.4}
274
+ ]
275
+ }
276
+ ```
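A script that consumes this structure might look like the following minimal sketch. It parses the sample payload shown above, copied in as a string literal, and picks the point with the lowest value; in practice the text would come from the CLI's `--json` output.

```python
import json

# Sample payload copied from the "Metric Values" structure above.
payload = """
{
  "project": "my-project",
  "run": "my-run",
  "metric": "loss",
  "values": [
    {"step": 0, "timestamp": "2024-01-01T00:00:00", "value": 0.5},
    {"step": 1, "timestamp": "2024-01-01T00:01:00", "value": 0.4}
  ]
}
"""

data = json.loads(payload)

# Find the logged point with the smallest metric value.
best = min(data["values"], key=lambda point: point["value"])
print(f"{data['metric']} was lowest at step {best['step']}: {best['value']}")
```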
277
+
278
+ ### Query Result
279
+ ```json
280
+ {
281
+ "project": "my-project",
282
+ "query": "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name",
283
+ "columns": ["name"],
284
+ "rows": [
285
+ {"name": "alerts"},
286
+ {"name": "configs"},
287
+ {"name": "metrics"}
288
+ ],
289
+ "row_count": 3
290
+ }
291
+ ```
292
+
293
+ ## References
294
+
295
+ - **Complete CLI documentation**: See [docs/source/cli_commands.md](docs/source/cli_commands.md)
296
+ - **Storage schema and direct SQL**: See [storage_schema.md](storage_schema.md)
297
+ - **API and MCP Server**: See [docs/source/api_mcp_server.md](docs/source/api_mcp_server.md)
298
+
.agents/skills/trackio/storage_schema.md ADDED
@@ -0,0 +1,159 @@
1
+ # Trackio Storage Schema and Direct SQL
2
+
3
+ Use this reference when you need to inspect Trackio data directly instead of going through higher-level `trackio list` or `trackio get` commands.
4
+
5
+ ## Where Data Is Stored
6
+
7
+ - Local project databases live in `TRACKIO_DIR`, which defaults to `~/.cache/huggingface/trackio`.
8
+ - Each project is stored in its own SQLite file: `{project}.db`.
9
+ - Media files live under `TRACKIO_DIR/media/`.
10
+ - Parquet files are derived exports written from SQLite for syncing and static Spaces.
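The storage layout above can be sketched as a small path helper. This assumes `TRACKIO_DIR` is read from the environment with the documented default; the function names `trackio_dir` and `project_db_path` are hypothetical and not part of Trackio's API.

```python
import os
from pathlib import Path


def trackio_dir() -> Path:
    """Return the data directory, assuming TRACKIO_DIR is an environment override."""
    override = os.environ.get("TRACKIO_DIR")
    if override:
        return Path(override)
    return Path.home() / ".cache" / "huggingface" / "trackio"


def project_db_path(project: str) -> Path:
    """Return the expected SQLite file for a project: {project}.db."""
    return trackio_dir() / f"{project}.db"


print(project_db_path("my-project"))
```

The printed path depends on the current user's home directory and on whether `TRACKIO_DIR` is set.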
11
+
12
+ ## SQLite Tables
13
+
14
+ Trackio defines its live schema in `trackio/sqlite_storage.py` inside `SQLiteStorage.init_db()`.
15
+
16
+ ### `metrics`
17
+
18
+ - `id`: integer primary key
19
+ - `timestamp`: ISO timestamp
20
+ - `run_name`: run identifier
21
+ - `step`: integer step
22
+ - `metrics`: JSON text payload
23
+ - `log_id`: optional deduplication key
24
+ - `space_id`: optional pending-sync marker
25
+
26
+ Indexes:
27
+
28
+ - `(run_name, step)`
29
+ - `(run_name, timestamp)`
30
+ - unique partial index on `log_id`
31
+ - partial index on `space_id`
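To make the shape of the `metrics` table concrete, here is a minimal sketch that builds a stand-in table in an in-memory SQLite database and decodes the JSON payload column. The column types and the index name are assumptions inferred from the list above, not a copy of the schema in `SQLiteStorage.init_db()`.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Approximation of the documented columns; exact types are assumed.
conn.execute(
    """
    CREATE TABLE metrics (
        id INTEGER PRIMARY KEY,
        timestamp TEXT,
        run_name TEXT,
        step INTEGER,
        metrics TEXT,
        log_id TEXT,
        space_id TEXT
    )
    """
)
# Hypothetical index name; the source documents only the indexed columns.
conn.execute("CREATE INDEX idx_demo_run_step ON metrics (run_name, step)")

rows = [
    ("2024-01-01T00:00:00", "run1", 0, json.dumps({"loss": 0.5})),
    ("2024-01-01T00:01:00", "run1", 1, json.dumps({"loss": 0.4})),
]
conn.executemany(
    "INSERT INTO metrics (timestamp, run_name, step, metrics) VALUES (?, ?, ?, ?)",
    rows,
)

# Each row stores all metrics for a step as one JSON text blob.
cursor = conn.execute(
    "SELECT step, metrics FROM metrics WHERE run_name = ? ORDER BY step", ("run1",)
)
losses = [(step, json.loads(blob)["loss"]) for step, blob in cursor]
print(losses)
```

Because every metric for a step lives in one JSON blob, per-metric filtering happens after decoding rather than in a dedicated column.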
32
+
33
+ ### `configs`
34
+
35
+ - `id`: integer primary key
36
+ - `run_name`: run identifier
37
+ - `config`: JSON text payload
38
+ - `created_at`: ISO timestamp
39
+
40
+ Constraints:
41
+
42
+ - unique `run_name`
43
+ - index on `run_name`
44
+
45
+ ### `system_metrics`
46
+
47
+ - `id`: integer primary key
48
+ - `timestamp`: ISO timestamp
49
+ - `run_name`: run identifier
50
+ - `metrics`: JSON text payload
51
+ - `log_id`: optional deduplication key
52
+ - `space_id`: optional pending-sync marker
53
+
54
+ Indexes:
55
+
56
+ - `(run_name, timestamp)`
57
+ - unique partial index on `log_id`
58
+ - partial index on `space_id`
59
+
60
+ ### `project_metadata`
61
+
62
+ - `key`: primary key
63
+ - `value`: metadata value
64
+
65
+ ### `pending_uploads`
66
+
67
+ - `id`
68
+ - `space_id`
69
+ - `run_name`
70
+ - `step`
71
+ - `file_path`
72
+ - `relative_path`
73
+ - `created_at`
74
+
75
+ ### `alerts`
76
+
77
+ - `id`
78
+ - `timestamp`
79
+ - `run_name`
80
+ - `title`
81
+ - `text`
82
+ - `level`
83
+ - `step`
84
+ - `alert_id`
85
+
86
+ Indexes:
87
+
88
+ - `run_name`
89
+ - `timestamp`
90
+ - unique partial index on `alert_id`
91
+
92
+ ## Parquet Layout
93
+
94
+ Trackio flattens the JSON payloads when exporting to Parquet:
95
+
96
+ - `{project}.parquet` comes from `metrics`
97
+ - `{project}_system.parquet` comes from `system_metrics`
98
+ - `{project}_configs.parquet` comes from `configs`
99
+
100
+ Static export layout:
101
+
102
+ - `metrics.parquet`
103
+ - `aux/system_metrics.parquet`
104
+ - `aux/configs.parquet`
105
+ - `runs.json`
106
+ - `settings.json`
107
+
108
+ The flattened parquet files keep structural columns such as `timestamp`, `run_name`, and `step`, then add one column per JSON key found in the source payload.
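The flattening described above can be illustrated with plain dictionaries. This is a sketch of the idea only, not Trackio's export code; the helper `flatten_row` is hypothetical, and how Trackio handles a payload key that collides with a structural column is not covered here.

```python
import json

STRUCTURAL = ("timestamp", "run_name", "step")


def flatten_row(row: dict) -> dict:
    """Keep the structural columns, then add one entry per JSON key in the payload."""
    flat = {key: row[key] for key in STRUCTURAL}
    flat.update(json.loads(row["metrics"]))
    return flat


# Illustrative rows shaped like the `metrics` table.
rows = [
    {"timestamp": "2024-01-01T00:00:00", "run_name": "run1", "step": 0,
     "metrics": json.dumps({"loss": 0.5})},
    {"timestamp": "2024-01-01T00:01:00", "run_name": "run1", "step": 1,
     "metrics": json.dumps({"loss": 0.4, "accuracy": 0.9})},
]

flat_rows = [flatten_row(r) for r in rows]

# The column set is the union of all payload keys; missing values become None.
payload_keys = sorted({k for r in flat_rows for k in r} - set(STRUCTURAL))
columns = list(STRUCTURAL) + payload_keys
table = [{c: r.get(c) for c in columns} for r in flat_rows]
print(columns)
```

A metric that was logged only on some steps ends up as a sparse column, which is why the first row here has no `accuracy` value.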
109
+
110
+ ## Direct SQL With The CLI
111
+
112
+ Use `trackio query` for read-only SQL:
113
+
114
+ ```bash
115
+ trackio query project --project my-project --sql "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name" --json
116
+ trackio query project --project my-project --sql "PRAGMA table_info(metrics)"
117
+ trackio query project --project my-project --sql "SELECT run_name, MAX(step) AS last_step FROM metrics GROUP BY run_name ORDER BY last_step DESC"
118
+ ```
119
+
120
+ Remote queries work the same way when you pass `--space`:
121
+
122
+ ```bash
123
+ trackio query project --project my-project --sql "SELECT COUNT(*) AS num_alerts FROM alerts" --space username/my-space --json
124
+ ```
125
+
126
+ `trackio query` accepts read-only `SELECT`, `WITH`, and safe schema `PRAGMA` queries.
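If you read a project database from Python instead of through the CLI, a similar read-only guarantee is available from SQLite itself. The sketch below opens a throwaway database with the `mode=ro` URI parameter, so reads succeed and writes fail. This is a general `sqlite3` technique shown on a demo file; it is not how `trackio query` enforces its restrictions, and the table here is a stand-in.

```python
import sqlite3
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "demo.db"

    # Create a small stand-in database with a normal read-write connection.
    writer = sqlite3.connect(db_path)
    writer.execute("CREATE TABLE alerts (id INTEGER PRIMARY KEY, title TEXT)")
    writer.execute("INSERT INTO alerts (title) VALUES ('loss spike')")
    writer.commit()
    writer.close()

    # Reopen the same file read-only via a SQLite URI.
    reader = sqlite3.connect(f"{db_path.as_uri()}?mode=ro", uri=True)
    count = reader.execute("SELECT COUNT(*) FROM alerts").fetchone()[0]

    try:
        reader.execute("INSERT INTO alerts (title) VALUES ('should fail')")
        write_blocked = False
    except sqlite3.OperationalError:
        # SQLite rejects writes on a read-only connection.
        write_blocked = True
    finally:
        reader.close()

print(count, write_blocked)
```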
127
+
128
+ ## Common Query Patterns
129
+
130
+ Recent alerts:
131
+
132
+ ```bash
133
+ trackio query project --project my-project --sql "SELECT timestamp, run_name, level, title, step FROM alerts ORDER BY timestamp DESC LIMIT 20"
134
+ ```
135
+
136
+ Latest step per run:
137
+
138
+ ```bash
139
+ trackio query project --project my-project --sql "SELECT run_name, MAX(step) AS last_step FROM metrics GROUP BY run_name ORDER BY last_step DESC"
140
+ ```
141
+
142
+ Recent configs:
143
+
144
+ ```bash
145
+ trackio query project --project my-project --sql "SELECT run_name, created_at, config FROM configs ORDER BY created_at DESC"
146
+ ```
147
+
148
+ Schema inspection:
149
+
150
+ ```bash
151
+ trackio query project --project my-project --sql "PRAGMA index_list(metrics)"
152
+ ```
153
+
154
+ ## Agent Guidance
155
+
156
+ - Start with `trackio list projects --json` if you do not know the project name yet.
157
+ - Use `trackio get` for common summaries and metric retrieval.
158
+ - Fall back to `trackio query` when you need one-off aggregates, joins, or schema introspection.
159
+ - Prefer `--json` when another agent or script needs to consume the result.
.codex/skills/openenv-cli ADDED
@@ -0,0 +1 @@
 
 
1
+ ../../.agents/skills/openenv-cli
.codex/skills/trackio ADDED
@@ -0,0 +1 @@
 
 
1
+ ../../.agents/skills/trackio