adityss committed
Commit 0361922 · 1 Parent(s): e3fbc9c

feat: add task weights to configuration and implement LLM-based inference agent

Files changed (3):
  1. README.md +5 -3
  2. openenv.yaml +15 -0
  3. python/inference.py +12 -5
README.md CHANGED
@@ -87,16 +87,18 @@ python inference.py --fast-mode --episodes 1
 
 You can run the same entrypoint directly with `python python/inference.py` (e.g. `python python/inference.py --fast-mode`); flags match the root `inference.py` wrapper.
 
-**LLM baseline** (requires Hugging Face or other OpenAI-compatible API credentials):
+**LLM baseline** (requires any OpenAI-compatible API credentials — HuggingFace, Groq, etc.):
 
 ```bash
 export ENV_URL=http://localhost:7860
 export API_BASE_URL=https://router.huggingface.co/v1
 export MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct
-export HF_TOKEN=your_token_here
+export OPENAI_API_KEY=your_token_here  # or HF_TOKEN=your_token_here
 python inference.py --episodes 1 --llm-every 4
 ```
 
+> **Note:** The script accepts either `OPENAI_API_KEY` (hackathon standard) or `HF_TOKEN` (HuggingFace convention). You do **not** need a paid OpenAI key — any OpenAI-compatible provider works.
+
 Results are written to `baseline_scores.json` by default (`--output` to change).
 
 ---
@@ -160,7 +162,7 @@ There is **no** `--judge-mode` flag in this repository. Use the modes below.
 | Mode | Command pattern | Behavior |
 |------|-----------------|----------|
 | **Fast (heuristic)** | `python inference.py --fast-mode` | No LLM calls; deterministic given env seed; fastest for CI or smoke tests. |
-| **Default LLM** | `python inference.py` | Uses OpenAI-compatible API (`API_BASE_URL`, `MODEL_NAME`, `HF_TOKEN`); default `--llm-every 4` reuses each LLM action for 4 steps to limit API cost. |
+| **Default LLM** | `python inference.py` | Uses OpenAI Python client (`API_BASE_URL`, `MODEL_NAME`, `OPENAI_API_KEY` or `HF_TOKEN`); default `--llm-every 4` reuses each LLM action for 4 steps to limit API cost. |
 | **Recommended for automated evaluation / judging** | `python inference.py --fast-mode --episodes 1` | Recommended when automated pipelines need **reproducibility** and **no external API** dependency. |
 
 Other useful flags:
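The `--llm-every 4` behavior described in the table above can be sketched as a simple caching loop. The `env_step` and `query_llm` callables below are hypothetical stand-ins for the repository's actual environment client and chat-completion call, not code from this commit:

```python
# Sketch of the `--llm-every N` pattern: query the LLM once every N steps
# and reuse the cached action in between, cutting API calls by roughly N×.
def run_episode(env_step, query_llm, total_steps: int, llm_every: int = 4) -> int:
    """Run one episode, returning the number of LLM calls made.

    env_step and query_llm are hypothetical callables standing in for the
    real environment client and the OpenAI chat-completion request.
    """
    llm_calls = 0
    cached_action = None
    for step in range(total_steps):
        if step % llm_every == 0:      # refresh the action from the LLM
            cached_action = query_llm(step)
            llm_calls += 1
        env_step(cached_action)        # otherwise reuse the cached action
    return llm_calls


# With 24 steps and llm_every=4, the LLM is queried at steps 0, 4, ..., 20:
calls = run_episode(lambda a: None, lambda s: "noop", total_steps=24, llm_every=4)
# → 6 LLM calls instead of 24
```

This is why the default mode stays affordable on metered endpoints while still letting the LLM steer the episode.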
openenv.yaml CHANGED
@@ -111,14 +111,25 @@ tasks:
     name: "Cost Minimization"
     description: "Minimize total energy cost over a 24-hour episode with no process constraints."
     difficulty: "easy"
+    weights:
+      cost: 1.0
   - id: 2
     name: "Constrained Temperature Management"
     description: "Minimize cost while keeping indoor temperature within ±2°C of setpoint at all times."
     difficulty: "medium"
+    weights:
+      cost: 0.6
+      temperature: 0.4
   - id: 3
     name: "Full Demand-Response with Batch Scheduling"
     description: "Minimize cost, maintain temperature, respond to grid stress events, schedule all batch jobs, and minimize carbon."
     difficulty: "hard"
+    weights:
+      cost: 0.28
+      temperature: 0.20
+      grid_response: 0.20
+      batch_deadline: 0.12
+      carbon: 0.20
 
 endpoints:
   health:
@@ -145,3 +156,7 @@ endpoints:
   tasks:
     path: /tasks
     method: GET
+  metrics:
+    path: /metrics
+    method: GET
+
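The new `weights` blocks suggest each task's score is a convex combination of per-metric sub-scores (each task's weights sum to 1.0). How the environment actually applies them is not shown in this commit; the scoring function below is an assumed sketch, with the sub-score normalization to [0, 1] being a guess:

```python
# Hypothetical scoring: combine normalized per-metric sub-scores (each in
# [0, 1]) using a task's weights from openenv.yaml. Weights summing to 1.0
# keeps the combined score in [0, 1].
TASK3_WEIGHTS = {
    "cost": 0.28,
    "temperature": 0.20,
    "grid_response": 0.20,
    "batch_deadline": 0.12,
    "carbon": 0.20,
}

def weighted_score(sub_scores: dict[str, float], weights: dict[str, float]) -> float:
    # Sanity-check that the weights form a convex combination.
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    # Missing metrics contribute 0 (e.g. task 1 only reports "cost").
    return sum(w * sub_scores.get(k, 0.0) for k, w in weights.items())

perfect = weighted_score(
    {"cost": 1.0, "temperature": 1.0, "grid_response": 1.0,
     "batch_deadline": 1.0, "carbon": 1.0},
    TASK3_WEIGHTS,
)  # perfect sub-scores yield 1.0
```

Under this reading, task 3 still weights cost highest (0.28) while batch deadlines matter least (0.12), matching the task description's emphasis.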
python/inference.py CHANGED
@@ -2,13 +2,19 @@
 GridMind-RL Baseline Inference Script
 --------------------------------------
 Runs an LLM agent against all 3 tasks for N episodes each.
-Uses OpenAI-compatible API via API_BASE_URL / MODEL_NAME / HF_TOKEN environment variables.
+Uses the OpenAI Python client pointed at any OpenAI-compatible endpoint.
+
+Required environment variables:
+    API_BASE_URL — The API endpoint for the LLM (default: HuggingFace router)
+    MODEL_NAME — The model identifier to use for inference
+    OPENAI_API_KEY or HF_TOKEN — API key for authentication (any provider)
 
 Usage:
+    export API_BASE_URL=https://router.huggingface.co/v1
     export MODEL_NAME=meta-llama/Llama-3.1-8B-Instruct
-    export HF_TOKEN=hf_xxxx
+    export OPENAI_API_KEY=hf_xxxx   # or HF_TOKEN=hf_xxxx
     python inference.py
-    # or: python python/inference.py [--episodes 1] [--llm-every 4] [--fast-mode]
+    # or: python inference.py --fast-mode --episodes 1
 """
 
 from __future__ import annotations
@@ -28,7 +34,8 @@ from openai import OpenAI
 ENV_URL = os.getenv("ENV_URL", "http://localhost:7860")
 MODEL_NAME = os.getenv("MODEL_NAME", "meta-llama/Llama-3.1-8B-Instruct")
 API_BASE_URL = os.getenv("API_BASE_URL", "https://router.huggingface.co/v1")
-HF_TOKEN = os.getenv("HF_TOKEN", "")
+# Accept OPENAI_API_KEY (hackathon standard) or HF_TOKEN (HuggingFace convention)
+OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "") or os.getenv("HF_TOKEN", "")
 DEFAULT_EPISODES = 1
 DEFAULT_SEED_BASE = 1000
 MAX_RETRIES = 3
@@ -124,7 +131,7 @@ class LLMAgent:
     def __init__(self):
         self.client = OpenAI(
             base_url=API_BASE_URL,
-            api_key=HF_TOKEN if HF_TOKEN else "none",
+            api_key=OPENAI_API_KEY if OPENAI_API_KEY else "none",
         )
         self.model = MODEL_NAME
         self.fallback_mode = False
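The credential fallback introduced in this diff can be exercised in isolation. This standalone snippet reproduces just the precedence logic (`OPENAI_API_KEY` wins, `HF_TOKEN` is the fallback, `"none"` is the placeholder), taking the environment as a plain dict so it is testable without touching `os.environ`:

```python
def resolve_api_key(env: dict[str, str]) -> str:
    # Mirrors the diff: OPENAI_API_KEY takes precedence over HF_TOKEN via
    # `or` short-circuiting (an empty string is falsy), and "none" is the
    # placeholder when neither is set, since the OpenAI client requires a
    # non-empty api_key string even for keyless local endpoints.
    key = env.get("OPENAI_API_KEY", "") or env.get("HF_TOKEN", "")
    return key if key else "none"


assert resolve_api_key({"OPENAI_API_KEY": "sk-a", "HF_TOKEN": "hf_b"}) == "sk-a"
assert resolve_api_key({"HF_TOKEN": "hf_b"}) == "hf_b"
assert resolve_api_key({}) == "none"
```

Note that an empty-string `OPENAI_API_KEY` also falls through to `HF_TOKEN`, which is the desirable behavior when a CI system exports the variable unset.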