Spaces:

Imsachin010
/

openenv-workflow-agent

Sleeping

App Files Files Community

openenv-workflow-agent / README.md

Imsachin010

restore hf spaces config block

aab83b4 about 1 month ago

preview code

raw

history blame contribute delete

4.3 kB

metadata

title: Openenv Workflow Agent
emoji: 📈
colorFrom: green
colorTo: green
sdk: docker
pinned: false
license: mit

🧠 OpenEnv Workflow Agent — Decision-Making Under Uncertainty

🚀 Overview

We present a real-world OpenEnv environment that simulates workflow management tasks such as email triage, scheduling, and task handling under partial observability.

Unlike typical environments, this benchmark focuses on a critical but underexplored capability:

🔥 Cost-aware information gathering in sequential decision-making

Agents must decide:

When to act immediately
When to request additional information
Whether the cost of uncertainty reduction is justified

🎯 Why This Matters

Modern AI agents (LLMs, assistants, copilots) operate in uncertain environments:

Emails are ambiguous
User intent is hidden
Context is incomplete

Our environment models this realistically by enforcing:

❗ Incorrect actions under uncertainty → penalized
❗ Information gathering → beneficial but costly
❗ Multi-step reasoning required for optimal decisions

🧠 Core Idea

We introduce a POMDP-style workflow environment where:

The true state is partially hidden
Agents must actively reduce uncertainty
Information acquisition has a non-zero cost

Key Property:

An optimal agent follows:

“Request information only when expected benefit exceeds cost.”

⚙️ Environment Design

🔹 State

Emails (observed)
Tasks & calendar (observed)
Hidden attributes:
- true intent
- urgency
- missing information

🔹 Actions

classify
reply
schedule
request_info
archive
prioritize

🔹 Reward Function

[ r_t = r_{correct} + r_{progress} - r_{cost} - r_{penalty} ]

Correct action → +0.3
Task progress → +0.2
Step penalty → −0.01
Information request cost → −0.05
Incorrect action → −0.2

🧪 Tasks

🟢 Easy

Clear intent
Single-step decision

🟡 Medium

Multi-step workflow
Requires sequencing

🔴 Hard

Ambiguous input
Requires information gathering before acting

📊 Baseline Results


easy:   1.00
medium: 0.50
hard:   0.13

🔍 Interpretation

Baseline performs well on simple tasks
Fails on ambiguous scenarios
Demonstrates need for information-aware policies

🔥 Key Insight

Standard agents fail because they act too early under uncertainty.

Agents that act immediately under uncertainty fail. Agents that strategically gather information succeed.

This environment makes that tradeoff explicit and measurable.

Our environment exposes this failure mode clearly.

🧩 Novel Contribution

We introduce:

✅ Cost-sensitive information gathering

Asking questions is beneficial but not free

✅ Enforced uncertainty

Actions without information are penalized

✅ Sequential dependency

Early decisions affect future rewards

🧪 Validation

We verify:

✔ Classification fails under missing information
✔ Requesting info enables correct decisions
✔ Tradeoff emerges between cost and accuracy

📦 Project Structure


app/
tasks/
graders/
baseline/
scripts/
openenv.yaml
Dockerfile
inference.py

▶️ Run Locally

You can pull the pre-built Docker image directly from Docker Hub and run it:

docker pull imsachin010/openenv-workflow-agent:latest
docker run -d -p 7860:7860 --name openenv-agent imsachin010/openenv-workflow-agent:latest

Test endpoint:

curl -X POST http://localhost:7860/reset

🤖 Inference

Run the inference script inside the environment:

python -m inference

Outputs:

[START]
[STEP]
[END]

🧠 Conclusion

This environment highlights a key gap in current agents:

❗ They do not reason about when to gather information

We provide a benchmark to evaluate and improve:

decision-making under uncertainty
information-seeking behavior
sequential reasoning

🏁 Submission Notes

✔ Fully OpenEnv compliant
✔ Deterministic graders
✔ Reproducible via Docker
✔ HF Space endpoint available