title: Openenv Workflow Agent
emoji: π
colorFrom: green
colorTo: green
sdk: docker
pinned: false
license: mit
OpenEnv Workflow Agent: Decision-Making Under Uncertainty
Overview
We present a real-world OpenEnv environment that simulates workflow management tasks such as email triage, scheduling, and task handling under partial observability.
Unlike typical environments, this benchmark focuses on a critical but underexplored capability:
Cost-aware information gathering in sequential decision-making
Agents must decide:
- When to act immediately
- When to request additional information
- Whether the cost of uncertainty reduction is justified
Why This Matters
Modern AI agents (LLMs, assistants, copilots) operate in uncertain environments:
- Emails are ambiguous
- User intent is hidden
- Context is incomplete
Our environment models this realistically by enforcing:
- Incorrect actions under uncertainty → penalized
- Information gathering → beneficial but costly
- Multi-step reasoning required for optimal decisions
Core Idea
We introduce a POMDP-style workflow environment where:
- The true state is partially hidden
- Agents must actively reduce uncertainty
- Information acquisition has a non-zero cost
Key Property:
An optimal agent follows:
"Request information only when the expected benefit exceeds the cost."
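A minimal sketch of this rule in Python, under stated assumptions: the agent can estimate its probability of acting correctly before (`p_now`) and after (`p_after`) an information request, and a correct or incorrect action yields a fixed `gain` or `loss`. The function name and all parameters are hypothetical and are not part of the environment's API.

```python
def should_request_info(p_now: float, p_after: float,
                        gain: float, loss: float, info_cost: float) -> bool:
    """Return True when asking first is expected to pay off.

    Assumption (not from the environment): acting yields +gain when correct
    and -loss when incorrect, and asking costs a flat info_cost.
    """
    def expected_reward(p_correct: float) -> float:
        # Expected reward of acting with a given probability of being right.
        return p_correct * gain - (1.0 - p_correct) * loss

    # Expected benefit of the extra information.
    benefit = expected_reward(p_after) - expected_reward(p_now)
    return benefit > info_cost


# Illustrative values only: a coin-flip guess that becomes certain after asking.
print(should_request_info(p_now=0.5, p_after=1.0, gain=0.3, loss=0.2, info_cost=0.05))
# An agent that is already 95% sure gains too little to justify the cost.
print(should_request_info(p_now=0.95, p_after=1.0, gain=0.3, loss=0.2, info_cost=0.05))
```

The sketch ignores per-step penalties and task-progress rewards, so it illustrates the shape of the tradeoff rather than the environment's exact break-even point.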
Environment Design
State
- Emails (observed)
- Tasks & calendar (observed)
- Hidden attributes:
  - true intent
  - urgency
  - missing information
Actions
- classify
- reply
- schedule
- request_info
- archive
- prioritize
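One way an agent could represent this action set in Python is an enum, so that invalid action names fail early. The six names come from the actions above; the `Action` class and `parse_action` helper are hypothetical and may not match how the environment encodes actions internally.

```python
from enum import Enum


class Action(str, Enum):
    """Hypothetical enum mirroring the six documented action names."""
    CLASSIFY = "classify"
    REPLY = "reply"
    SCHEDULE = "schedule"
    REQUEST_INFO = "request_info"
    ARCHIVE = "archive"
    PRIORITIZE = "prioritize"


def parse_action(name: str) -> Action:
    """Normalize a raw action string; raises ValueError for unknown names."""
    return Action(name.strip().lower())


print(parse_action(" Request_Info ").value)
```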
Reward Function
\[ r_t = r_{\text{correct}} + r_{\text{progress}} - r_{\text{cost}} - r_{\text{penalty}} \]
- Correct action → +0.3
- Task progress → +0.2
- Step penalty → -0.01
- Information request cost → -0.05
- Incorrect action → -0.2
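The per-step reward can be sketched in Python as follows. The numeric values are the ones listed above; how they map onto the four terms is an assumption made for illustration (here the step penalty and the information request cost are both counted in the cost term, and only incorrect actions count as the penalty term). The function and its arguments are hypothetical, not the environment's grader.

```python
from typing import Optional

# Values taken from the reward list; the grouping below is an assumption.
CORRECT_REWARD = 0.3
PROGRESS_REWARD = 0.2
STEP_PENALTY = 0.01
INFO_COST = 0.05
INCORRECT_PENALTY = 0.2


def step_reward(correct: Optional[bool], progressed: bool, requested_info: bool) -> float:
    """Compute r_t = r_correct + r_progress - r_cost - r_penalty for one step.

    correct is True/False for a graded action, or None when the step is not
    graded for correctness (for example, a pure information request).
    """
    r_correct = CORRECT_REWARD if correct is True else 0.0
    r_progress = PROGRESS_REWARD if progressed else 0.0
    r_cost = STEP_PENALTY + (INFO_COST if requested_info else 0.0)
    r_penalty = INCORRECT_PENALTY if correct is False else 0.0
    return r_correct + r_progress - r_cost - r_penalty


# A correct action that also advances the task.
print(round(step_reward(correct=True, progressed=True, requested_info=False), 2))
# A pure information request.
print(round(step_reward(correct=None, progressed=False, requested_info=True), 2))
```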
Tasks
Easy
- Clear intent
- Single-step decision
Medium
- Multi-step workflow
- Requires sequencing
Hard
- Ambiguous input
- Requires information gathering before acting
Baseline Results
easy: 1.00
medium: 0.50
hard: 0.13
Interpretation
- Baseline performs well on simple tasks
- Fails on ambiguous scenarios
- Demonstrates need for information-aware policies
Key Insight
Standard agents fail because they act too early under uncertainty, while agents that strategically gather information succeed.
This environment makes that tradeoff explicit and measurable.
Novel Contribution
We introduce:
Cost-sensitive information gathering
- Asking questions is beneficial but not free
Enforced uncertainty
- Actions without information are penalized
Sequential dependency
- Early decisions affect future rewards
Validation
We verify:
- Classification fails under missing information
- Requesting info enables correct decisions
- A tradeoff emerges between cost and accuracy
Project Structure
app/
tasks/
graders/
baseline/
scripts/
openenv.yaml
Dockerfile
inference.py
Run Locally
You can pull the pre-built Docker image directly from Docker Hub and run it:
docker pull imsachin010/openenv-workflow-agent:latest
docker run -d -p 7860:7860 --name openenv-agent imsachin010/openenv-workflow-agent:latest
Test endpoint:
curl -X POST http://localhost:7860/reset
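The same check can be scripted from Python using only the standard library. This is a sketch that mirrors the `curl` call above (a body-less POST to `/reset` on port 7860); the helper names are hypothetical, and the response is returned as raw text because its schema is not documented here.

```python
import urllib.request


def build_reset_request(base_url: str = "http://localhost:7860") -> urllib.request.Request:
    """Build a body-less POST to the /reset endpoint, like `curl -X POST`."""
    return urllib.request.Request(f"{base_url.rstrip('/')}/reset", method="POST")


def reset_environment(base_url: str = "http://localhost:7860", timeout: float = 10.0) -> str:
    """Send the reset request and return the raw response body as text.

    Requires the container to be running; raises urllib.error.URLError otherwise.
    """
    with urllib.request.urlopen(build_reset_request(base_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

With the container running, calling `reset_environment()` should return whatever the endpoint responds with.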
Inference
Run the inference script inside the environment:
python -m inference
Outputs:
[START]
[STEP]
[END]
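If the output needs to be post-processed, a small parser keyed on these three tags may be enough. The sketch below assumes each log line begins with one of the tags and that a run may contain several `[START]` ... `[END]` episodes; the rest of each line is not documented here, so it is ignored. The function name is hypothetical.

```python
from typing import List, Optional


def count_steps_per_episode(log_text: str) -> List[int]:
    """Return the number of [STEP] lines inside each [START]...[END] episode."""
    episodes: List[int] = []
    current: Optional[int] = None  # None means we are outside an episode
    for raw_line in log_text.splitlines():
        line = raw_line.strip()
        if line.startswith("[START]"):
            current = 0
        elif line.startswith("[STEP]") and current is not None:
            current += 1
        elif line.startswith("[END]") and current is not None:
            episodes.append(current)
            current = None
    return episodes


sample = "[START]\n[STEP]\n[STEP]\n[END]"
print(count_steps_per_episode(sample))
```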
Conclusion
This environment highlights a key gap in current agents:
They do not reason about when to gather information.
We provide a benchmark to evaluate and improve:
- decision-making under uncertainty
- information-seeking behavior
- sequential reasoning
Submission Notes
- Fully OpenEnv compliant
- Deterministic graders
- Reproducible via Docker
- HF Space endpoint available