---
title: AI Executive Assistant Simulator
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.13.0
python_version: '3.10'
app_file: ui/app.py
pinned: false
---
# 🤖 AI Executive Assistant Simulator

**OpenEnv RL Environment**: an advanced reinforcement learning environment that simulates a smart executive assistant managing scheduling, inbox communication, and task prioritization.
## Problem

Modern professionals struggle with scheduling overload, task prioritization, and communication management. An average executive handles 50+ decisions daily, making this a rich environment for RL agents to learn optimal strategies.
## Solution

An RL-powered executive assistant built on the OpenEnv framework that:

- Manages schedules with temporal reasoning and overlap detection
- Resolves conflicts using conflict graph modeling
- Handles messages with urgency-aware prioritization
- Learns personalized strategies through user preference modeling
- Improves via curriculum learning from easy → hard scenarios
## Advanced Features

| Feature | Description |
|---|---|
| Temporal Reasoning | Duration-aware time slots with overlap detection |
| Multi-Objective Rewards | 5 reward components: task, schedule, message, efficiency, preferences |
| User Preferences | Personalization memory (preferred times, focus hours, meeting limits) |
| Partial Observability | Hidden tasks & delayed inbox revealed progressively |
| Action Masking | Invalid action prevention: agents only see legal moves |
| Conflict Graph | Graph-based modeling of scheduling conflicts |
| Curriculum Learning | Auto-scaling difficulty: easy → medium → hard |
| Metrics Tracking | Completion rate, efficiency score, conflict count, response rate |
| Gantt Timeline | Interactive Plotly visualization of the schedule |
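Duration-aware overlap detection is straightforward to illustrate: two slots conflict exactly when each starts before the other ends. The helper below is a minimal sketch of the idea, not the actual code in `env/scheduler.py`; the function names are assumptions for illustration.

```python
def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes since midnight."""
    h, m = hhmm.split(":")
    return int(h) * 60 + int(m)

def overlaps(start_a: str, dur_a: int, start_b: str, dur_b: int) -> bool:
    """Two slots overlap iff each one starts before the other ends."""
    a0, b0 = to_minutes(start_a), to_minutes(start_b)
    return a0 < b0 + dur_b and b0 < a0 + dur_a
```

For example, a 60-minute meeting at 10:00 conflicts with one at 10:30 but not with one at 11:00.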
## Project Structure

```
ai-executive-assistant-openenv/
│
├── openenv.yaml              # OpenEnv environment manifest
├── README.md                 # This file
├── requirements.txt          # Python dependencies
│
├── env/                      # Core environment
│   ├── assistant_env.py      # Main env class (OpenEnv entry point)
│   ├── state.py              # State representation + partial observability
│   ├── actions.py            # Action definitions + action masking
│   ├── rewards.py            # Multi-objective reward engine
│   ├── scheduler.py          # Temporal reasoning + conflict resolution
│   ├── scenario_generator.py # Curriculum-aware scenario generation
│   └── utils.py              # Time utilities, conflict detection, metrics
│
├── agents/                   # Agent implementations
│   ├── random_agent.py       # Random baseline (lower bound)
│   ├── rule_based_agent.py   # Priority heuristic (strong baseline)
│   └── rl_agent.py           # Tabular Q-learning agent
│
├── training/                 # Training & evaluation
│   ├── train_rl.py           # Multi-agent training comparison
│   ├── evaluate.py           # Evaluation harness
│   └── plots.py              # Visualization utilities
│
├── ui/                       # Interactive demo
│   ├── app.py                # Gradio web interface
│   └── timeline.py           # Plotly Gantt timeline
│
└── logs/                     # Training outputs
    ├── reward_curves.png
    ├── agent_comparison.png
    └── rl_metrics.png
```
## Quick Start

### Installation

```bash
pip install -r requirements.txt
```

### Run Training

```bash
python -m training.train_rl
```

This trains all 3 agents (Random, Rule-Based, Q-Learning) for 200 episodes and generates comparison plots in `logs/`.
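The easy → medium → hard curriculum could be driven by a simple episode-based schedule. The function below is a hypothetical sketch of such a schedule, not the logic actually used in `env/scenario_generator.py`.

```python
def curriculum_difficulty(episode: int, total: int = 200) -> str:
    """Toy curriculum schedule: first third easy, middle third medium, rest hard."""
    frac = episode / total
    if frac < 1 / 3:
        return "easy"
    if frac < 2 / 3:
        return "medium"
    return "hard"
```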
### Run Evaluation

```bash
python -m training.evaluate
```

### Launch Interactive Demo

```bash
python -m ui.app
```

Then open http://localhost:7860 in your browser.
## Environment API

```python
from env.assistant_env import ExecutiveAssistantEnv

env = ExecutiveAssistantEnv(difficulty="medium", max_steps=50)
state = env.reset()

print(state["tasks"])          # List of task objects
print(state["inbox"])          # List of inbox messages
print(state["valid_actions"])  # Legal actions (action masking)

# Take a step
action = ("complete_task", 0)  # Complete task with ID 0
next_state, reward, done, info = env.step(action)
```
### Action Space

| Action | Description |
|---|---|
| `schedule_task` | Schedule a pending task into a time slot |
| `complete_task` | Mark a task as completed |
| `defer_task` | Postpone a task to a later time |
| `send_reply` | Reply to an inbox message |
| `reject_task` | Cancel a task |
| `ask_clarification` | Request more info about a task/message |
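Action masking means the legal subset of this action space is recomputed from the current state each step. A minimal sketch of the idea (the rule details and `legal_actions` name are assumptions; the real rules live in `env/actions.py`):

```python
def legal_actions(state: dict) -> list:
    """Enumerate (action_name, target_id) pairs that are valid in this state."""
    valid = []
    for task in state["tasks"]:
        # Task actions only apply to tasks that are still pending.
        if task["status"] == "pending":
            for name in ("schedule_task", "complete_task", "defer_task", "reject_task"):
                valid.append((name, task["id"]))
    for msg in state["inbox"]:
        # Replying twice to the same message is never legal.
        if not msg["replied"]:
            valid.append(("send_reply", msg["id"]))
    return valid
```

An agent that samples only from this list can never take an invalid action, which removes a whole class of wasted exploration.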
Observation Space
{
"time": "09:30",
"tasks": [
{"id": 0, "title": "Q4 Strategy Review", "time": "10:00",
"duration": 60, "priority": "high", "type": "meeting", "status": "pending"}
],
"inbox": [
{"id": 0, "sender": "CEO", "content": "Need figures ASAP",
"urgency": "high", "replied": false}
],
"preferences": {"preferred_meeting_times": ["09:00", "14:00"], ...},
"valid_actions": [("complete_task", 0), ("send_reply", 0), ...],
"action_mask": [1, 1, 1, 1, 1, 1]
}
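A policy consumes this observation directly. As an illustration, a greedy heuristic in the spirit of the repo's rule-based baseline might answer urgent mail first, then work the highest-priority pending task; the sketch below is a hypothetical simplification, not the code in `agents/rule_based_agent.py`.

```python
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}

def pick_action(state: dict):
    """Greedy heuristic: urgent unanswered replies first, then top-priority tasks."""
    urgent = [("send_reply", m["id"]) for m in state["inbox"]
              if not m["replied"] and m["urgency"] == "high"]
    if urgent:
        return urgent[0]
    pending = [t for t in state["tasks"] if t["status"] == "pending"]
    if pending:
        best = min(pending, key=lambda t: PRIORITY_RANK[t["priority"]])
        return ("complete_task", best["id"])
    return None  # nothing actionable this step
```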
## Results

| Agent | Avg Reward | Task Completion | Message Response | Efficiency |
|---|---|---|---|---|
| Random | Low | ~30% | ~25% | ~25/100 |
| Rule-Based | Medium | ~65% | ~70% | ~55/100 |
| Q-Learning | High | ~75% | ~80% | ~70/100 |

Results vary by difficulty and random seed.
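The Q-Learning agent is tabular, so its core learning step is the standard update Q(s,a) ← Q(s,a) + α(r + γ·max over a' of Q(s',a') − Q(s,a)). A minimal sketch of that update; the helper name and dict keying are assumptions for illustration, not the API of `agents/rl_agent.py`.

```python
from collections import defaultdict

def q_update(Q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.99):
    """One tabular Q-learning step on a defaultdict(float) table."""
    # Bootstrap from the best legal action in the next state (0.0 if terminal).
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
    return Q
```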
## System Architecture

```
        User / RL Agent
              │
              ▼
┌────────────────────────┐
│ ExecutiveAssistantEnv  │
│  ┌──────────────────┐  │
│  │ ScenarioGenerator│  │ ← Curriculum Learning
│  └────────┬─────────┘  │
│  ┌────────▼─────────┐  │
│  │ State            │  │ ← Partial Observability
│  │ (tasks + inbox)  │  │
│  └────────┬─────────┘  │
│  ┌────────▼─────────┐  │
│  │ Scheduler        │  │ ← Temporal Reasoning
│  │ (conflict graph) │  │
│  └────────┬─────────┘  │
│  ┌────────▼─────────┐  │
│  │ RewardEngine     │  │ ← Multi-Objective Shaping
│  │ (5 components)   │  │
│  └──────────────────┘  │
│  ┌──────────────────┐  │
│  │ Action Masking   │  │ ← Invalid Action Prevention
│  └──────────────────┘  │
└────────────────────────┘
              │
              ▼
 Observation + Reward + Done
```
## License

MIT License

## Acknowledgments

- Built for the OpenEnv platform
- Inspired by real-world executive assistant workflows
- Visualization powered by Plotly and Gradio