---
title: AI Executive Assistant Simulator
emoji: 🤖
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 6.13.0
python_version: '3.10'
app_file: ui/app.py
pinned: false
---

# 🤖 AI Executive Assistant Simulator

> **OpenEnv RL Environment** — An advanced reinforcement learning environment that simulates a smart executive assistant managing scheduling, inbox communication, and task prioritization.

[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://python.org)
[![OpenEnv](https://img.shields.io/badge/OpenEnv-compatible-green.svg)](https://openenv.ai)
[![Gradio](https://img.shields.io/badge/demo-Gradio-orange.svg)](https://gradio.app)

---

## 🔷 Problem

Modern professionals struggle with **scheduling overload**, **task prioritization**, and **communication management**. An average executive handles 50+ decisions daily — making this a rich environment for RL agents to learn optimal strategies.

## 🔷 Solution

An RL-powered executive assistant built on the **OpenEnv** framework that:

- 📅 **Manages schedules** with temporal reasoning and overlap detection
- ⚡ **Resolves conflicts** using conflict graph modeling
- 📬 **Handles messages** with urgency-aware prioritization
- 🧠 **Learns personalized strategies** through user preference modeling
- 📈 **Improves via curriculum learning** from easy → hard scenarios

---

## 🧠 Advanced Features

| Feature | Description |
|---------|-------------|
| 🕐 **Temporal Reasoning** | Duration-aware time slots with overlap detection |
| 🎯 **Multi-Objective Rewards** | 5 reward components: task, schedule, message, efficiency, preferences |
| 👤 **User Preferences** | Personalization memory (preferred times, focus hours, meeting limits) |
| 👁️ **Partial Observability** | Hidden tasks & delayed inbox revealed progressively |
| 🚫 **Action Masking** | Invalid action prevention — agents only see legal moves |
| 🔗 **Conflict Graph** | Graph-based modeling of scheduling conflicts |
| 📚 **Curriculum Learning** | Auto-scaling difficulty: easy → medium → hard |
| 📊 **Metrics Tracking** | Completion rate, efficiency score, conflict count, response rate |
| 📅 **Gantt Timeline** | Interactive Plotly visualization of the schedule |
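
The duration-aware overlap check behind temporal reasoning comes down to simple interval arithmetic. A minimal sketch of the idea — function names here are illustrative, not the repo's actual `env/utils.py` API:

```python
def to_minutes(hhmm: str) -> int:
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)

def overlaps(start_a: str, dur_a: int, start_b: str, dur_b: int) -> bool:
    """Two time slots overlap iff each one starts before the other ends."""
    a, b = to_minutes(start_a), to_minutes(start_b)
    return a < b + dur_b and b < a + dur_a

print(overlaps("10:00", 60, "10:30", 30))  # True: 10:30 lies inside 10:00-11:00
print(overlaps("10:00", 60, "11:00", 30))  # False: back-to-back slots don't conflict
```

Treating the end time as exclusive is what lets back-to-back meetings coexist without registering a conflict.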

---

## 📁 Project Structure

```
ai-executive-assistant-openenv/
│
├── openenv.yaml              # OpenEnv environment manifest
├── README.md                 # This file
├── requirements.txt          # Python dependencies
│
├── env/                      # Core environment
│   ├── assistant_env.py      # Main env class (OpenEnv entry point)
│   ├── state.py              # State representation + partial observability
│   ├── actions.py            # Action definitions + action masking
│   ├── rewards.py            # Multi-objective reward engine
│   ├── scheduler.py          # Temporal reasoning + conflict resolution
│   ├── scenario_generator.py # Curriculum-aware scenario generation
│   └── utils.py              # Time utilities, conflict detection, metrics
│
├── agents/                   # Agent implementations
│   ├── random_agent.py       # Random baseline (lower bound)
│   ├── rule_based_agent.py   # Priority heuristic (strong baseline)
│   └── rl_agent.py           # Tabular Q-learning agent
│
├── training/                 # Training & evaluation
│   ├── train_rl.py           # Multi-agent training comparison
│   ├── evaluate.py           # Evaluation harness
│   └── plots.py              # Visualization utilities
│
├── ui/                       # Interactive demo
│   ├── app.py                # Gradio web interface
│   └── timeline.py           # Plotly Gantt timeline
│
└── logs/                     # Training outputs
    ├── reward_curves.png
    ├── agent_comparison.png
    └── rl_metrics.png
```
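
`scenario_generator.py` presumably scales episode difficulty by sampling scenario parameters from per-tier ranges. A rough sketch of the idea — the presets, field names, and values below are invented for illustration, not read from the repo:

```python
import random

# Hypothetical difficulty presets; the real ranges live in env/scenario_generator.py.
DIFFICULTY = {
    "easy":   {"tasks": (3, 5),  "messages": (1, 3),  "conflict_rate": 0.1},
    "medium": {"tasks": (5, 8),  "messages": (3, 6),  "conflict_rate": 0.3},
    "hard":   {"tasks": (8, 12), "messages": (6, 10), "conflict_rate": 0.5},
}

def sample_scenario(difficulty: str, rng: random.Random) -> dict:
    """Sample scenario knobs from the ranges for the given difficulty tier."""
    cfg = DIFFICULTY[difficulty]
    return {
        "n_tasks": rng.randint(*cfg["tasks"]),
        "n_messages": rng.randint(*cfg["messages"]),
        "conflict_rate": cfg["conflict_rate"],
    }

scenario = sample_scenario("easy", random.Random(0))
```

Curriculum learning then amounts to promoting the agent from one tier to the next once its rolling performance clears a threshold.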

---

## 🚀 Quick Start

### Installation

```bash
pip install -r requirements.txt
```

### Run Training

```bash
python -m training.train_rl
```

This trains all 3 agents (Random, Rule-Based, Q-Learning) for 200 episodes and generates comparison plots in `logs/`.
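
The Q-learning agent presumably follows the standard tabular update with epsilon-greedy exploration. A minimal, self-contained sketch — class and attribute names are illustrative, not `agents/rl_agent.py`'s actual interface:

```python
import random
from collections import defaultdict

class TabularQAgent:
    """Epsilon-greedy tabular Q-learning, restricted to the env's legal actions."""

    def __init__(self, alpha=0.1, gamma=0.95, epsilon=0.1):
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon

    def act(self, state, valid_actions):
        if random.random() < self.epsilon:
            return random.choice(valid_actions)                       # explore
        return max(valid_actions, key=lambda a: self.q[(state, a)])  # exploit

    def update(self, s, a, reward, s_next, next_actions, done):
        # Standard one-step Q-learning target: r + gamma * max_a' Q(s', a').
        target = reward
        if not done and next_actions:
            target += self.gamma * max(self.q[(s_next, a2)] for a2 in next_actions)
        self.q[(s, a)] += self.alpha * (target - self.q[(s, a)])
```

Because `act` only ever considers `valid_actions`, the agent composes naturally with the environment's action masking.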

### Run Evaluation

```bash
python -m training.evaluate
```

### Launch Interactive Demo

```bash
python -m ui.app
```

Then open `http://localhost:7860` in your browser.

---

## 🎮 Environment API

```python
from env.assistant_env import ExecutiveAssistantEnv

env = ExecutiveAssistantEnv(difficulty="medium", max_steps=50)

state = env.reset()
print(state["tasks"])      # List of task objects
print(state["inbox"])       # List of inbox messages
print(state["valid_actions"])  # Legal actions (action masking)

# Take a step
action = ("complete_task", 0)  # Complete task with ID 0
next_state, reward, done, info = env.step(action)
```

### Action Space

| Action | Description |
|--------|-------------|
| `schedule_task` | Schedule a pending task into a time slot |
| `complete_task` | Mark a task as completed |
| `defer_task` | Postpone a task to a later time |
| `send_reply` | Reply to an inbox message |
| `reject_task` | Cancel a task |
| `ask_clarification` | Request more info about a task/message |

### Observation Space

```json
{
  "time": "09:30",
  "tasks": [
    {"id": 0, "title": "Q4 Strategy Review", "time": "10:00",
     "duration": 60, "priority": "high", "type": "meeting", "status": "pending"}
  ],
  "inbox": [
    {"id": 0, "sender": "CEO", "content": "Need figures ASAP",
     "urgency": "high", "replied": false}
  ],
  "preferences": {"preferred_meeting_times": ["09:00", "14:00"], ...},
  "valid_actions": [("complete_task", 0), ("send_reply", 0), ...],
  "action_mask": [1, 1, 1, 1, 1, 1]
}
```
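
Assuming the six `action_mask` entries line up with the action types in table order (an assumption — the README doesn't state the mask's layout), decoding the mask into legal action types is a one-liner:

```python
# Assumed ordering of mask slots; the authoritative list is in env/actions.py.
ACTION_TYPES = ["schedule_task", "complete_task", "defer_task",
                "send_reply", "reject_task", "ask_clarification"]

def legal_action_types(action_mask):
    """Keep only the action types whose mask slot is set to 1."""
    return [a for a, m in zip(ACTION_TYPES, action_mask) if m == 1]

print(legal_action_types([1, 0, 1, 0, 0, 1]))
# ['schedule_task', 'defer_task', 'ask_clarification']
```

In practice `state["valid_actions"]` already gives the fully instantiated `(action_type, target_id)` tuples, so the mask is mainly useful for masking logits in a learned policy.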

---

## 📈 Results

| Agent | Avg Reward | Task Completion | Message Response | Efficiency |
|-------|-----------|----------------|------------------|------------|
| 🎲 Random | Low | ~30% | ~25% | ~25/100 |
| 📋 Rule-Based | Medium | ~65% | ~70% | ~55/100 |
| 🧠 Q-Learning | High | ~75% | ~80% | ~70/100 |

*Results vary by difficulty and random seed.*

---

## ๐Ÿ—๏ธ System Architecture

```
User / RL Agent
       โ”‚
       โ–ผ
โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”
โ”‚  ExecutiveAssistantEnv โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚
โ”‚  โ”‚ ScenarioGeneratorโ”‚ โ”‚ โ† Curriculum Learning
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚
โ”‚  โ”‚     State        โ”‚ โ”‚ โ† Partial Observability
โ”‚  โ”‚ (tasks + inbox)  โ”‚ โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚
โ”‚  โ”‚   Scheduler      โ”‚ โ”‚ โ† Temporal Reasoning
โ”‚  โ”‚ (conflict graph) โ”‚ โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”ฌโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ–ผโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚
โ”‚  โ”‚  RewardEngine    โ”‚ โ”‚ โ† Multi-Objective Shaping
โ”‚  โ”‚ (5 components)   โ”‚ โ”‚
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚
โ”‚  โ”Œโ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ” โ”‚
โ”‚  โ”‚  Action Masking  โ”‚ โ”‚ โ† Invalid Action Prevention
โ”‚  โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜ โ”‚
โ””โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”€โ”˜
       โ”‚
       โ–ผ
   Observation + Reward + Done
```
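
A common way a multi-objective reward engine combines its components is a weighted sum. The weights and component values below are invented for illustration; the real shaping logic lives in `env/rewards.py`:

```python
import math

# Hypothetical weights -- not the values actually used by the RewardEngine.
REWARD_WEIGHTS = {
    "task": 1.0, "schedule": 0.5, "message": 0.5,
    "efficiency": 0.3, "preferences": 0.2,
}

def combine_rewards(components: dict) -> float:
    """Weighted sum of the five per-component reward signals."""
    return sum(REWARD_WEIGHTS[k] * v for k, v in components.items())

r = combine_rewards({"task": 1.0, "schedule": 0.0, "message": 1.0,
                     "efficiency": 0.5, "preferences": -0.5})
assert math.isclose(r, 1.55)
```

Keeping the components separate until the final sum also makes it easy to log each signal individually, which is how per-objective metrics like completion rate and response rate can be tracked alongside the scalar reward.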

---

## 📜 License

MIT License

---

## 🙏 Acknowledgments

- Built for the **OpenEnv** platform
- Inspired by real-world executive assistant workflows
- Visualization powered by **Plotly** and **Gradio**