# Feature Demo: `action-feature` — Core Action Dispatch System

> **Generated:** 2026-02-28T14:46:19+01:00
> **Branch:** `origin/action-feature` vs `main`
> **Project:** sql-env-onboarding (SQLEnv RL Environment for OpenEnv Challenge)
> **Execution environment:** Python 3.11, torch 2.2.2, MockTokenizer (no Ollama required)

---

## What This Feature Does

The `action-feature` branch adds the **core action dispatch system** to SQLEnv — the RL environment where AI agents learn interactive SQL exploration. Before this branch, the environment had data models but no way to actually process agent actions.

Now an agent can send natural language messages like _"describe the students table"_ or _"find all students enrolled in CS101"_, and the environment automatically classifies them into one of three action types (**describe**, **sample**, **query**), dispatches them to the appropriate handler, and returns observations with tokenized conversation history. This is the fundamental interaction loop that makes the environment usable for reinforcement learning.

The system works in two modes:

- **Mock path** (no external services): Uses `MockTokenizer` for tokenization. Describe and sample actions work fully; query actions return a graceful error since Ollama is unavailable.
- **Ollama path** (full pipeline): Uses a local Ollama LLM to select relevant tables for describe/sample and to generate SQL for query actions.

---

## Quickstart

```bash
# 1. Checkout the branch
git checkout origin/action-feature --detach

# 2. Install dependencies (from the sql_env package directory)
cd envs/sql_env/
uv sync
uv add sqlalchemy  # missing from pyproject.toml, needed for ORM models

# 3. Run the full demo (71 checks, ~2 seconds, no external services needed)
uv run python demo_action_feature.py

# 4. (Optional) Return to main when done
git checkout main
```

**Prerequisites:** Python 3.11+, `uv` package manager, git.
**Optional:** Ollama running locally with the `llama3.2` model for full query generation (set `OLLAMA_MODEL=llama3.2`).

> **Note:** `sqlalchemy` is required by the ORM models but was omitted from `pyproject.toml` on the branch. The `uv add sqlalchemy` step is necessary.

> **Note:** On Python < 3.12, a Pydantic compatibility patch is needed because `openenv` defines `Message` with `typing.TypedDict` instead of `typing_extensions.TypedDict`. The demo script applies this patch automatically.

---

## Live Demo — Mock Path (Primary)

All output below was captured by executing `uv run python demo_action_feature.py` on the `action-feature` branch with no Ollama model configured (default `qwen2` not installed).

### 1. Environment Instantiation with MockTokenizer

The environment loads 9 SQLAlchemy ORM models (Address, Person, Student, Course, etc.) and initializes conversation state with a system prompt.

```bash
uv run python demo_action_feature.py
```

```
============================================================
1. Environment Instantiation with MockTokenizer
============================================================
[PASS] MockTokenizer created
[PASS] SQLEnvironment created
[PASS] System prompt stored
[PASS] Tokenizer stored
[PASS] 9 database models loaded
[PASS] Initial state has 1 message (system)
[PASS] Initial state has 1 token tensor
[PASS] System message role is correct
[PASS] System message content matches prompt
```

The environment correctly stores the custom system prompt, attaches the tokenizer, and loads all 9 database table models from SQLAlchemy.
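The `MockTokenizer` path is deliberately trivial: tokens are just character codes. A minimal sketch of that scheme, with hypothetical helper names (the real class lives in `server/test_sql_env.py`):

```python
def mock_encode(text: str) -> list[int]:
    # MockTokenizer-style encoding: each character becomes ord(c) % 256
    return [ord(c) % 256 for c in text]

def mock_decode(tokens: list[int]) -> str:
    # For ASCII input, chr(t) inverts the encoding exactly
    return "".join(chr(t) for t in tokens)

# A system prompt beginning "You are a ..." yields the token prefix
# shown in the reset() output below
print(mock_encode("You are a ")[:10])  # [89, 111, 117, 32, 97, 114, 101, 32, 97, 32]
```

This keeps the demo free of any model download: the tensor shapes and growth behavior are exercised without a real tokenizer.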
### 2. reset() Returns Valid SQLObservation

```
============================================================
2. reset() Returns Valid SQLObservation
============================================================
[PASS] reset() returns SQLObservation
[PASS] Observation has messages list
[PASS] Messages contain system prompt
[PASS] Observation has tokens tensor
[PASS] Tokens tensor is 1D
[PASS] Tokens are non-empty
[PASS] done is False
[PASS] reward is None

Observation details:
  messages count: 1
  tokens shape: torch.Size([29])
  tokens[:10]: [89, 111, 117, 32, 97, 114, 101, 32, 97, 32]
```

After `reset()`, the observation contains one message (the system prompt) tokenized into a 1D tensor via `MockTokenizer` (char codes mod 256). The episode is not done and has no reward — ready for the agent's first action.

### 3. Action Type Detection

The keyword classifier maps user messages to three action types: `describe`, `sample`, and `query`.

```
============================================================
3. Action Type Detection (_detect_action_type)
============================================================
[PASS] 'describe the students table...' -> describe
[PASS] 'what columns does Course have...' -> describe
[PASS] 'show me the schema...' -> describe
[PASS] 'show me sample rows from students...' -> sample
[PASS] 'give me example data...' -> sample
[PASS] 'how many rows are in Courses...' -> sample
[PASS] 'find all students enrolled in CS101...' -> query
[PASS] 'select count(*) from students...' -> query
[PASS] 'what is the average score...' -> query
```

All 9 test cases are correctly classified. Keywords like "describe"/"schema"/"columns" trigger describe; "sample"/"example"/"rows" trigger sample; everything else defaults to query.
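The keyword rules above can be sketched as a small classifier. This is a hypothetical stand-in for `_detect_action_type`; the real keyword list and precedence may differ:

```python
def detect_action_type(message: str) -> str:
    """Classify a user message as describe, sample, or query.

    Sketch of the keyword rules described above: describe keywords are
    checked first, then sample keywords, and anything else is a query.
    """
    text = message.lower()
    if any(kw in text for kw in ("describe", "schema", "columns")):
        return "describe"
    if any(kw in text for kw in ("sample", "example", "rows")):
        return "sample"
    return "query"  # everything else falls through to SQL generation

print(detect_action_type("describe the students table"))          # describe
print(detect_action_type("show me sample rows from students"))    # sample
print(detect_action_type("find all students enrolled in CS101"))  # query
```

Note the check order matters: "show me sample rows" contains both "sample" and "rows", but neither describe keyword, so it correctly lands on sample.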
### 4. message_to_action() Creates Properly Typed SQLAction

```
============================================================
4. message_to_action() Creates Properly Typed SQLAction
============================================================
[PASS] Returns SQLAction
[PASS] action_type is 'describe'
[PASS] action_description is message content
[PASS] tokens is a torch.Tensor
[PASS] tokens tensor is non-empty

Action details:
  action_type: describe
  action_description: describe the students table
  tokens shape: torch.Size([1, 57])

[PASS] message_to_action adds message to history
[PASS] History[1] is the user message
[PASS] Sample message -> action_type 'sample'
[PASS] Query message -> action_type 'query'
```

`message_to_action()` converts a raw message dict into a `SQLAction` with the correct `action_type`, `action_description`, and tokenized tensor. **Important side effect:** it also appends the message to `_state.history_messages` before tokenizing, so the tokenizer sees the full conversation context.

### 5. step() with Describe Action

Without Ollama, the describe action falls back to the first table (Address) and returns its full SQLAlchemy-derived schema.

```
============================================================
5. step() with Describe Action (Schema from SQLAlchemy Models)
============================================================
[PASS] step() returns SQLObservation
[PASS] History now has 3 messages (system + user + assistant)
[PASS] Last message is from assistant
[PASS] Assistant message contains 'columns'
[PASS] Schema info contains column descriptions

Describe response (first 200 chars):
Table 'Address' has the following columns:
- address_id: integer number
- line_1: text (up to 255 characters)
- line_2: text (up to 255 characters)
- city: text (up to 255 characters)
- zip_postcode:

[PASS] Tokens tensor grew after step
```

The schema is extracted directly from SQLAlchemy model introspection (column names, types converted to natural language). The observation now has 3 messages (system → user → assistant) and the token tensor grew.
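The type-to-phrase rendering can be sketched as follows. Both the helper name and the type table here are hypothetical; the real code reads column names and types off the SQLAlchemy `Column` objects rather than taking a plain dict:

```python
def describe_table(table_name: str, columns: dict[str, str]) -> str:
    """Render a column listing in the style of the describe response above.

    Sketch only: `columns` maps column name -> SQL type string; the real
    implementation introspects the SQLAlchemy models directly.
    """
    # Natural-language phrasing matching the demo output
    type_phrases = {
        "INTEGER": "integer number",
        "VARCHAR(255)": "text (up to 255 characters)",
    }
    lines = [f"Table '{table_name}' has the following columns:"]
    for name, sql_type in columns.items():
        lines.append(f"- {name}: {type_phrases.get(sql_type, sql_type.lower())}")
    return "\n".join(lines)

print(describe_table("Address", {"address_id": "INTEGER", "line_1": "VARCHAR(255)"}))
```

Keeping the schema description in natural language is what lets the same text double as context for the Ollama prompt on the query path.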
### 6. step() with Sample Action

The sample action generates a `SELECT * ... LIMIT 10` query for the target table.

```
============================================================
6. step() with Sample Action (Generates SQL Query Text)
============================================================
[PASS] step(sample) returns assistant message
[PASS] Sample response contains SELECT
[PASS] Sample response contains LIMIT

Sample response:
Here's a query to sample data from Address:
SELECT * FROM Address LIMIT 10;
```

### 7. step() with Query Action (No Ollama)

Without Ollama, the query action returns a clear error message instead of crashing.

```
============================================================
7. step() with Query Action (Mock Path — No Ollama)
============================================================
[PASS] step(query) returns assistant message
[PASS] Query response is error string (no Ollama) or SQL

Query response (no Ollama):
Error: Ollama returned status 404
```

The error is a graceful 404 (the Ollama server is running, but the default `qwen2` model isn't installed). The conversation continues normally — the error becomes part of the message history.

### 8. Multi-Turn Conversation State Management

Three turns of alternating user/assistant messages, verifying the conversation history grows correctly.

```
============================================================
8. Multi-Turn Conversation State Management
============================================================
[PASS] After turn 1: 3 messages (sys + user + assistant)
[PASS] After turn 2: 5 messages (sys + u1 + a1 + u2 + a2)
[PASS] Tokens grew between turns
[PASS] After turn 3: 7 messages
[PASS] Tokens grew again
[PASS] Message roles follow expected pattern

Conversation summary after 3 turns:
[0] system: You are a test SQL assistant....
[1] user: describe the Address table...
[2] assistant: Table 'Address' has the following columns: - address_id: in...
[3] user: show me sample rows...
```
```
[4] assistant: Here's a query to sample data from Address: SELECT * FROM A...
[5] user: find all addresses in New York...
[6] assistant: Error: Ollama returned status 404...

Total tokens: 987
```

Message roles follow the expected `[system, user, assistant, user, assistant, user, assistant]` pattern. Token count grows monotonically: the `_create_observation()` method flattens all `history_tokens` into a single 1D tensor via `torch.cat`.

### 9. Client Serialization Roundtrip

The `SQLEnvClient` converts tensor → list for JSON transport and list → tensor on the return path.

```
============================================================
9. Client Serialization Roundtrip (_step_payload)
============================================================
[PASS] Payload has action_type
[PASS] Payload has action_description
[PASS] Tokens converted to list
[PASS] Token values preserved
[PASS] _parse_result returns StepResult
[PASS] Observation messages parsed
[PASS] Tokens converted back to tensor
[PASS] Token values correct
[PASS] Reward parsed

Payload serialization:
  action_type: query
  tokens (list): [[1, 2, 3, 4, 5]]
```

---

## Edge Cases Exercised

### Empty System Prompt

When no system prompt is provided (empty string), the environment correctly starts with zero messages and an empty token tensor.

```
[PASS] Empty system prompt -> no messages in history
[PASS] Empty system prompt -> empty tokens
```

### Invalid Message Inputs

`message_to_action()` validates its input and raises `ValueError` for malformed messages.

```
[PASS] Missing 'role' raises ValueError
[PASS] Missing 'content' raises ValueError
[PASS] None content raises ValueError
```

### Unknown Table Handling

Schema lookup and sample query generation gracefully handle non-existent tables.

```
[PASS] Unknown table returns 'not found' message
[PASS] Unknown table sample returns error
```

### MockTokenizer Encode/Decode Roundtrip

The mock tokenizer's `ord(c) % 256` encoding correctly roundtrips through `chr(t)` decoding.
```
[PASS] MockTokenizer encode/decode roundtrip
```

### Invalid Tokenizer Validation

The environment constructor rejects tokenizers missing `apply_chat_template`.

```
[PASS] Invalid tokenizer raises ValueError
```

---

## Live Demo — Ollama Path (Optional)

When Ollama is running locally with a compatible model, the query action generates real SQL and the describe action selects the correct table.

### Describe with Ollama

With `OLLAMA_MODEL=llama3.2`, the LLM correctly identifies "Student" as the most relevant table for "describe the students table":

```
DESCRIBE RESULT:
Table 'Student' has the following columns:
- student_id: integer number
- student_details: text (up to 255 characters)
```

Compare this with the mock path, which fell back to "Address" (the first table in the dict). **With Ollama, table selection is intelligent.**

### Query with Ollama

The LLM generates valid SQL for natural language questions:

```
QUERY RESULT:
SELECT * FROM Students WHERE CourseID IN (SELECT CourseID FROM Courses WHERE CourseName = 'CS101')
```

> **Note:** The generated SQL references column names from the schema description prompt, not the actual SQLAlchemy model column names. This is expected — the LLM generates SQL based on the natural language schema it receives.
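The Ollama round trip is an HTTP call to the local server's generate endpoint; the 404 in the mock path is exactly what this endpoint returns when the named model isn't pulled. A stdlib-only sketch of building (but not sending) such a request — the environment's actual request shape and fields are an assumption here:

```python
import json
import urllib.request

def build_ollama_request(model: str, prompt: str) -> urllib.request.Request:
    # Ollama's local generate endpoint (default port 11434).
    # Sketch only: the env's real payload may include extra options.
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_ollama_request(
    "llama3.2",
    "Write SQL for: find all students enrolled in CS101",
)
```

Sending the request with `urllib.request.urlopen(req)` against a server that lacks the model raises an `HTTPError` with code 404, which the environment surfaces as the `Error: Ollama returned status 404` message seen above.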
---

## Full Result Summary

```
============================================================
SUMMARY
============================================================
Total checks: 71
Passed: 71
Failed: 0

ALL CHECKS PASSED
```

| Category | Checks | Passed | Failed |
|----------|--------|--------|--------|
| Imports | 1 | 1 | 0 |
| Instantiation | 8 | 8 | 0 |
| reset() | 8 | 8 | 0 |
| Action detection | 9 | 9 | 0 |
| message_to_action | 9 | 9 | 0 |
| step(describe) | 6 | 6 | 0 |
| step(sample) | 3 | 3 | 0 |
| step(query) | 2 | 2 | 0 |
| Multi-turn state | 6 | 6 | 0 |
| Client serialization | 9 | 9 | 0 |
| Edge cases | 9 | 9 | 0 |
| **Total** | **71** | **71** | **0** |

---

## Verification Checklist

- [x] Environment instantiation with MockTokenizer — 8 checks
- [x] `reset()` returns valid SQLObservation with system prompt — 8 checks
- [x] Action type detection for all 3 types (describe/sample/query) — 9 keywords tested
- [x] `message_to_action()` creates SQLAction with correct type and tokens — 9 checks
- [x] `step()` with describe returns schema from SQLAlchemy models — 6 checks
- [x] `step()` with sample returns SQL query text — 3 checks
- [x] `step()` with query returns Ollama error gracefully (mock path) — 2 checks
- [x] Multi-turn conversation state grows correctly — 6 checks
- [x] Client tensor↔list serialization roundtrip — 9 checks
- [x] Edge cases: empty prompt, invalid inputs, unknown tables, tokenizer validation — 9 checks

---

## Known Issues Found

1. **`sqlalchemy` missing from `pyproject.toml`** — The ORM models import `sqlalchemy` but it's not listed as a dependency. Must `uv add sqlalchemy` manually.
2. **Pydantic/TypedDict incompatibility on Python < 3.12** — The `openenv` library defines `Message` with `typing.TypedDict`, but Pydantic 2.x requires `typing_extensions.TypedDict`. The demo script monkey-patches this, but the issue would affect any direct usage.
3. **Ollama default model (`qwen2`) unlikely to be installed** — The default `OLLAMA_MODEL` is `qwen2`, which users probably don't have. The 404 error is graceful but confusing. Consider defaulting to `llama3.2` or documenting the required model.
4. **describe/sample fallback to first table** — When Ollama is unavailable, `_call_ollama_to_select_table()` silently falls back to the first table in the dict (`Address`). This is correct behavior but may confuse users expecting the table from their query.

---

## File Reference

| File | What it does |
|------|-------------|
| `envs/sql_env/demo_action_feature.py` | Executable demo script (71 checks) |
| `envs/sql_env/server/sql_environment.py` | Core `SQLEnvironment` with reset/step/dispatch |
| `envs/sql_env/models.py` | `SQLAction`, `SQLObservation`, `SQLState` Pydantic models |
| `envs/sql_env/client.py` | `SQLEnvClient` with tensor↔list serialization |
| `envs/sql_env/server/test_sql_env.py` | `MockTokenizer` (char ord encoding) |
| `envs/sql_env/data/databases/models.py` | 9 SQLAlchemy ORM models |

---

## How to Reproduce

```bash
git clone
cd sql-env-onboarding
git checkout origin/action-feature --detach
cd envs/sql_env/
uv sync && uv add sqlalchemy
uv run python demo_action_feature.py
# Mock path: 71/71 checks

# Optional: Ollama path
export OLLAMA_MODEL=llama3.2  # or any installed model
uv run python demo_action_feature.py
# Query actions now return real SQL
```

---

*Demo generated 2026-02-28. Re-run `uv run python demo_action_feature.py` on the `action-feature` branch to refresh.*