# Onboarding: action-feature Branch

What the `action-feature` branch adds compared to `main`.
Last updated: 2026-02-28
Focus: branch delta – new components, model changes, data flow, and gaps.
## What This Branch Does
The action-feature branch transforms SQLEnv from a scaffold with well-designed Pydantic models into a partially working environment with real action dispatch (describe/sample/query), Ollama-based SQL generation, a WebSocket client, SQLAlchemy ORM models for the student_assessment database, and Spider question data. It implements the core message → action → step → observation loop that the RL training pipeline will eventually drive.
## Branch Overview
```
┌───────────────────────────────────────────────────────────────────────┐
│ action-feature: New/Changed Components                                │
└───────────────────────────────────────────────────────────────────────┘

 Training Code / Notebook
 ────────────────────────
 ┌───────────────────────┐
 │ test_env.ipynb    NEW │  Interactive walkthrough (5 test cells)
 └───────────┬───────────┘
             │ imports
 ┌───────────▼───────────┐      ┌──────────────────────────┐
 │ client.py         NEW │─────▶│ models.py        CHANGED │
 │  SQLEnvClient         │      │  SQLAction (+ tokens,    │
 │  _step_payload()      │      │    action_desc)          │
 │  _parse_result()      │      │  SQLObservation          │
 │  _parse_state()       │      │    (messages + tokens)   │
 │  message_to_action()  │      │  SQLState                │
 └───────────┬───────────┘      │    (history + tokens)    │
             │ WebSocket        └──────────────────────────┘
 ┌───────────▼───────────┐
 │ server/app.py     CHG │  FastAPI bootstrap + tokenizer factory
 │  create_sql_env()     │
 └───────────┬───────────┘
             │ creates
 ┌───────────▼───────────────────────────────────────────────┐
 │ server/sql_environment.py                             NEW │
 │  SQLEnvironment(Environment)                              │
 │   ├── reset() → clear state, return obs                   │
 │   ├── step(action) → dispatch on action_type              │
 │   │    ├── "describe" → Ollama selects table → ORM info   │
 │   │    ├── "sample"   → Ollama selects table → SQL gen    │
 │   │    └── "query"    → Ollama generates SQL from NL      │
 │   ├── message_to_action() → detect type, tokenize         │
 │   └── _detect_action_type() → keyword classifier          │
 └───────────┬───────────────────────┬───────────────────────┘
             │ introspects           │ HTTP calls
 ┌───────────▼───────────┐  ┌────────▼────────────────┐
 │ data/databases/       │  │ Ollama (external)       │
 │  models.py        NEW │  │  /api/generate          │
 │  9 SQLAlchemy tables  │  │  qwen2 (default)        │
 └───────────────────────┘  └─────────────────────────┘

 ┌───────────────────────┐  ┌─────────────────────────────────┐
 │ data/questions/       │  │ scripts/                    NEW │
 │  student_assessment   │  │  download_spider_data.py        │
 │  .json            NEW │  │  generate_models_from_schema.py │
 │  (30+ Q&A pairs)      │  └─────────────────────────────────┘
 └───────────────────────┘

 ┌───────────────────────┐  ┌────────────────────────┐
 │ server/test_sql_env   │  │ server/install_deps.sh │
 │  .py MockTokenizer    │  │  Docker setup     NEW  │
 │                   NEW │  └────────────────────────┘
 └───────────────────────┘
```
## Files Changed/Added
| File | Status | Purpose |
|---|---|---|
| `envs/sql_env/models.py` | Changed | Rewired `SQLAction`, `SQLObservation`, `SQLState` for the message+token paradigm |
| `envs/sql_env/__init__.py` | Changed | Exports `SQLAction`, `SQLObservation`, `SQLState`; lazy client import |
| `envs/sql_env/client.py` | New | `SQLEnvClient(EnvClient)` – WebSocket client with tensor serialization |
| `envs/sql_env/server/sql_environment.py` | New | `SQLEnvironment(Environment)` – core environment logic (463 lines) |
| `envs/sql_env/server/app.py` | Changed | FastAPI bootstrap with tokenizer factory + `MockTokenizer` fallback |
| `envs/sql_env/server/__init__.py` | Changed | Exports `SQLEnvironment` |
| `envs/sql_env/server/test_sql_env.py` | New | `MockTokenizer` for testing without the transformers library |
| `envs/sql_env/server/install_deps.sh` | New | Docker setup script: pip install + pre-download of the GPT-2 tokenizer |
| `envs/sql_env/server/requirements.txt` | New | Server-side pip deps for Docker (fastapi, torch, transformers, etc.) |
| `envs/sql_env/data/databases/models.py` | New | SQLAlchemy ORM for the student_assessment DB (9 model classes) |
| `envs/sql_env/data/questions/student_assessment.json` | New | 30+ Spider questions with gold SQL and tokenized queries |
| `envs/sql_env/scripts/download_spider_data.py` | New | Downloads Spider questions from HuggingFace by `db_id` |
| `envs/sql_env/scripts/generate_models_from_schema.py` | New | Auto-generates SQLAlchemy models from the Spider schema dataset |
| `envs/sql_env/pyproject.toml` | Changed | Python constrained to `>=3.11,<3.13`; added `requests>=2.31.0` |
| `envs/sql_env/uv.lock` | Changed | Lock file updated for the new dependencies |
| `README.md` | Changed | Added a "Current Package State" section with pinned-dependency rationale |
| `envs/sql_env/server/environment.py` | Emptied | Replaced by `sql_environment.py` |
| `test_env.ipynb` | New | Jupyter notebook with 5 interactive test scenarios |
Total: 18 files changed, +5702 / -412 lines.
## Key Components Introduced
### 1. SQLEnvironment – envs/sql_env/server/sql_environment.py
The heart of the branch. Implements the OpenEnv Environment interface with three action types:
| Action Type | Dispatch Flow | Output |
|---|---|---|
| `describe` | Ollama selects table → `_get_table_schema()` introspects the SQLAlchemy model | Column names + natural-language types |
| `sample` | Ollama selects table → `_generate_sample_query()` | `SELECT * FROM <table> LIMIT 10;` |
| `query` | `_call_ollama_for_sql()` sends NL + schema to Ollama | Generated SQL string |
Key methods:

- `reset()` – clears conversation history and re-initializes the system prompt message + tokens. Returns the initial `SQLObservation`.
- `step(action)` – dispatches on `action.action_type`. Appends the assistant response to `history_messages`, stores action tokens in `history_tokens`. Returns a flattened observation.
- `message_to_action(message)` – server-side conversion of a `Message` dict → `SQLAction`. Detects the action type via keywords, appends the message to state history, tokenizes the full conversation.
- `_detect_action_type(content)` – keyword classifier: "describe"/"schema"/"columns" → `describe`; "sample"/"example"/"rows" → `sample`; default → `query`.
- `_create_observation()` – builds a `SQLObservation` from current state. Flattens all `history_tokens` into a single 1D tensor via `torch.cat`.
- `_get_table_schema(table_name)` – introspects SQLAlchemy model columns and converts types to natural language.
- `_call_ollama_for_sql(query)` / `_call_ollama_to_select_table(request)` – HTTP POST to Ollama's `/api/generate`.
Constructor params: `tokenizer` (must have `apply_chat_template`), optional `system_prompt`, optional `transform`.
Environment variables: `OLLAMA_MODEL` (default: `qwen2`), `OLLAMA_BASE_URL` (default: `http://localhost:11434`).
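Both Ollama helpers reduce to a single non-streaming HTTP POST against Ollama's `/api/generate` endpoint. Here is a minimal sketch of that call under the defaults above; the function name `call_ollama` and the exact request shape are assumptions, not the branch's literal code:

```python
import os

import requests


def call_ollama(prompt: str, timeout: float = 60.0) -> str:
    """Send one non-streaming generation request to Ollama, return the text."""
    model = os.environ.get("OLLAMA_MODEL", "qwen2")
    base_url = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
    resp = requests.post(
        f"{base_url}/api/generate",
        json={"model": model, "prompt": prompt, "stream": False},
        timeout=timeout,
    )
    resp.raise_for_status()
    # Ollama's non-streaming response carries the completion in "response".
    return resp.json()["response"]
```

On the branch, `_call_ollama_to_select_table()` wraps a call like this in a broad `try/except` that falls back to the first table when Ollama is unreachable (see Gotchas).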
### 2. SQLEnvClient – envs/sql_env/client.py
WebSocket client extending OpenEnv's `EnvClient[SQLAction, SQLObservation, SQLState]`. Handles tensor↔list serialization for JSON transport:

- `_step_payload(action)` – converts `action.tokens` (Tensor) to a Python list for JSON.
- `_parse_result(payload)` – deserializes the response → `StepResult[SQLObservation]`, converting token lists back to tensors.
- `_parse_state(payload)` – deserializes state → `SQLState`, reconstructing tensors.
- `message_to_action(message, tokenizer, history_messages)` – client-side version of action creation (mirrors the server logic); requires passing a tokenizer explicitly.
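The serialization boundary is simple to sketch. Below is a minimal round-trip under assumed payload field names; the real `_step_payload`/`_parse_result` also carry step metadata (reward, done, etc.):

```python
import torch


def step_payload(action_type: str, description: str, tokens: torch.Tensor) -> dict:
    """Tensor -> plain list, so the action survives JSON encoding."""
    return {
        "action_type": action_type,
        "action_description": description,
        "tokens": tokens.tolist(),
    }


def parse_tokens(payload: dict) -> torch.Tensor:
    """List -> tensor on the way back from the server."""
    return torch.tensor(payload["tokens"], dtype=torch.long)
```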
### 3. server/app.py – FastAPI Bootstrap
Changed from a stub to a working application:
- `get_tokenizer()` – loads a HuggingFace tokenizer named by the `TOKENIZER_NAME` env var (default: `mistralai/Mistral-7B-Instruct-v0.1`); falls back to `MockTokenizer` from `test_sql_env.py` if `transformers` is not installed.
- `create_sql_environment()` – factory that creates a `SQLEnvironment` per WebSocket session.
- `app = create_app(create_sql_environment, SQLAction, SQLObservation, env_name="sql_env")` – wires up the WebSocket endpoints.
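The fallback logic is worth seeing in one place. A sketch, with a `loader` parameter added purely so the selection logic can be exercised without downloading a model; the real `get_tokenizer()` calls `AutoTokenizer.from_pretrained` directly and imports the real `MockTokenizer` from `test_sql_env.py`:

```python
import os


class MockTokenizer:
    """Placeholder for server/test_sql_env.py's MockTokenizer (fallback path)."""


def get_tokenizer(loader=None):
    """Sketch of app.py's tokenizer factory: HuggingFace first, mock second."""
    name = os.environ.get("TOKENIZER_NAME", "mistralai/Mistral-7B-Instruct-v0.1")
    try:
        if loader is None:
            # Real code path: requires the transformers library.
            from transformers import AutoTokenizer
            loader = AutoTokenizer.from_pretrained
        return loader(name)
    except ImportError:
        # transformers not installed: fall back to the deterministic mock.
        return MockTokenizer()
```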
### 4. SQLAlchemy ORM – envs/sql_env/data/databases/models.py
9 model classes for the student_assessment database:
| Model | Table | Key Columns |
|---|---|---|
| `Address` | Addresses | address_id, line_1, city, country |
| `Person` | People | person_id, first_name, last_name, email_address |
| `Student` | Students | student_id, student_details |
| `Course` | Courses | course_id (String PK), course_name |
| `PersonAddress` | People_Addresses | person_id (FK), address_id (FK), date_from/to |
| `StudentCourseRegistration` | Student_Course_Registrations | student_id (FK), course_id (FK), registration_date |
| `StudentCourseAttendance` | Student_Course_Attendance | student_id (FK), course_id (FK), date_of_attendance |
| `Candidate` | Candidates | candidate_id, candidate_details |
| `CandidateAssessment` | Candidate_Assessments | candidate_id (FK), qualification, assessment_date |
All models include proper foreign key relationships with back_populates.
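As an illustration of the pattern, here is a cut-down sketch of three of the nine models. Column names follow the table above; relationship attribute names like `registrations` are assumptions, not necessarily what the branch uses:

```python
from sqlalchemy import Column, Date, ForeignKey, Integer, String
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()


class Student(Base):
    __tablename__ = "Students"
    student_id = Column(Integer, primary_key=True)
    student_details = Column(String)
    # back_populates wires both directions of the relationship together.
    registrations = relationship("StudentCourseRegistration", back_populates="student")


class Course(Base):
    __tablename__ = "Courses"
    course_id = Column(String, primary_key=True)  # string PK, per the Spider schema
    course_name = Column(String)


class StudentCourseRegistration(Base):
    __tablename__ = "Student_Course_Registrations"
    student_id = Column(Integer, ForeignKey("Students.student_id"), primary_key=True)
    course_id = Column(String, ForeignKey("Courses.course_id"), primary_key=True)
    registration_date = Column(Date)
    student = relationship("Student", back_populates="registrations")
```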
### 5. Spider Question Data – envs/sql_env/data/questions/student_assessment.json
3,355-line JSON file containing 30+ question-answer pairs from the Spider dataset. Each entry includes:
- `db_id` – always `student_assessment`
- `question` – natural-language question (e.g., "which course has most number of registered students?")
- `query` – gold SQL (e.g., `SELECT T1.course_name FROM courses AS T1 JOIN student_course_registrations...`)
- `query_toks` / `query_toks_no_value` / `question_toks` – tokenized versions
### 6. Data Preparation Scripts – envs/sql_env/scripts/
- `download_spider_data.py` – CLI tool that downloads Spider questions from HuggingFace; supports `--db-id` filtering and `--split` selection.
- `generate_models_from_schema.py` – auto-generates SQLAlchemy ORM models from the `richardr1126/spider-schema` HuggingFace dataset; maps Spider types to SQLAlchemy types and handles foreign keys.
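The heart of the download script is just a `db_id` filter over a HuggingFace split. A reconstruction of that core; the dataset id `"spider"` and the function name are assumptions, and the branch's script may differ:

```python
from typing import Iterable


def filter_by_db_id(rows: Iterable[dict], db_id: str) -> list[dict]:
    """Keep only the Spider entries belonging to one database."""
    return [row for row in rows if row.get("db_id") == db_id]


if __name__ == "__main__":
    # Network call; assumes the "spider" dataset id on HuggingFace.
    from datasets import load_dataset

    train = load_dataset("spider", split="train")
    questions = filter_by_db_id(train, "student_assessment")
    print(f"{len(questions)} questions for student_assessment")
```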
### 7. MockTokenizer – envs/sql_env/server/test_sql_env.py
Deterministic tokenizer for testing without transformers:
- `apply_chat_template()` – converts message text to token IDs via `ord(c) % 256`.
- `decode()` – reverses the encoding back to characters.
- Imported by `app.py` as a fallback when `transformers` is not installed.
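Reconstructed from the description above (the method signatures are assumptions based on how `app.py` and the environment use the tokenizer):

```python
class MockTokenizer:
    """Deterministic per-character tokenizer; no transformers dependency.

    Note: ord(c) % 256 is only lossless for characters below code point 256,
    which is fine for the ASCII-only test prompts this is used with.
    """

    def apply_chat_template(self, messages, **kwargs):
        # Join message contents and map each character to a pseudo token id.
        text = "\n".join(m["content"] for m in messages)
        return [ord(c) % 256 for c in text]

    def decode(self, token_ids, **kwargs):
        # Reverse the per-character encoding.
        return "".join(chr(t) for t in token_ids)
```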
## Model Changes (Main → Action-Feature)
### SQLAction
| Field | Main | Action-Feature | Notes |
|---|---|---|---|
| `action_type` | `"DESCRIBE"`, `"SAMPLE"`, `"QUERY"`, `"ANSWER"` | `"describe"`, `"sample"`, `"query"` | Lowercase; `ANSWER` removed |
| `argument` | Table name / SQL / answer value | Removed | – |
| `action_description` | – | Added: description string | Replaces `argument` |
| `tokens` | – | Added: `torch.Tensor` | Tokenized conversation |
### SQLObservation
| Field | Main | Action-Feature | Notes |
|---|---|---|---|
| `question` | NL question string | Commented out | – |
| `schema_info` | DB schema description | Commented out | – |
| `result` | Last action result | Commented out | – |
| `error` | Error message | Commented out | – |
| `step_count` | Current step number | Commented out | – |
| `budget_remaining` | Steps left | Commented out | – |
| `action_history` | Summary of actions | Commented out | – |
| `messages` | – | Added: `list[Message]` | Full conversation history |
| `tokens` | – | Added: `torch.Tensor` | Flattened token tensor |
The original observation fields are commented out, not deleted – they're expected to return in a future phase.
### SQLState
| Field | Main | Action-Feature | Notes |
|---|---|---|---|
| `game_name` | `"sql_env"` | Commented out | – |
| `history_messages` | – | Added: `list[Message]` | Full conversation history |
| `history_tokens` | – | Added: `list[torch.Tensor]` | Per-message token tensors |
| `current_action_type` | – | Added: `str` (default `"query"`) | Tracks current action |
Design shift: The branch moves from a structured observation (question + schema + result fields) to a chat-based observation (raw messages + tokens). This aligns with how LLM-based agents naturally consume conversational context.
## Data Flow
```
User message (dict: {role: "user", content: "Show me the Student schema"})
        │
        ▼
message_to_action(message)                [SQLEnvironment or SQLEnvClient]
  ├── detect action type via keywords
  │     "schema" found → action_type = "describe"
  ├── append message to _state.history_messages   ← MUTATES STATE
  ├── tokenize FULL conversation via tokenizer.apply_chat_template()
  └── return SQLAction(action_type="describe",
                       action_description="Show me the Student schema",
                       tokens=<tensor>)
        │
        ▼
step(action)                              [SQLEnvironment]
  ├── dispatch on action.action_type:
  │     "describe" → _call_ollama_to_select_table("Show me the Student schema")
  │                    → returns "Student"
  │                  → _get_table_schema("Student")
  │                    → introspects SQLAlchemy model columns
  │                    → "Table 'Student' has: student_id: integer, ..."
  ├── create assistant Message with schema info
  ├── append assistant message to _state.history_messages
  ├── append action.tokens to _state.history_tokens
  └── _create_observation()
        ├── flatten all history_tokens via torch.cat → single 1D tensor
        ├── copy history_messages
        ├── apply transform (if configured)
        └── return SQLObservation(messages=[...], tokens=<flat tensor>)
```
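To make the loop concrete, here is a toy, self-contained re-implementation of the same flow with Ollama and the real tokenizer stubbed out. Names mirror the branch but the behavior is heavily simplified; this is not the branch's code:

```python
import torch


class MiniSQLEnv:
    """Toy message -> action -> step -> observation loop."""

    def __init__(self):
        self.history_messages: list[dict] = []
        self.history_tokens: list[torch.Tensor] = []

    def _tokenize(self, text: str) -> torch.Tensor:
        # MockTokenizer-style per-character encoding.
        return torch.tensor([ord(c) % 256 for c in text], dtype=torch.long)

    def message_to_action(self, message: dict) -> dict:
        content = message["content"].lower()
        if any(k in content for k in ("describe", "schema", "columns")):
            action_type = "describe"
        elif any(k in content for k in ("sample", "example", "rows")):
            action_type = "sample"
        else:
            action_type = "query"
        self.history_messages.append(message)  # side effect, as on the branch
        return {
            "action_type": action_type,
            "action_description": message["content"],
            "tokens": self._tokenize(message["content"]),
        }

    def step(self, action: dict) -> dict:
        # Ollama stub: a canned assistant reply instead of a real LLM call.
        reply = {"role": "assistant", "content": f"[{action['action_type']} handled]"}
        self.history_messages.append(reply)
        self.history_tokens.append(action["tokens"])
        flat = torch.cat(self.history_tokens)  # flatten to one 1D tensor
        return {"messages": list(self.history_messages), "tokens": flat}
```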
## External Dependencies Added
| Dependency | Version | Purpose | Integration Point |
|---|---|---|---|
| Ollama (local service) | – | LLM inference for SQL generation + table selection | `sql_environment.py`: `_call_ollama_for_sql()`, `_call_ollama_to_select_table()` |
| `requests` | `>=2.31.0` | HTTP client for the Ollama API | `sql_environment.py` |
| `torch` | `==2.2.2` | Tensor operations for tokenized representations | `models.py`, `client.py`, `sql_environment.py` |
| `transformers` | `<5` | HuggingFace tokenizers (chat-template support) | `app.py:get_tokenizer()` |
| `numpy` | `<2` | Torch dependency constraint | `pyproject.toml` |
| `sqlalchemy` | (transitive) | ORM for database schema introspection | `data/databases/models.py` |
| `datasets` | (scripts only) | HuggingFace `load_dataset` for Spider data download | `scripts/download_spider_data.py`, `scripts/generate_models_from_schema.py` |
Environment variables:
| Variable | Default | Purpose |
|---|---|---|
| `TOKENIZER_NAME` | `mistralai/Mistral-7B-Instruct-v0.1` | HuggingFace tokenizer model |
| `SYSTEM_PROMPT` | Built-in schema description | Custom system prompt override |
| `OLLAMA_MODEL` | `qwen2` | Ollama model for SQL generation |
| `OLLAMA_BASE_URL` | `http://localhost:11434` | Ollama API endpoint |
## Known Gaps (Not Yet Implemented)
| Feature | Status | Notes |
|---|---|---|
| `ANSWER` action type | Not implemented | Designed in the main-branch models but removed from action-feature |
| Real database execution | Not implemented | `step()` generates SQL text via Ollama but never executes it against SQLite |
| Reward computation | Not implemented | `reward.py` is empty; the 3-layer design exists only in the README |
| Answer verification | Not implemented | `verifier.py` is empty |
| Budget tracking | Not implemented | No step-limit enforcement |
| Episode question selection | Not implemented | The environment uses a hardcoded schema; `student_assessment.json` is present but never loaded |
| Dockerfile | Not implemented | File is empty; `install_deps.sh` is ready |
| `openenv.yaml` manifest | Not implemented | Empty file |
| Formal test suite | Not implemented | No `tests/` directory; only `MockTokenizer` and notebook tests |
## Gotchas
- **`message_to_action()` mutates state.** On the server side, `message_to_action()` appends the message to `_state.history_messages` before tokenizing. Calling it has a side effect; it is not a pure function. Call it twice with the same message and you get duplicate entries in history.
- **Client and server `message_to_action` diverge.** The server version (`sql_environment.py:message_to_action`) manages state internally and mutates `_state`. The client version (`client.py:message_to_action`) requires passing `history_messages` explicitly and does not manage state. They have different signatures.
- **The schema description is hardcoded in `sql_environment.py`.** `_build_schema_description()` returns a fixed string whose table/column names don't perfectly match the SQLAlchemy ORM models. For example, the description says `Students (student_id, person_id, student_acc_status)` but the ORM model has `Students (student_id, student_details)`.
- **The Ollama failure mode is silent.** If Ollama is unreachable, `_call_ollama_to_select_table()` catches all exceptions and returns the first table in the dict (`Address`); no error is surfaced to the caller. `_call_ollama_for_sql()` returns an error string, but it is treated as a normal assistant message.
- **Original observation fields are commented out, not deleted.** `SQLObservation` still carries `question`, `schema_info`, `result`, `error`, `step_count`, `budget_remaining`, and `action_history` as comments. They are intended to return in a later phase.
- **`MockTokenizer` is imported by production code.** When `transformers` is missing, `app.py` imports `MockTokenizer` from `test_sql_env.py` at runtime, coupling a test utility to the production bootstrap.
- **`test_env.ipynb` lives at the project root.** It is not inside `tests/` or `envs/`, so it is easy to miss when exploring the codebase.
- **Pydantic + `torch.Tensor`.** `SQLAction`, `SQLObservation`, and `SQLState` use `torch.Tensor` fields with Pydantic. This requires `arbitrary_types_allowed = True` in the Pydantic model config (inherited from the OpenEnv base classes), and standard Pydantic serialization (`.model_dump()`) won't work out of the box with tensors.
## Entry Points for Reading
| What You Want to Understand | Start Here | Then Read |
|---|---|---|
| How actions are processed | `envs/sql_env/server/sql_environment.py:step()` | `_detect_action_type()`, `_call_ollama_for_sql()` |
| How messages become actions | `envs/sql_env/server/sql_environment.py:message_to_action()` | `envs/sql_env/client.py:message_to_action()` |
| Data contracts | `envs/sql_env/models.py` | Compare with `git show main:envs/sql_env/models.py` |
| Server bootstrap | `envs/sql_env/server/app.py` | `get_tokenizer()`, `create_sql_environment()` |
| Database schema | `envs/sql_env/data/databases/models.py` | `envs/sql_env/data/questions/student_assessment.json` |
| Client-side usage | `envs/sql_env/client.py` | `test_env.ipynb` |
| Data preparation | `envs/sql_env/scripts/download_spider_data.py` | `scripts/generate_models_from_schema.py` |
This document covers only the action-feature branch delta. For the overall project design (POMDP architecture, reward layers, episode lifecycle), see README.md.
## Known Issues Discovered

These may have since changed; verify against the latest remote `action-feature` branch before acting on them.

- `sqlalchemy` is missing from `pyproject.toml` on the branch
- Pydantic/TypedDict incompatibility on Python < 3.12 (the demo auto-patches it)
- The hardcoded schema description in `sql_environment.py` doesn't match the ORM models
- Silent Ollama fallback to the first table on connection failure