sql_env/specs/F001-CLARIFICATION_QUESTIONS.md

Clarification Questions: F001 - Core Environment Loop

Generated: 2026-03-24
Research Summary: specs/F001-RESEARCH_SUMMARY.md
Status: Answered


Questions

Question 1 (Dependencies)

  • Question: Research found no .sqlite database files anywhere in the repo; download_spider_data.py downloads only the question JSON, not the databases. The ORM models in data/databases/models.py define the schema, but no data exists. Should we generate the SQLite database from the ORM models and seed it with synthetic data, or download the actual Spider SQLite databases from HuggingFace?
  • Default Assumption: Generate the schema from the ORM models via Base.metadata.create_all() and seed minimal synthetic data (enough for the 53 questions to produce results). This avoids a new download dependency and keeps the repo self-contained.
  • Impact if Wrong: High
  • Answer: Download the actual Spider SQLite databases; synthetic data won't match the gold SQL answers. Synthetic data generation is deferred to a separate future feature for robustness/metamorphic testing.

Question 2 (Scope)

  • Question: Research found that SQLObservation currently carries only messages and tokens, while the v1 spec (Section 2.2) and the commented-out fields in models.py (lines 88-103) define rich fields: question, schema_info, result, error, step_count, budget_remaining, action_history. Should F001 uncomment and populate the rich observation fields, or continue with messages-only?
  • Default Assumption: Uncomment and populate the rich observation fields. This is what the v1 spec defines and what an RL agent needs for a clean state representation. Keep messages and tokens as well for backward compatibility.
  • Impact if Wrong: High
  • Answer: Yes, uncomment and populate the rich observation fields. This matches the v1 spec and is what the reward system needs.

Question 3 (Scope)

  • Question: Research found that SQLAction.action_description currently carries NL text (e.g., "show students table"), but the v1 spec (Section 2.2) defines a separate argument field for structured input (a table name or a SQL string). Should we add an argument field to SQLAction, or repurpose action_description as the structured argument?
  • Default Assumption: Repurpose action_description as the structured argument (table name for DESCRIBE/SAMPLE, SQL string for QUERY, answer value for ANSWER). This avoids breaking the Pydantic model schema and the client serialization. Rename it to argument only if a clean break is acceptable.
  • Impact if Wrong: Medium. Using action_description for structured data is semantically confusing but functionally correct. Choosing wrong means either a confusing API (if we keep the name) or a breaking change to the client and tests (if we rename). The rework is contained either way.

Question 4 (Scope)

  • Question: Research found that message_to_action() and _detect_action_type() implement NL keyword-based action detection (lines 455-545). With structured actions, the agent sends action_type directly. These methods also append messages to history and tokenize, tightly coupling NL parsing with state management. Should we remove/deprecate these methods, or keep them as an alternative input path?
  • Default Assumption: Remove _detect_action_type() entirely. If OpenEnv requires message_to_action(), refactor it into a thin adapter that extracts structured fields from the message without NL keyword detection; if OpenEnv does not require it, remove it too.
  • Impact if Wrong: Low. This is purely internal code hygiene; the structured action path works regardless of whether these methods exist, and it is easily changed in a follow-up.

Categories

  • Scope: What's in/out of the feature boundary
  • Constraints: Technical, performance, or compatibility limits
  • Edge Cases: Unusual inputs or states that need handling
  • Priorities: What to optimize for when trade-offs arise
  • Dependencies: External systems, libraries, or features required

Instructions for Human

  • Answer any questions where the default assumption does not match your intent
  • Leave blank to accept the default assumption
  • Type "skip" to skip all questions and proceed with all defaults