System Behavior: synthetic-testing
Living document. Updated by /archive-spec when features are completed. Last archived: F008 on 2026-03-27
Synthetic Variant Generation
Variant database generation
The system accepts a SQLite database path and gold SQL query, then produces 1-2 variant databases with the same schema but different data. Each variant is stored in data/databases/variants/{db_name}/ and the original database is never modified.
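A minimal sketch of the copy step, showing one way the original can stay untouched: every mutation works on a file copy placed under the variants directory. The function name make_variant_copy, the variant file naming scheme, and the reading of {db_name} as the file name without its extension are assumptions for illustration, not the actual implementation.

```python
import shutil
from pathlib import Path


def make_variant_copy(db_path, index, variants_root="data/databases/variants"):
    """Copy the original database into its variant directory and return the new path."""
    source = Path(db_path)
    # Assumption: {db_name} is the database file name without its extension.
    variant_dir = Path(variants_root) / source.stem
    variant_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the variant file naming scheme below is hypothetical.
    target = variant_dir / f"{source.stem}_variant_{index}{source.suffix}"
    # Only the copy is ever opened for writing; the source file is just read.
    shutil.copy2(source, target)
    return target
```

Mutations would then open the returned path, so the file at db_path is only ever read.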
Irrelevant row injection mutation
The system accepts a database copy and inserts rows with new primary key values that fall outside the gold SQL filter scope. The mutation produces rows that should not change the gold SQL result when the query is semantically correct.
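The injection can be sketched as follows, assuming a single integer primary key column. The helper inject_irrelevant_row, the employees table, and the gold query are all hypothetical; how the real system picks values outside the filter scope is not described here, so the caller supplies them.

```python
import sqlite3


def inject_irrelevant_row(conn, table, pk_col, values):
    """Insert one row under a fresh primary key and return that key.

    `values` maps the non-key columns to values chosen by the caller
    to fall outside the gold SQL filter.
    """
    next_id = conn.execute(
        f"SELECT COALESCE(MAX({pk_col}), 0) + 1 FROM {table}"
    ).fetchone()[0]
    columns = [pk_col, *values.keys()]
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})",
        [next_id, *values.values()],
    )
    return next_id


# Hypothetical example data and gold query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [(1, "Ana", "Sales"), (2, "Ben", "Sales"), (3, "Cy", "Ops")],
)
gold_sql = "SELECT COUNT(*) FROM employees WHERE dept = 'Sales'"
loose_sql = "SELECT COUNT(*) FROM employees WHERE dept != 'Ops'"

gold_before = conn.execute(gold_sql).fetchall()
loose_before = conn.execute(loose_sql).fetchall()
new_id = inject_irrelevant_row(conn, "employees", "id", {"name": "Dee", "dept": "Legal"})
gold_after = conn.execute(gold_sql).fetchall()
loose_after = conn.execute(loose_sql).fetchall()
```

In this example the gold query returns the same count before and after the injection, while the looser filter (dept != 'Ops') picks up the new row and its count changes.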
ID remapping mutation
The system accepts a database copy and applies a bijective mapping to all integer primary keys, updating all referencing foreign keys to preserve relational integrity. Queries that hard-code specific ID values will return incorrect results on the remapped variant.
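One way to apply such a mapping in place is a two-phase update: park every ID at its negative, then move each parked value to its new ID, so new values never collide with old ones. The sketch below assumes all IDs are positive, the mapping covers every ID, and foreign key enforcement is off during the rewrite. The names random_bijection and remap_ids and the authors/books schema are hypothetical.

```python
import random
import sqlite3


def random_bijection(ids, seed=0):
    """Return a seeded permutation of `ids` as an old-to-new mapping."""
    ids = list(ids)
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)
    return dict(zip(ids, shuffled))


def remap_ids(conn, table, pk_col, references, mapping):
    """Apply `mapping` to table.pk_col and to each (child_table, fk_col) in `references`."""
    for tbl, col in [(table, pk_col), *references]:
        # Phase 1: park every value at its negative so new IDs cannot collide with old ones.
        conn.execute(f"UPDATE {tbl} SET {col} = -{col} WHERE {col} IS NOT NULL")
        # Phase 2: move each parked value to its mapped ID.
        conn.executemany(
            f"UPDATE {tbl} SET {col} = ? WHERE {col} = ?",
            [(new, -old) for old, new in mapping.items()],
        )


# Hypothetical example schema and data.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = OFF")  # assumption: enforcement is off while rewriting
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT, "
    "author_id INTEGER REFERENCES authors(id))"
)
conn.executemany(
    "INSERT INTO authors VALUES (?, ?)",
    [(1, "Le Guin"), (2, "Borges"), (3, "Calvino")],
)
conn.executemany(
    "INSERT INTO books VALUES (?, ?, ?)",
    [(1, "The Dispossessed", 1), (2, "Ficciones", 2), (3, "Invisible Cities", 3)],
)

join_sql = (
    "SELECT b.title FROM books b JOIN authors a ON a.id = b.author_id "
    "WHERE a.name = 'Le Guin'"
)
hard_coded_sql = "SELECT title FROM books WHERE author_id = 1"

join_before = conn.execute(join_sql).fetchall()
hard_before = conn.execute(hard_coded_sql).fetchall()
remap_ids(conn, "authors", "id", [("books", "author_id")], {1: 3, 2: 1, 3: 2})
join_after = conn.execute(join_sql).fetchall()
hard_after = conn.execute(hard_coded_sql).fetchall()
```

Here the join on the author name returns the same title before and after the remap, while the query that hard-codes author_id = 1 now returns a different author's book.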
Bridge row duplication mutation
The system accepts a database copy and identifies bridge tables (tables with 2+ foreign key columns), then duplicates their rows. Queries missing DISTINCT will return inflated counts on the variant.
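Bridge table detection can be sketched with SQLite's PRAGMA foreign_key_list, which reports one row per declared foreign key column. The sketch assumes foreign keys are declared in the schema and that the bridge table has no primary key or unique constraint that would reject duplicate rows. The function names and the enrollment schema are hypothetical.

```python
import sqlite3


def find_bridge_tables(conn):
    """Return the names of tables that declare two or more foreign key columns."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'"
        )
    ]
    bridges = []
    for name in tables:
        fk_rows = conn.execute(f'PRAGMA foreign_key_list("{name}")').fetchall()
        fk_columns = {row[3] for row in fk_rows}  # index 3 is the "from" column
        if len(fk_columns) >= 2:
            bridges.append(name)
    return bridges


def duplicate_rows(conn, table):
    """Insert a second copy of every row (assumes no uniqueness constraint on the table)."""
    conn.execute(f'INSERT INTO "{table}" SELECT * FROM "{table}"')


# Hypothetical example schema and data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE courses (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute(
    "CREATE TABLE enrollments ("
    "student_id INTEGER REFERENCES students(id), "
    "course_id INTEGER REFERENCES courses(id))"
)
conn.executemany("INSERT INTO students VALUES (?, ?)", [(1, "Ana"), (2, "Ben")])
conn.executemany("INSERT INTO courses VALUES (?, ?)", [(1, "Algebra"), (2, "Logic")])
conn.executemany("INSERT INTO enrollments VALUES (?, ?)", [(1, 1), (2, 1), (1, 2)])

naive_sql = "SELECT COUNT(student_id) FROM enrollments WHERE course_id = 1"
distinct_sql = "SELECT COUNT(DISTINCT student_id) FROM enrollments WHERE course_id = 1"

bridges = find_bridge_tables(conn)
naive_before = conn.execute(naive_sql).fetchone()[0]
distinct_before = conn.execute(distinct_sql).fetchone()[0]
for table in bridges:
    duplicate_rows(conn, table)
naive_after = conn.execute(naive_sql).fetchone()[0]
distinct_after = conn.execute(distinct_sql).fetchone()[0]
```

In this example the count without DISTINCT doubles after duplication, while the COUNT(DISTINCT ...) query is unaffected.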
Gold SQL validation on variants
The system executes the gold SQL query on each generated variant and rejects any variant where the query returns an empty result set. Only variants producing valid, non-empty results are retained.
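The retention rule can be sketched as a filter over variant paths. The function names are hypothetical, and treating a SQL error as a rejection is an assumption; the behavior above only specifies rejection on an empty result set.

```python
import sqlite3


def gold_sql_returns_rows(variant_path, gold_sql):
    """Return True when the gold SQL yields at least one row on the variant."""
    conn = sqlite3.connect(str(variant_path))
    try:
        rows = conn.execute(gold_sql).fetchall()
    except sqlite3.Error:
        # Assumption: a variant on which the gold SQL fails is also rejected.
        return False
    finally:
        conn.close()
    return len(rows) > 0


def filter_valid_variants(variant_paths, gold_sql):
    """Keep only the variants on which the gold SQL result is non-empty."""
    return [path for path in variant_paths if gold_sql_returns_rows(path, gold_sql)]
```

Under this rule an aggregate such as SELECT COUNT(*) always returns one row, so a variant passes even when the count is 0.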
Synthetic generation CLI
The system accepts the command python -m server.synthetic --db-path <path> --gold-sql <sql>, produces variant databases, and prints a summary to stdout. It exits with code 0 if at least one valid variant is produced and with code 1 otherwise.
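A minimal sketch of how such an entry point could map results to exit codes with argparse. The two flags come from the command above; the main signature, the injected generate_variants callable, and the summary wording are assumptions for illustration and do not reflect the actual server.synthetic module.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="python -m server.synthetic")
    parser.add_argument("--db-path", required=True, help="Path to the source SQLite database")
    parser.add_argument("--gold-sql", required=True, help="Gold SQL query to validate variants with")
    return parser


def main(argv, generate_variants):
    """Run generation and return the process exit code.

    `generate_variants` is a hypothetical callable taking (db_path, gold_sql)
    and returning the paths of the valid variants it produced.
    """
    args = build_parser().parse_args(argv)
    variants = generate_variants(args.db_path, args.gold_sql)
    # Assumption: the summary format is illustrative only.
    print(f"Generated {len(variants)} valid variant(s) for {args.db_path}")
    for path in variants:
        print(f"  {path}")
    return 0 if variants else 1
```

A real module would typically pass sys.argv[1:] to main and hand its return value to sys.exit.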