openai/gpt-oss-20b Fine-tuned for NL2SQL++ v8

This model is a fine-tuned version of openai/gpt-oss-20b on the NL2SQL++ v8 dataset with code-with-thought reasoning.

Model Details

  • Base Model: openai/gpt-oss-20b
  • Task: Text-to-SQL generation
  • Dataset: NL2SQL++ v8 with code-with-thought reasoning
  • Fine-tuning Method: LoRA (Low-Rank Adaptation) with Unsloth
  • Quantization: None (LoRA adapter merged into 16-bit weights)
  • Maximum Sequence Length: 16384 tokens
  • Training Dataset Size: 56212 examples
  • Validation Dataset Size: 1000 examples

Training Configuration

LoRA Parameters

  • LoRA Rank (r): 64
  • LoRA Alpha: 128
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
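
To make the LoRA settings above concrete, here is a minimal sketch that computes the scaling factor standard LoRA applies to the low-rank update (`lora_alpha / r`) and the number of trainable parameters added per adapted linear layer. The dictionary and function names are hypothetical, and the 1024 x 1024 layer shape is purely illustrative, not a dimension of gpt-oss-20b. The sketch also assumes plain LoRA scaling (not rank-stabilized LoRA), which the card does not state.

```python
# Hypothetical plain-dict mirror of the LoRA settings listed in this card.
lora_config = {
    "r": 64,
    "lora_alpha": 128,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}


def lora_scaling(cfg):
    """Scaling applied to the low-rank update B @ A in standard LoRA."""
    return cfg["lora_alpha"] / cfg["r"]


def lora_param_count(d_in, d_out, r):
    """Trainable parameters LoRA adds to one d_in x d_out linear layer.

    A has shape (r, d_in) and B has shape (d_out, r).
    """
    return r * (d_in + d_out)


scaling = lora_scaling(lora_config)
# Illustrative layer shape only; not taken from the gpt-oss-20b config.
example_params = lora_param_count(1024, 1024, lora_config["r"])
print(scaling, example_params)
```

With rank 64 and alpha 128, the adapter update is scaled by 2 under this assumption.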

Training Hyperparameters

  • Learning Rate: 0.0002
  • Training Epochs: 2
  • Max Steps: N/A (using epochs)
  • Train Batch Size: 64
  • Eval Batch Size: 10
  • Gradient Accumulation Steps: 2
  • Effective Batch Size: 128
  • Warmup Steps: 0
  • Warmup Ratio: 0.1
  • Optimizer: AdamW (torch)
  • Learning Rate Scheduler: Cosine
  • Weight Decay: 0.01
  • Max Gradient Norm: 1.0
  • Seed: 3407
  • Instruction Part: "<|start|>user<|message|>"
  • Response Part: "<|start|>assistant<|channel|>final<|message|>"
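
The Instruction Part and Response Part markers above look like the inputs to a response-only loss mask, for example the `instruction_part` and `response_part` arguments of Unsloth's `train_on_responses_only` helper; that mapping is an assumption, since the card does not show the training script. The sketch below illustrates the idea at the string level with a hypothetical helper: everything up to and including the last Response Part marker is treated as masked, and only the text after it is supervised. Real implementations operate on token IDs and set masked labels to -100.

```python
INSTRUCTION_PART = "<|start|>user<|message|>"
RESPONSE_PART = "<|start|>assistant<|channel|>final<|message|>"


def split_supervised_span(text):
    """Return (masked_prefix, supervised_suffix) for one formatted example.

    Simplified, character-level illustration (hypothetical helper):
    the loss would only be computed on the suffix.
    """
    idx = text.rfind(RESPONSE_PART)
    if idx == -1:
        # No response marker found: nothing would be trained on.
        return text, ""
    cut = idx + len(RESPONSE_PART)
    return text[:cut], text[cut:]


# Toy example in the same layout as the dataset examples in this card.
example = (
    "<|start|>system<|message|>Reasoning: low<|end|>"
    + INSTRUCTION_PART
    + "Natural Language Query\nList one row.\n\nSQL++ Query:\n<|end|>"
    + "<|start|>assistant<|channel|>analysis<|message|>"
    + "I'll start generating SQL++ statement now.<|end|>"
    + RESPONSE_PART
    + "SELECT 1;<|end|><|return|>"
)

masked, supervised = split_supervised_span(example)
print(supervised)
```

Under this reading, the short analysis-channel message sits before the final-channel marker, so it would be masked along with the prompt.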

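The numbers under Training Hyperparameters imply a step count and warmup length that the card does not spell out. The sketch below works them out under stated assumptions: a single device (consistent with 64 x 2 = 128), a partial final batch rounded up to a full optimizer step, Warmup Ratio taking effect because Warmup Steps is 0, and a linear-warmup-then-cosine-to-zero schedule. Exact rounding differs between trainer versions, so treat the results as approximate.

```python
import math

train_examples = 56212
train_batch_size = 64
grad_accum_steps = 2
epochs = 2
learning_rate = 2e-4
warmup_ratio = 0.1
num_devices = 1  # assumption: the card does not state the device count

effective_batch = train_batch_size * grad_accum_steps * num_devices
# Assumption: a partial final batch still counts as one optimizer step.
steps_per_epoch = math.ceil(train_examples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)


def lr_at(step):
    """Linear warmup followed by cosine decay to zero (illustrative)."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print(effective_batch, steps_per_epoch, total_steps, warmup_steps)
print(lr_at(0), lr_at(warmup_steps), lr_at(total_steps))
```

Under these assumptions the run has roughly 880 optimizer steps, with the learning rate peaking at 0.0002 after about 88 of them.
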
Train Dataset Example

<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2026-01-05

Reasoning: low

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>user<|message|>
You are an expert in SQL++ query generation. You will be given a document schema and a natural language query. You need to generate a valid SQL++ query equivalent to the natural language query.

Bucket Name: `travel-sample`
Scope Name: inventory

Use the given document schema to generate the SQL++ query.

Document Schema:
{'route': {'properties': {'airline': {'samples': ['AM', 'B6'], 'type': 'string'}, 'airlineid': {'samples': ['airline_2009', 'airline_2638'], 'type': 'string'}, 'destinationairport': {'samples': ['ATL', 'IDA'], 'type': 'string'}, 'distance': {'samples': [303.2327637772, 1107.5683839502], 'type': 'number'}, 'equipment': {'samples': ['320', '738'], 'type': 'string'}, 'id': {'samples': [11138, 13958], 'type': 'number'}, 'schedule': {'items': {'properties': {'day': {'type': 'number'}, 'flight': {'type': 'string'}, 'utc': {'type': 'string'}}, 'type': 'object'}, 'samples': [[{'day': 0, 'flight': 'AM701', 'utc': '21:46:00'}], [{'day': 0, 'flight': 'B6285', 'utc': '16:12:00'}]], 'type': 'array'}, 'sourceairport': {'samples': ['BOS', 'IAH'], 'type': 'string'}, 'stops': {'samples': [0], 'type': 'number'}, 'type': {'samples': ['route'], 'type': 'string'}, '~meta': {'properties': {'id': {'samples': ['route_11138', 'route_13958'], 'type': 'string'}}, 'samples': [{'id': 'route_11138'}, {'id': 'route_13958'}], 'type': 'object'}}, 'type': 'object'}}

Natural Language Query
Which routes in the route collection rank first by the shortest distance within each destination airport when limited to seven results?

SQL++ Query:
<|end|><|start|>assistant<|channel|>analysis<|message|>I'll start generating SQL++ statement now.<|end|><|start|>assistant<|channel|>final<|message|>
```sql++
SELECT d.id, d.destinationairport, ROW_NUMBER() OVER (PARTITION BY d.destinationairport ORDER BY d.distance NULLS FIRST) AS `row` FROM route AS d LIMIT 7;
```
<|end|><|return|>


Val Dataset Example

<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2026-01-05

Reasoning: low

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>user<|message|>

You are an expert in SQL++ query generation. You will be given a document schema and a natural language query. You need to generate a valid SQL++ query equivalent to the natural language query.

Bucket Name: `travel-sample`
Scope Name: inventory

Use the given document schema to generate the SQL++ query.

Document Schema:
{'route': {'properties': {'airline': {'samples': ['AS', 'BA'], 'type': 'string'}, 'airlineid': {'samples': ['airline_1355', 'airline_1756'], 'type': 'string'}, 'destinationairport': {'samples': ['ATL', 'JFK'], 'type': 'string'}, 'distance': {'samples': [448.3541058305, 466.0724892866], 'type': 'number'}, 'equipment': {'samples': ['73H 73J', '744'], 'type': 'string'}, 'id': {'samples': [11761, 14501], 'type': 'number'}, 'schedule': {'items': {'properties': {'day': {'type': 'number'}, 'flight': {'type': 'string'}, 'utc': {'type': 'string'}}, 'type': 'object'}, 'samples': [[{'day': 0, 'flight': 'AS136', 'utc': '05:15:00'}], [{'day': 0, 'flight': 'BA803', 'utc': '22:55:00'}]], 'type': 'array'}, 'sourceairport': {'samples': ['DUS', 'GCM'], 'type': 'string'}, 'stops': {'samples': [0], 'type': 'number'}, 'type': {'samples': ['route'], 'type': 'string'}, '~meta': {'properties': {'id': {'samples': ['route_11761', 'route_14501'], 'type': 'string'}}, 'samples': [{'id': 'route_11761'}, {'id': 'route_14501'}], 'type': 'object'}}, 'type': 'object'}}

Natural Language Query
Retrieve the route identifier and destination airport fields from the route collection, limited to seven documents.

SQL++ Query:
<|end|><|start|>assistant<|channel|>final<|message|>

SELECT d.id, d.destinationairport FROM route AS d LIMIT 7;

<|end|><|return|>
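
Both dataset examples share the same user-turn layout: a fixed instruction, the bucket and scope names, the document schema, the natural language query, and a trailing `SQL++ Query:` cue. A minimal sketch of assembling that user message for a new question follows. The template constant and function name are hypothetical, and the exact whitespace is an assumption reconstructed from the examples in this card rather than from the training code.

```python
# Hypothetical template reconstructed from the examples in this card.
USER_TEMPLATE = (
    "You are an expert in SQL++ query generation. You will be given a "
    "document schema and a natural language query. You need to generate "
    "a valid SQL++ query equivalent to the natural language query.\n\n"
    "Bucket Name: `{bucket}`\n"
    "Scope Name: {scope}\n\n"
    "Use the given document schema to generate the SQL++ query.\n\n"
    "Document Schema:\n{schema}\n\n"
    "Natural Language Query\n{question}\n\n"
    "SQL++ Query:\n"
)


def build_user_prompt(bucket, scope, schema, question):
    """Fill the user-turn template; a dict schema is rendered via str()."""
    return USER_TEMPLATE.format(
        bucket=bucket, scope=scope, schema=schema, question=question
    )


# Tiny illustrative schema, not taken from the dataset.
prompt = build_user_prompt(
    bucket="travel-sample",
    scope="inventory",
    schema={"route": {"type": "object"}},
    question="Return seven route ids.",
)
print(prompt)
```

The resulting string would go into the user message of the chat template; the system message and channel markers shown in the examples are added by the tokenizer's chat formatting, not by this helper.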


Model size: 21B params (Safetensors, tensor type BF16)

Model repository: jastorj/openai_gpt_oss_20b-nl2sqlpp-16bit-v4.0-cw-16K