# matsuo-llm-advanced-phase-f2a
Fine-tuned from Qwen/Qwen2.5-7B-Instruct for agent tasks.
## Training Configuration
- LoRA: r=8, alpha=16 (identical to Phase D)
- lr: 1e-5, epochs: 0.3, batch: 4x4=16
- Data: 3,500 samples total: Spider/BIRD 2,100 (60%) + Distilled 72B 500 (14%) + DBBench v4 700 (20%) + ALFWorld 200 (6%)
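The configuration above can be sketched with the `peft` and `transformers` APIs. This is a hedged sketch only: the target modules, output path, and trainer wiring are assumptions, not the authors' actual training script.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA r=8, alpha=16 as stated above; the target modules are an
# assumption (a common choice for Qwen2.5 attention projections).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# lr 1e-5, 0.3 epochs; "batch: 4x4=16" is read here as per-device
# batch 4 with 4 gradient-accumulation steps (interpretation assumed).
training_args = TrainingArguments(
    output_dir="out",
    learning_rate=1e-5,
    num_train_epochs=0.3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
)
```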
## Data Composition
- Spider/BIRD: 2100 samples (60%) - reduced from Phase D to accommodate distilled data
- Distilled 72B: 500 samples (14%) - NEW: multi-turn DB conversations generated by Qwen2.5-72B-Instruct-AWQ
- Generated from Spider/BIRD table schemas (not from AgentBench data)
- Weak category emphasis (AGG-MAX, COUNT, INSERT, AGG-SUM)
- Filtered by regex/sqlparse only (no LLM quality filtering)
- DBBench v4: 700 samples (20%) - same as Phase D
- ALFWorld v5: 200 samples (6%) - reduced from Phase D
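The non-LLM filtering of the distilled samples can be illustrated with a minimal regex check. This is a sketch under stated assumptions: the actual regexes (and the sqlparse checks mentioned above) are not published with this card, so the criteria below are illustrative only.

```python
import re

# Assumed filtering criteria: the SQL must start with a whitelisted
# verb and have balanced parentheses. Purely mechanical, no LLM.
SQL_RE = re.compile(r"^\s*(SELECT|INSERT|UPDATE|DELETE)\b", re.IGNORECASE)

def keep_sample(sql: str) -> bool:
    """Keep a distilled sample only if it passes the shape checks."""
    if not SQL_RE.match(sql):
        return False
    return sql.count("(") == sql.count(")")

samples = [
    "SELECT MAX(age) FROM people",
    "DROP TABLE people",           # rejected: verb not whitelisted
    "SELECT COUNT(* FROM people",  # rejected: unbalanced parentheses
]
kept = [s for s in samples if keep_sample(s)]
# kept == ["SELECT MAX(age) FROM people"]
```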
## Datasets
- u-10bei/dbbench_sft_dataset_react_v4: Listed in the organizer-shared Phase B dataset list. Used as provided (no modification). Third-party synthetic SFT for DBBench format alignment; all tables, data, and queries are independently generated (per dataset description: "to avoid test data leakage").
- xlangai/spider: CC BY-SA 4.0 (Yale/Columbia Spider project)
- birdsql/bird_mini_dev: CC BY-SA 4.0 (HKU)
- Official Phase B ALFWorld v5 dataset: Organizer-provided, used as provided.
- Synthetic multi-turn data generated by Qwen2.5-72B-Instruct-AWQ (whitelisted model)
## Compliance
- Evaluation data was not used in training, and no analysis of evaluation test data was conducted.
- LLM was not used for data quality filtering or selection.
- Distilled data generated from whitelisted model (Qwen2.5-72B-Instruct-AWQ) using Spider/BIRD schemas only.
- Filtering is regex/sqlparse only (non-LLM).
- Inference code not modified.
## Usage
Compatible with vLLM v0.13.0+.
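Inference can be run with vLLM's offline Python API. A minimal sketch, assuming the model is published under its card name on the Hub (the model id and prompt below are placeholders, not a verified endpoint):

```python
from vllm import LLM, SamplingParams

# Load the fine-tuned model (Hub id assumed from the card title).
llm = LLM(model="matsuo-llm-advanced-phase-f2a")

# Greedy decoding for deterministic agent-style outputs.
params = SamplingParams(temperature=0.0, max_tokens=256)

outputs = llm.generate(["List all tables in the database."], params)
print(outputs[0].outputs[0].text)
```

For serving, the same model can be exposed via vLLM's OpenAI-compatible server (`vllm serve <model>`).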