# exp-0216-005-db-balanced-qwen2.5-7b

Fine-tuned from Yano/exp-0212-001-alfworld-qwen2.5-7b (the 001 ALFWorld SFT model) using QLoRA (4-bit, Unsloth).
## Purpose

DB Bench training with balanced data (v1-v4 mixed, INSERT/UPDATE downsampled). Addresses the SELECT degradation seen in exp-004 (76.5% -> 41.0%), which was caused by INSERT/UPDATE data imbalance.
## Data Strategy
- Combined all DB Bench v1-v4 (3,060 total)
- Downsampled INSERT to 200, UPDATE to 150
- Kept all SELECT/query-type samples
- Final dataset: ~1,470 samples (INSERT+UPDATE ~24.5%)
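The balancing step above (cap INSERT at 200 and UPDATE at 150, keep all SELECT samples) can be sketched as follows. This is a minimal illustration, not the actual preprocessing script; the sample counts in the usage example are hypothetical placeholders, not the real v1-v4 split.

```python
import random


def downsample_by_type(samples, caps, seed=42):
    """Downsample any sample type listed in `caps` to at most caps[type];
    keep all other types (e.g. SELECT) untouched."""
    rng = random.Random(seed)
    by_type = {}
    for s in samples:
        by_type.setdefault(s["type"], []).append(s)
    out = []
    for t, group in by_type.items():
        if t in caps and len(group) > caps[t]:
            out.extend(rng.sample(group, caps[t]))
        else:
            out.extend(group)
    return out


# Hypothetical per-type counts for illustration only.
dataset = (
    [{"type": "INSERT", "id": i} for i in range(900)]
    + [{"type": "UPDATE", "id": i} for i in range(700)]
    + [{"type": "SELECT", "id": i} for i in range(1120)]
)
balanced = downsample_by_type(dataset, {"INSERT": 200, "UPDATE": 150})
```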
## Training Configuration
- Base model: Yano/exp-0212-001-alfworld-qwen2.5-7b
- Method: QLoRA (4-bit), merged to 16-bit
- Max sequence length: 2048
- Epochs: 3
- Learning rate: 2e-05
- LoRA: r=64, alpha=128
- Batch size: 2 (grad accum 16)
- Warmup ratio: 0.1
- Collator: AllAssistantTurnsCollator (all turns supervised)
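The hyperparameters above map onto an Unsloth/TRL setup roughly as follows. This is a hedged sketch of a plausible training script, not the card's actual code: `AllAssistantTurnsCollator` is the author's custom collator and is not reproduced here, the dataset loading is a placeholder, and output paths are made up for illustration.

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the 001 ALFWorld SFT checkpoint in 4-bit for QLoRA.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Yano/exp-0212-001-alfworld-qwen2.5-7b",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters with the card's rank/alpha settings.
model = FastLanguageModel.get_peft_model(model, r=64, lora_alpha=128)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=balanced_dataset,  # placeholder: the ~1,470-sample balanced set
    # data_collator=AllAssistantTurnsCollator(...),  # custom collator, not shown
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=16,
        num_train_epochs=3,
        learning_rate=2e-5,
        warmup_ratio=0.1,
        output_dir="outputs",  # hypothetical path
    ),
)
trainer.train()

# Merge the LoRA adapters back into 16-bit weights, as the card describes.
model.save_pretrained_merged("merged-16bit", tokenizer, save_method="merged_16bit")
```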