matsuo-llm-advanced-phase-f1a

Fine-tuned from Qwen/Qwen2.5-7B-Instruct for agent tasks.

Training Configuration

  • LoRA: r=8, alpha=16 (identical to Phase D)
  • lr: 1e-5, epochs: 0.3, batch: 4 × 4 = 16 (per-device batch × gradient accumulation)
  • Data: Spider/BIRD 70% + DBBench v4 20% (weighted toward weak categories) + ALFWorld 10%; 3,500 samples total
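The configuration above can be sketched with the peft/trl APIs. This is a minimal reconstruction of the stated hyperparameters only; the output directory, task type, and any arguments not listed in this card (e.g. target modules) are assumptions, not the actual training code.

```python
from peft import LoraConfig
from trl import SFTConfig

# LoRA adapter settings as stated in this card (r=8, alpha=16).
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    task_type="CAUSAL_LM",  # assumption: causal LM fine-tuning
)

# Optimizer/schedule settings as stated in this card.
sft_config = SFTConfig(
    output_dir="out",                   # placeholder
    learning_rate=1e-5,
    num_train_epochs=0.3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,      # effective batch size: 4 * 4 = 16
)
```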

Data Changes from Phase D

  • Spider/BIRD: identical to Phase D (2450 random samples)
  • ALFWorld v5: identical to Phase D (350 random samples)
  • DBBench v4: resampled 700 from 1200 with weak category emphasis
    • Prioritized: INSERT, COUNT, AGG-MAX, AGG-SUM (weak in E6-a evaluation)
    • Category selection is rule-based (keyword matching), not LLM-based
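The rule-based resampling described above can be sketched in a few lines. Function names, the category set, and the oversampling ratio are illustrative assumptions; only the keyword-matching approach and the four weak categories come from this card.

```python
# Hypothetical sketch of the rule-based category tagger described above.
# Names and the 3.0 oversampling ratio are assumptions, not the actual code.
def classify_sql(query: str) -> str:
    """Assign a coarse category to a SQL query via keyword matching."""
    q = query.upper()
    # Statement type first, then aggregate functions.
    if "INSERT" in q:
        return "INSERT"
    if "COUNT(" in q:
        return "COUNT"
    if "MAX(" in q:
        return "AGG-MAX"
    if "SUM(" in q:
        return "AGG-SUM"
    return "OTHER"

# Categories found weak in the E6-a evaluation are drawn more often
# when resampling the 700 DBBench examples.
WEAK = {"INSERT", "COUNT", "AGG-MAX", "AGG-SUM"}

def sample_weight(query: str) -> float:
    return 3.0 if classify_sql(query) in WEAK else 1.0
```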

Datasets

  • u-10bei/dbbench_sft_dataset_react_v4 - Listed in the organizer-shared Phase B dataset list; used as provided (no modification). Third-party synthetic SFT data for DBBench format alignment; all tables, data, and queries are independently generated (per the dataset description: "to avoid test data leakage").
  • xlangai/spider - CC BY-SA 4.0 (Yale/Columbia Spider project)
  • birdsql/bird_mini_dev - CC BY-SA 4.0 (HKU)
  • Official Phase B ALFWorld v5 dataset - Organizer-provided; used as provided.

Compliance

  • No evaluation data was used in training, and no analysis of the evaluation test data was conducted.
  • LLM was not used for data quality filtering or selection.
  • Category classification is rule-based (SQL keyword matching: INSERT, COUNT, MAX, SUM, etc.)
  • Inference code not modified.

Usage

Compatible with vLLM v0.13.0+.
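As a usage sketch, the model can be served through vLLM's OpenAI-compatible server. The commands below are illustrative (local GPU assumed; flags and port are defaults, not requirements of this model):

```shell
# Serve the model (repo ID from this card); BF16 matches the stored weights.
vllm serve astom-M/matsuo-llm-advanced-phase-f1a --dtype bfloat16

# In another shell: query the OpenAI-compatible endpoint.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "astom-M/matsuo-llm-advanced-phase-f1a",
       "messages": [{"role": "user", "content": "List the tables in the database."}]}'
```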

Weights are distributed as Safetensors (8B params, BF16).
