οΌγθͺ²ι‘γγγγ―θͺεγ§θ¨ε ₯γγ¦δΈγγοΌ
This repository provides a LoRA adapter fine-tuned from Qwen/Qwen2.5-3B-Instruct.
Training Data
- HF dataset id: (not set)
- Local dataset path: out_dagger_alfworld_replay/iter_001/aggregate_messages_all.jsonl
Training Configuration
- Max sequence length: 1024
- Epochs: 1
- Learning rate: 2e-04
- LoRA: r=16, alpha=32
Notes
- Upload source adapter is expected to be the model trained after ALFWorld DAgger replay.
- Downloads last month
- 13