οΌœγ€θͺ²ι‘Œγ€‘ここはθ‡ͺεˆ†γ§θ¨˜ε…₯γ—γ¦δΈ‹γ•γ„οΌž

This repository provides a LoRA adapter fine-tuned from Qwen/Qwen2.5-3B-Instruct.

Training Data

  • HF dataset id: (not set)
  • Local dataset path: out_dagger_alfworld_replay/iter_001/aggregate_messages_all.jsonl

Training Configuration

  • Max sequence length: 1024
  • Epochs: 1
  • Learning rate: 2e-04
  • LoRA: r=16, alpha=32

Notes

  • Upload source adapter is expected to be the model trained after ALFWorld DAgger replay.
