Chattso-GPT committed on
Commit 31e5730 · verified · 1 Parent(s): ea01dc4

Update README.md

Files changed (1):
  1. README.md +4 -4
README.md CHANGED
@@ -15,7 +15,7 @@ tags:
 - dbbench
 ---
 
-# <[Assignment] Fill this in yourself>
+# qwen3-4b-agent-trajectory-lora
 
 This repository provides a **LoRA adapter** fine-tuned from
 **Qwen/Qwen3-4B-Instruct-2507** using **LoRA + Unsloth**.
@@ -26,11 +26,11 @@ The base model must be loaded separately.
 ## Training Objective
 
 This adapter is trained to improve **multi-turn agent task performance**
-on ALFWorld (household tasks) and DBBench (database operations).
+on ALFWorld (household tasks).
 
 Loss is applied to **all assistant turns** in the multi-turn trajectory,
 enabling the model to learn environment observation, action selection,
-tool use, and recovery from errors.
+and recovery from errors.
 
 ## Training Configuration
 
@@ -49,7 +49,7 @@ from peft import PeftModel
 import torch
 
 base = "Qwen/Qwen3-4B-Instruct-2507"
-adapter = "your_id/your-repo"
+adapter = "Chattso-GPT/test105"
 
 tokenizer = AutoTokenizer.from_pretrained(base)
 model = AutoModelForCausalLM.from_pretrained(
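The README's "loss on all assistant turns" objective can be sketched as follows. This is a hypothetical illustration, not the repository's actual preprocessing code: the function name `mask_labels`, the toy token ids, and the per-token role list are invented here. It shows the common convention of setting non-assistant label positions to -100, the index that PyTorch's cross-entropy loss ignores, so every assistant turn in the trajectory contributes to the loss while user and environment tokens do not.

```python
# Hypothetical sketch of per-turn loss masking for multi-turn trajectories.
# Non-assistant tokens get label -100, the ignore_index of cross-entropy in
# PyTorch, so only assistant turns are trained on. Token ids are toy values;
# a real pipeline would derive tokens and roles from the chat template.

IGNORE_INDEX = -100  # default ignore_index of torch.nn.CrossEntropyLoss

def mask_labels(token_ids, roles):
    """roles[i] names the speaker of token i ('user' or 'assistant').
    Returns labels equal to token_ids on assistant tokens, -100 elsewhere."""
    return [tid if role == "assistant" else IGNORE_INDEX
            for tid, role in zip(token_ids, roles)]

# Toy trajectory: user turn, assistant action, environment observation
# (fed back as a user turn), then a second assistant action.
tokens = [101, 102, 103, 104, 105, 106, 107]
roles  = ["user", "user", "assistant", "assistant",
          "user", "assistant", "assistant"]

labels = mask_labels(tokens, roles)
print(labels)  # [-100, -100, 103, 104, -100, 106, 107]
```

Because both assistant turns keep their labels, the model learns from the full recovery trajectory rather than only the final response.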