Jayant-Kernel committed on
fix: batch size 4 to match num_generations 4
train.py
CHANGED
@@ -159,7 +159,7 @@ trainer = GRPOTrainer(
     args=GRPOConfig(
         output_dir="./deceit-1.5b",
         max_steps=150,
-        per_device_train_batch_size=
+        per_device_train_batch_size=4,
         num_generations=4,
         learning_rate=5e-6,
         warmup_steps=5,
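
Note on why the two values are tied together: GRPO scores each completion against the other completions sampled for the same prompt, so the batch has to split into whole groups of `num_generations`. The sketch below is a standalone illustration of that divisibility check, not code from train.py or from TRL. The helper name `prompts_per_step` and the `num_processes` parameter are hypothetical, and the exact rule (for example, whether gradient accumulation counts toward the batch) may differ between TRL versions.

```python
def prompts_per_step(per_device_train_batch_size: int,
                     num_generations: int,
                     num_processes: int = 1) -> int:
    """Return how many distinct prompts fit in one global batch.

    Hypothetical helper: assumes the global batch is
    per_device_train_batch_size * num_processes and that it must split
    into whole groups of `num_generations` completions.
    """
    if min(per_device_train_batch_size, num_generations, num_processes) < 1:
        raise ValueError("all arguments must be positive integers")

    global_batch = per_device_train_batch_size * num_processes
    if global_batch % num_generations != 0:
        raise ValueError(
            f"global batch size {global_batch} is not divisible by "
            f"num_generations={num_generations}"
        )
    return global_batch // num_generations


if __name__ == "__main__":
    # Values from this commit: batch size 4, num_generations 4, one process.
    print(prompts_per_step(4, 4))
```

With the values from this commit and a single process, each step holds exactly one prompt with its four completions. A batch size that is not a multiple of 4 would fail the check under these assumptions.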