adithya9903/polyguard-openenv-training-3b-artifacts


adithya9903 committed 12 days ago

Commit d31b81a · verified · 1 Parent(s): 3b949a1

Upload PolyGuard training artifacts: outputs/plots

Files changed (18)

.gitattributes +2 -0
outputs/plots/anti_cheat_failure_rates.png +0 -0
outputs/plots/avg_process_fidelity.png +0 -0
outputs/plots/avg_reward.png +0 -0
outputs/plots/grpo_reward_curves.png +3 -0
outputs/plots/inference_latency_validity.png +0 -0
outputs/plots/inference_validity_reward.png +0 -0
outputs/plots/legality_rate.png +0 -0
outputs/plots/policy_stack_avg_reward.png +0 -0
outputs/plots/qwen_model_grpo_reward.png +0 -0
outputs/plots/qwen_model_sft_loss.png +0 -0
outputs/plots/qwen_model_sft_reward.png +0 -0
outputs/plots/reward_component_bars.png +3 -0
outputs/plots/sft_loss_curves.png +0 -0
outputs/plots/sft_validity_reward.png +0 -0
outputs/plots/sft_vs_grpo_reward.png +0 -0
outputs/plots/success_rate.png +0 -0
outputs/plots/train_holdout_gap.png +0 -0

.gitattributes CHANGED

@@ -45,3 +45,5 @@ checkpoints/sweeps/qwen-qwen2-5-3b-instruct/merged/tokenizer.json filter=lfs dif
 checkpoints/sft_adapter/tokenizer.json filter=lfs diff=lfs merge=lfs -text
 checkpoints/grpo_adapter/tokenizer.json filter=lfs diff=lfs merge=lfs -text
 checkpoints/merged/tokenizer.json filter=lfs diff=lfs merge=lfs -text
+outputs/plots/grpo_reward_curves.png filter=lfs diff=lfs merge=lfs -text
+outputs/plots/reward_component_bars.png filter=lfs diff=lfs merge=lfs -text

outputs/plots/anti_cheat_failure_rates.png ADDED

outputs/plots/avg_process_fidelity.png ADDED

outputs/plots/avg_reward.png ADDED

outputs/plots/grpo_reward_curves.png ADDED

Git LFS Details

SHA256: 193950a030afd8642db1cfe245bb47e89f88461f5a9682fede7228058253511b
Pointer size: 131 Bytes
Size of remote file: 147 kB
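
The 131-byte pointer size reported above can be sanity-checked against the Git LFS pointer format (version line, `oid` line, `size` line). The sketch below is illustrative only: the helper name `lfs_pointer` is hypothetical, and `PLACEHOLDER_SIZE` is an assumed value, since the page shows only the rounded figure of 147 kB and not the exact byte count.

```python
def lfs_pointer(oid_hex: str, size_bytes: int) -> str:
    """Build the text of a Git LFS pointer file (spec v1 layout)."""
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid_hex}\n"
        f"size {size_bytes}\n"
    )


# SHA256 copied from the "Git LFS Details" entry for grpo_reward_curves.png.
OID = "193950a030afd8642db1cfe245bb47e89f88461f5a9682fede7228058253511b"

# Assumption: the exact size is not shown on the page; any six-digit
# byte count produces a pointer of the same length.
PLACEHOLDER_SIZE = 147_000

pointer = lfs_pointer(OID, PLACEHOLDER_SIZE)
pointer_len = len(pointer.encode("utf-8"))
print(pointer_len)
```

With a six-digit size field the pointer comes to 131 bytes, which agrees with the pointer size listed for both LFS-tracked plots in this commit.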

outputs/plots/inference_latency_validity.png ADDED

outputs/plots/inference_validity_reward.png ADDED

outputs/plots/legality_rate.png ADDED

outputs/plots/policy_stack_avg_reward.png ADDED

outputs/plots/qwen_model_grpo_reward.png ADDED

outputs/plots/qwen_model_sft_loss.png ADDED

outputs/plots/qwen_model_sft_reward.png ADDED

outputs/plots/reward_component_bars.png ADDED

Git LFS Details

SHA256: d1e967e0537bb1c49091b534231072fd2e4750d4650c02479fd292a1d5f543ae
Pointer size: 131 Bytes
Size of remote file: 123 kB

outputs/plots/sft_loss_curves.png ADDED

outputs/plots/sft_validity_reward.png ADDED

outputs/plots/sft_vs_grpo_reward.png ADDED

outputs/plots/success_rate.png ADDED

outputs/plots/train_holdout_gap.png ADDED