AmberLJC/gradient_clipping_experiment

AmberLJC committed on Jan 24

Commit 34b73d2 (verified)

1 Parent(s): 189bbf3

Upload progress.md with huggingface_hub

Files changed (1)

progress.md ADDED

+# Progress Report - Gradient Clipping Experiment
+## Task Breakdown
+- [x] Step 1: Set up project structure
+- [x] Step 2: Implement PyTorch model (Embedding + Linear)
+- [x] Step 3: Create imbalanced dataset (990 'A', 10 'B')
+- [x] Step 4: Implement training loop WITHOUT clipping
+- [x] Step 5: Implement training loop WITH clipping
+- [x] Step 6: Generate comparison plots
+- [x] Step 7: Write summary report
+## Completion Status: ✅ COMPLETE
+## Key Results
+### Without Gradient Clipping:
+- Max Gradient Norm: 7.35
+- Final Weight Norm: 8.81
+- Final Loss: 0.0039
+### With Gradient Clipping (max_norm=1.0):
+- Max Gradient Norm: 7.60 (before clipping)
+- Final Weight Norm: 9.27
+- Final Loss: 0.0011
+## Conclusion
+The experiment supports the view that gradient clipping stabilizes training by limiting sudden large weight updates from rare, high-loss samples (here, the 10 'B' samples). The clipped run showed smoother weight evolution and reached a lower final loss (0.0011 vs. 0.0039), although its final weight norm was slightly higher (9.27 vs. 8.81).
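
For readers who want to see what clipping with `max_norm=1.0` does to a gradient spike like the 7.60 peak reported above, here is a minimal pure-Python sketch of norm-based clipping. It is not the code from this commit: the function name `clip_by_global_norm`, the `eps` value, and the example gradient `spike` are assumptions made for illustration. The scaling rule follows the same idea as PyTorch's `torch.nn.utils.clip_grad_norm_`, which also returns the norm measured before clipping; that would be consistent with the "(before clipping)" note in the report if the author used that utility.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale a flat list of gradient values so their L2 norm is at most max_norm.

    Returns (clipped_grads, total_norm), where total_norm is the norm
    measured BEFORE clipping.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # Cap the scale factor at 1.0 so gradients already within the limit
    # pass through unchanged.
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


# Hypothetical gradient, chosen only so that its norm equals 7.60.
spike = [4.56, 6.08]
clipped, pre_clip_norm = clip_by_global_norm(spike, max_norm=1.0)
post_clip_norm = math.sqrt(sum(g * g for g in clipped))

print(f"pre-clip norm:  {pre_clip_norm:.2f}")
print(f"post-clip norm: {post_clip_norm:.2f}")
```

Clipping rescales the whole gradient vector by one factor, so the update direction is preserved and only its size is limited. This is why the maximum pre-clip gradient norm can be similar in both runs (7.35 vs. 7.60) while the applied updates differ.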