TheJackBright Claude Opus 4.6 committed on
Commit 3948a09 · 1 Parent(s): d5dbfe8

Set minimum shaped reward to 0.001 (strict >0)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

backend/src/polypharmacy_env/rewards.py CHANGED
@@ -93,4 +93,4 @@ def compute_shaped_reward(
 
     # finish_review terminal bonus is added by the caller after grading
 
-    return max(0.0, reward)
+    return max(0.001, reward)
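
To illustrate the effect of the change, here is a minimal sketch of the new floor in isolation. It is not the repository's code: `clamp_shaped_reward` and `MIN_SHAPED_REWARD` are hypothetical names, and the real `compute_shaped_reward` computes `reward` from inputs not shown in this diff.

```python
# Hypothetical stand-in for the final line of compute_shaped_reward.
# The constant name is an assumption; the diff hard-codes the literal 0.001.
MIN_SHAPED_REWARD = 0.001


def clamp_shaped_reward(reward: float) -> float:
    """Return the reward, raised to the floor so the result is strictly > 0."""
    return max(MIN_SHAPED_REWARD, reward)


if __name__ == "__main__":
    # Negative and zero rewards are lifted to the floor;
    # rewards above the floor pass through unchanged.
    for raw in (-0.5, 0.0, 0.0005, 0.25):
        print(raw, "->", clamp_shaped_reward(raw))
```

With the old `max(0.0, reward)`, a reward of zero or below came back as exactly `0.0`. With the new floor the returned value is always at least `0.001`, which is what the "strict >0" in the commit title refers to.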