Test Fixes Applied for v3.0.0
Issue 1: Trajectory None guards (FIXED)
- File: purpose_agent/types.py (UPDATED)
- Changed: cumulative_reward, total_delta, success_rate properties now check both s.score is not None AND s.score.delta is not None
- Added docstring note that sre_patches.py replaces these at import time
- Baseline and SRE-patched versions now equivalent
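The double guard can be sketched as below. This is a minimal illustration, not the code in purpose_agent/types.py: the Score, Step, and Trajectory dataclasses are hypothetical stand-ins, and the meaning of success_rate (share of scored steps with a positive delta) is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Score:  # hypothetical stand-in for the real score type
    delta: Optional[float] = None


@dataclass
class Step:  # hypothetical stand-in for a trajectory step
    score: Optional[Score] = None


@dataclass
class Trajectory:  # hypothetical stand-in, not the class in types.py
    steps: List[Step] = field(default_factory=list)

    def _deltas(self) -> List[float]:
        # Skip steps with no score AND steps whose score has no delta.
        return [
            s.score.delta
            for s in self.steps
            if s.score is not None and s.score.delta is not None
        ]

    @property
    def total_delta(self) -> float:
        return sum(self._deltas())

    @property
    def success_rate(self) -> float:
        # Assumed definition: fraction of scored steps with a positive delta.
        deltas = self._deltas()
        if not deltas:
            return 0.0
        return sum(1 for d in deltas if d > 0) / len(deltas)


traj = Trajectory([Step(), Step(Score()), Step(Score(1.0)), Step(Score(-0.5))])
print(traj.total_delta, traj.success_rate)
```

Without the second check, the step with Score(delta=None) would put None into the sum and raise a TypeError.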
Issue 2: Backpressure test flakiness (NEEDS MANUAL FIX)
- File: tests/test_sprint1_events.py (T1.6 section)
- Problem: async consumer may not start before flooding; terminal event might not arrive
- Fix: Replace the test_backpressure() function with this more robust version:

    # Assumes asyncio, EventBus, create_event, and EventKind are already
    # imported in the test module.
    async def test_backpressure():
        bus6 = EventBus(max_queue_size=3)
        received = []
        consumer_started = asyncio.Event()

        async def consumer():
            consumer_started.set()  # signal that the consumer task is running
            try:
                async for event in bus6.subscribe():
                    received.append(event)
                    await asyncio.sleep(0.01)  # deliberately slow consumer
            except asyncio.CancelledError:
                pass

        task = asyncio.create_task(consumer())
        await consumer_started.wait()
        await asyncio.sleep(0.05)  # give the consumer time to reach subscribe()

        # Flood the bus well past max_queue_size, then send the terminal event.
        for i in range(20):
            bus6.emit(create_event("r6", EventKind.TEXT_DELTA, seq=i, text=f"w{i}"))
        bus6.emit(create_event("r6", EventKind.RUN_FINISHED, seq=99, result="done"))

        await asyncio.sleep(1.0)
        bus6.close()
        task.cancel()
        try:
            await asyncio.wait_for(task, timeout=2.0)  # bounded cleanup
        except (asyncio.CancelledError, asyncio.TimeoutError):
            pass

        has_terminal = any(e.kind == EventKind.RUN_FINISHED for e in received)
        return has_terminal

Key changes:
- Added consumer_started Event to ensure consumer is running before flooding
- Increased final wait from 0.5s to 1.0s
- Added asyncio.wait_for timeout on task cleanup
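The start-up handshake and bounded cleanup can be tried in isolation with the sketch below. It assumes a plain asyncio.Queue in place of EventBus, so it shows only the ordering pattern and does not model backpressure or event dropping; run_demo and the "done" sentinel are hypothetical names.

```python
import asyncio


async def run_demo() -> list:
    queue: asyncio.Queue = asyncio.Queue()  # stand-in for the real event bus
    received = []
    started = asyncio.Event()

    async def consumer():
        started.set()  # signal that the consumer coroutine is running
        while True:
            item = await queue.get()
            received.append(item)
            if item == "done":  # hypothetical terminal marker
                return

    task = asyncio.create_task(consumer())
    await started.wait()  # do not produce until the consumer is running

    for i in range(5):
        queue.put_nowait(i)
    queue.put_nowait("done")

    try:
        # Bounded cleanup: never wait on the consumer forever.
        await asyncio.wait_for(task, timeout=2.0)
    except asyncio.TimeoutError:
        task.cancel()
    return received


result = asyncio.run(run_demo())
print(result)
```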
Issue 3: prod_test.py API timeout (NEEDS MANUAL FIX)
- File: tests/prod_test.py
- Problem: No timeout on OpenRouter API calls; tests could hang
- Fix: Wrap the backend creation with a timeout, add retry logic:

After line b = resolve_backend(...), add:
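
    import signal

    def timeout_handler(signum, frame):
        # Raise the built-in TimeoutError instead of shadowing it with a custom class
        raise TimeoutError("API call timed out")

    # Set a 60s alarm for API calls (SIGALRM is Unix-only and works in the main thread only)
    signal.signal(signal.SIGALRM, timeout_handler)
    signal.alarm(60)
    # After the API calls return, cancel the pending alarm:
    signal.alarm(0)
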
Or, more simply, pass a timeout to the OpenAI client itself:

    # In llm_backend.py OpenAICompatibleBackend.__init__, add
    # (requires `import os` at the top of the module):
    self.client = OpenAI(
        base_url=base_url,
        api_key=api_key or os.environ.get("OPENAI_API_KEY"),
        timeout=60.0,  # 60 second timeout on all API calls
    )

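The fix above mentions retry logic without showing it. One possible shape is sketched below, assuming timeouts surface as the built-in TimeoutError; call_with_retry and its parameters are hypothetical names, not part of prod_test.py or llm_backend.py.

```python
import time


def call_with_retry(fn, attempts=3, base_delay=0.5,
                    retry_on=(TimeoutError,), sleep=time.sleep):
    """Call fn(), retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    last_exc = None
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on as exc:
            last_exc = exc
            if attempt < attempts - 1:
                # Delay doubles on every failed attempt: 0.5s, 1.0s, 2.0s, ...
                sleep(base_delay * 2 ** attempt)
    raise last_exc


# Demo: a call that times out twice, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


delays = []  # record the backoff delays instead of actually sleeping
result = call_with_retry(flaky, sleep=delays.append)
print(result, calls["n"], delays)
```

Passing sleep as a parameter keeps the helper testable without real waiting; in a real test run the default time.sleep would apply.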
Issue 4: validate.py mock resilience (NEEDS MANUAL FIX)
- File: benchmarks/validate.py
- Problem: Mock matches on "Learned Strategies" + "None yet" text; fragile if prompt format changes
- Fix: In make_mock(), make the heuristic detection more resilient:

Change:

    has_h = "Learned Strategies" in text and "None yet" not in text

To:

    has_h = "Learned Strategies" in text and "None yet" not in text and "heuristics" in text.lower()

Or better: check for heuristic entries directly, independent of the section heading:

    has_h = any("When:" in line or "Do:" in line for line in text.split("\n"))

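The difference between the two checks shows up when the section heading changes. The sketch below uses made-up prompt strings; the real prompt format in validate.py may differ, and has_heuristics and old_check are hypothetical helper names.

```python
def has_heuristics(text: str) -> bool:
    # Entry-based check: look for heuristic lines rather than the heading.
    return any("When:" in line or "Do:" in line for line in text.split("\n"))


def old_check(text: str) -> bool:
    # Heading-based check, as in the original mock.
    return "Learned Strategies" in text and "None yet" not in text


# Hypothetical prompt fragments for illustration only.
empty_prompt = "## Learned Strategies\nNone yet\n"
filled_prompt = (
    "## Learned Strategies\n"
    "- When: the task mentions files\n"
    "  Do: list the directory first\n"
)
renamed_prompt = "## Strategies so far\n- When: the task mentions files\n"

for name, prompt in [("empty", empty_prompt),
                     ("filled", filled_prompt),
                     ("renamed", renamed_prompt)]:
    print(name, old_check(prompt), has_heuristics(prompt))
```

With the renamed heading the old check reports no heuristics even though an entry is present, while the entry-based check still detects it.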
Issue 5: CalculatorTool import blocking (VERIFIED WORKING)
- File: purpose_agent/tools.py
- CalculatorTool.execute() validates tokens with: if re.search(r'[a-zA-Z_]', tokens)
- After removing known function names (abs, round, sqrt, etc.), any remaining letters are rejected
- __import__("os") → after removing known functions, __import__ and os remain → rejected ✓
- Also: AST walker checks Call nodes and rejects unknown function names
- eval() uses {"__builtins__": {}}, so no builtins are available
- Test in benchmark_v3.py: check("tools.calc_blocks_import", "Error" in calc.run(expression='__import__("os")').output) is CORRECT
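The three layers described above (letter check after stripping known names, AST walk over Call nodes, eval with empty builtins) can be sketched as follows. This is an illustration under assumptions, not the CalculatorTool implementation: safe_calc and ALLOWED_FUNCS are hypothetical names, the allow-list is shortened, and the sketch is not a hardened sandbox.

```python
import ast
import math
import re

# Hypothetical, shortened allow-list; the real tool may permit more names.
ALLOWED_FUNCS = {"abs": abs, "round": round, "sqrt": math.sqrt}


def safe_calc(expression: str) -> str:
    # Layer 1: strip known function names, then reject any remaining letters.
    pattern = r"\b(" + "|".join(ALLOWED_FUNCS) + r")\b"
    stripped = re.sub(pattern, "", expression)
    if re.search(r"[a-zA-Z_]", stripped):
        return "Error: disallowed identifier"

    # Layer 2: walk the AST and reject calls to anything outside the allow-list.
    try:
        tree = ast.parse(expression, mode="eval")
    except SyntaxError:
        return "Error: invalid syntax"
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            func = node.func
            if not (isinstance(func, ast.Name) and func.id in ALLOWED_FUNCS):
                return "Error: unknown function"

    # Layer 3: evaluate with no builtins; only allow-listed names are visible.
    try:
        code = compile(tree, "<calc>", "eval")
        return str(eval(code, {"__builtins__": {}}, dict(ALLOWED_FUNCS)))
    except Exception as exc:
        return f"Error: {exc}"


print(safe_calc('__import__("os")'))
print(safe_calc("sqrt(16) + abs(-2)"))
```

The import attempt is stopped at the first layer because the letters in __import__ and os survive the stripping step; the later layers act as defence in depth.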