# CausalFinBench Results

Date: 2026-05-08 00:12
Model: Horizon v1 (172M DiT, Pearl Level 3)
Checkpoint: checkpoints/phase9_production/step_200000.pt
Total time: 26.0 minutes

## Summary

- Tier A: 5/5 PASS
- Tier B: 2/3 PARTIAL
- Tier C: 1/1 PASS

**OVERALL: ISSUES FOUND**

## Detailed Results

| Test | Tier | Name | Cases | Pass Rate | Result |
|------|------|------|-------|-----------|--------|
| A1 | A | Consistency | 100 | 100.0% | PASS |
| A2 | A | Causal Asymmetry | 19 | 100.0% | PASS |
| A3 | A | Compositionality | 30 | 93.3% | PASS |
| A4 | A | Counterfactual Coherence | 50 | 92.0% | PASS |
| A5 | A | Robustness | 56 | 100.0% | PASS |
| B1 | B | Placebo (non-edges) | 39 | 59.0% | FAIL |
| B2 | B | Real effects (edges) | 19 | 100.0% | PASS |
| B4 | B | Sensitivity monotonicity | 10 | 100.0% | PASS |
| C1 | C | RBI Rate Decisions | 42 | 100.0% | PASS |
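
The per-tier lines in the Summary can be cross-checked against the table with a short tally. The sketch below is illustrative only and is not the benchmark's own reporting code: the function names (`summarize`, `overall`, `implied_passes`), the PASS/PARTIAL/FAIL labelling rule, and the `ALL PASS` label are assumptions chosen to match the report. The rows are copied from the table above.

```python
from collections import defaultdict

# (test id, tier, cases, pass rate in %, result), copied from the table above.
ROWS = [
    ("A1", "A", 100, 100.0, "PASS"),
    ("A2", "A", 19, 100.0, "PASS"),
    ("A3", "A", 30, 93.3, "PASS"),
    ("A4", "A", 50, 92.0, "PASS"),
    ("A5", "A", 56, 100.0, "PASS"),
    ("B1", "B", 39, 59.0, "FAIL"),
    ("B2", "B", 19, 100.0, "PASS"),
    ("B4", "B", 10, 100.0, "PASS"),
    ("C1", "C", 42, 100.0, "PASS"),
]


def summarize(rows):
    """Return {tier: (tests_passed, tests_total, status)}."""
    counts = defaultdict(lambda: [0, 0])  # tier -> [passed, total]
    for _test, tier, _cases, _rate, result in rows:
        counts[tier][1] += 1
        if result == "PASS":
            counts[tier][0] += 1

    summary = {}
    for tier in sorted(counts):
        passed, total = counts[tier]
        # Assumed labelling rule: all tests pass -> PASS, none -> FAIL,
        # anything in between -> PARTIAL.
        if passed == total:
            status = "PASS"
        elif passed == 0:
            status = "FAIL"
        else:
            status = "PARTIAL"
        summary[tier] = (passed, total, status)
    return summary


def overall(summary):
    """Assumed rule: any tier that is not a clean PASS flags the whole run."""
    clean = all(status == "PASS" for _p, _t, status in summary.values())
    return "ALL PASS" if clean else "ISSUES FOUND"  # "ALL PASS" is hypothetical


def implied_passes(cases, rate_percent):
    """Number of passing cases implied by a rounded pass rate."""
    return round(cases * rate_percent / 100)


if __name__ == "__main__":
    tiers = summarize(ROWS)
    for tier, (passed, total, status) in tiers.items():
        print(f"- Tier {tier}: {passed}/{total} {status}")
    print(f"OVERALL: {overall(tiers)}")
    print("B1 passing cases (implied):", implied_passes(39, 59.0))
```

Under these assumptions the tally reproduces the Summary lines, with Tier B marked PARTIAL because B1 is its only failing test. The reported 59.0% over 39 cases for B1 corresponds to roughly 23 passing placebo cases.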