
CausalFinBench Results

Date: 2026-05-08 00:12
Model: Horizon v1 (172M DiT, Pearl Level 3)
Checkpoint: checkpoints/phase9_production/step_200000.pt
Total time: 26.0 minutes

Summary

  • Tier A: 5/5 PASS
  • Tier B: 2/3 PARTIAL
  • Tier C: 1/1 PASS

OVERALL: ISSUES FOUND

Detailed Results

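| Test | Tier | Name | Cases | Pass Rate | Result |
|------|------|------|-------|-----------|--------|
| A1 | A | Consistency | 100 | 100.0% | PASS |
| A2 | A | Causal Asymmetry | 19 | 100.0% | PASS |
| A3 | A | Compositionality | 30 | 93.3% | PASS |
| A4 | A | Counterfactual Coherence | 50 | 92.0% | PASS |
| A5 | A | Robustness | 56 | 100.0% | PASS |
| B1 | B | Placebo (non-edges) | 39 | 59.0% | FAIL |
| B2 | B | Real effects (edges) | 19 | 100.0% | PASS |
| B4 | B | Sensitivity monotonicity | 10 | 100.0% | PASS |
| C1 | C | RBI Rate Decisions | 42 | 100.0% | PASS |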
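
The per-tier counts in the Summary can be cross-checked against the rows above. The following is a minimal sketch, not the benchmark's own aggregation code: the names `ROWS`, `tier_summary`, and `passing_cases` are hypothetical, the "ALL PASS" label is an assumed counterpart to the reported "ISSUES FOUND", and the pass/fail threshold per test is not stated in the report, so the sketch only tallies the reported Result column.

```python
from collections import defaultdict

# Rows copied from the results table: (test, tier, name, cases, pass_rate_percent, result)
ROWS = [
    ("A1", "A", "Consistency", 100, 100.0, "PASS"),
    ("A2", "A", "Causal Asymmetry", 19, 100.0, "PASS"),
    ("A3", "A", "Compositionality", 30, 93.3, "PASS"),
    ("A4", "A", "Counterfactual Coherence", 50, 92.0, "PASS"),
    ("A5", "A", "Robustness", 56, 100.0, "PASS"),
    ("B1", "B", "Placebo (non-edges)", 39, 59.0, "FAIL"),
    ("B2", "B", "Real effects (edges)", 19, 100.0, "PASS"),
    ("B4", "B", "Sensitivity monotonicity", 10, 100.0, "PASS"),
    ("C1", "C", "RBI Rate Decisions", 42, 100.0, "PASS"),
]


def tier_summary(rows):
    """Count passing tests per tier from the reported Result column."""
    passed = defaultdict(int)
    total = defaultdict(int)
    for _test, tier, _name, _cases, _rate, result in rows:
        total[tier] += 1
        passed[tier] += result == "PASS"
    return {tier: (passed[tier], total[tier]) for tier in sorted(total)}


def passing_cases(cases, rate_percent):
    """Recover the integer count of passing cases from a rounded percentage."""
    return round(cases * rate_percent / 100)


summary = tier_summary(ROWS)
# "ALL PASS" is an assumed label; the report only shows "ISSUES FOUND".
overall = "ALL PASS" if all(p == t for p, t in summary.values()) else "ISSUES FOUND"

for tier, (p, t) in summary.items():
    print(f"Tier {tier}: {p}/{t}")
print("OVERALL:", overall)
```

Under these assumptions the tally agrees with the Summary (5/5, 2/3, 1/1), and `passing_cases(39, 59.0)` indicates that the B1 placebo test passed 23 of its 39 cases.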