Beemer Claude Opus 4.7 committed on
Commit 4066df3 · 1 Parent(s): d72272a

Tune the semantic-fusion weight to 2.0

Equal-weight reciprocal-rank fusion diluted strong semantic hits when
BM25 ranked the same provision poorly. W_SEM=2.0 lifts the 89-question
eval -- Hit@1 0.57->0.65, Hit@5 0.88->0.90, MRR 0.70->0.75 -- with no
regression. See precision-findings.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
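
For readers unfamiliar with the metrics quoted in the message, the following is a minimal sketch of how Hit@k and MRR are commonly computed. It assumes one gold provision per question; the eval harness itself is not part of this commit, so the function names and the `runs` structure are hypothetical.

```python
def hit_at_k(ranked_ids, gold_id, k):
    """1.0 if the gold id appears in the top k results, else 0.0."""
    return 1.0 if gold_id in ranked_ids[:k] else 0.0


def reciprocal_rank(ranked_ids, gold_id):
    """1/rank of the gold id (1-based), or 0.0 if it was not retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == gold_id:
            return 1.0 / rank
    return 0.0


def evaluate(runs):
    """Average the metrics over (ranked_ids, gold_id) pairs (hypothetical format)."""
    n = len(runs)
    return {
        "hit@1": sum(hit_at_k(r, g, 1) for r, g in runs) / n,
        "hit@5": sum(hit_at_k(r, g, 5) for r, g in runs) / n,
        "mrr": sum(reciprocal_rank(r, g) for r, g in runs) / n,
    }


# Toy data, not the 89-question eval set.
toy_runs = [
    (["a", "b"], "a"),  # gold at rank 1
    (["a", "b"], "b"),  # gold at rank 2
    (["a", "b"], "z"),  # gold not retrieved
]
metrics = evaluate(toy_runs)
```

On this reading, a Hit@1 gain with a smaller Hit@5 gain means the correct provision was usually already in the top five and the reweighting mostly moved it to first place.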

Files changed (1)
  1. canlex/index.py +1 -1
canlex/index.py CHANGED
@@ -13,7 +13,7 @@ from .synonyms import expand_query
 K1 = 1.5
 B = 0.75
 RRF_K = 60 # reciprocal-rank-fusion damping constant
-W_SEM = 1.0 # weight on the semantic retriever in the fusion (1.0 = equal)
+W_SEM = 2.0 # weight on the semantic retriever in the fusion (1.0 = equal; eval-tuned)
 CANDIDATES = 80 # hits each retriever contributes to the fusion
 RERANK_POOL = 50 # top fused candidates the cross-encoder rescores
 SOURCE_CAP = 2 # max chunks one case/memo/agreement/directive may contribute
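
The fusion code that consumes these constants is not shown in this diff. As a rough sketch of what a semantic weight in reciprocal-rank fusion typically means, the snippet below scores each document as 1/(RRF_K + rank) from BM25 plus W_SEM/(RRF_K + rank) from the semantic retriever. The function name `fuse` and the exact placement of the weight are assumptions, not the contents of canlex/index.py.

```python
from collections import defaultdict

RRF_K = 60   # damping constant, value taken from the diff
W_SEM = 2.0  # semantic weight, value taken from the diff


def fuse(bm25_ranked, semantic_ranked, w_sem=W_SEM, k=RRF_K):
    """Weighted reciprocal-rank fusion of two ranked id lists (hypothetical helper)."""
    scores = defaultdict(float)
    for rank, doc_id in enumerate(bm25_ranked, start=1):
        scores[doc_id] += 1.0 / (k + rank)    # lexical contribution, weight 1
    for rank, doc_id in enumerate(semantic_ranked, start=1):
        scores[doc_id] += w_sem / (k + rank)  # semantic contribution, weight w_sem
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Toy lists: "c" is the top semantic hit but BM25 ranks it last.
bm25 = ["a", "b", "c"]
semantic = ["c", "a", "b"]

equal_weight = fuse(bm25, semantic, w_sem=1.0)  # ["a", "c", "b"]
sem_weighted = fuse(bm25, semantic, w_sem=2.0)  # ["c", "a", "b"]
```

With equal weights the poor BM25 rank holds "c" in second place. Doubling the semantic weight is enough to move it to first, which is the dilution effect the commit message describes.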