search-reranker-irishgov-l6-fast-v1
Fast 6-layer bilingual reranker for Irish government search.
This model is the current fast-path reranker line from the openMed reranker cycle. It is derived from cross-encoder/mmarco-mMiniLMv2-L12-H384-v1, reduced to 6 layers, then fine-tuned on a bilingual Irish government-oriented mix.
Intended use
- Best use: stage-1 prefilter before a stronger reranker, or direct reranking when the effective candidate set is about 10-20.
- Strong current serving pattern: l6-fast -> top10 -> search-reranker-broad-v1.
- Not a full replacement for the broad reranker on large 80-candidate pools when maximum final quality matters.
Why publish this line
The broad reranker remains stronger on large candidate pools, but on CPU it is too expensive to run over the full candidate set in the critical path.
This 6-layer line is valuable because:
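- it is much faster on CPU q8 (in the portfolio comparison, 27.25 qps at l128 versus 8.41 qps at l160 for the broad q8 sibling on the public proxy)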
- it remains strong on the bilingual proxy and office-holder slices
- it is the current best stage-1 model for the winning cascade
Key q8 benchmarks
| Suite | Config | Overall MRR@10 | Irish MRR@10 | English MRR@10 | Top-1 | QPS |
|---|---|---|---|---|---|---|
| Public bilingual proxy | 128, threads=8 | 0.9218 | 0.8837 | 0.9600 | 0.8600 | 26.52 |
| Public bilingual proxy | 192, threads=8 | 0.9238 | 0.8977 | 0.9500 | 0.8650 | 16.52 |
| Office-holder eval | 128, threads=16 | 1.0000 | n/a | n/a | 1.0000 | 37.44 |
| 80-candidate in-domain holdout | 128, threads=16 | 0.8327 | 0.7054 | 0.9600 | 0.7400 | 5.19 |
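The card does not spell out how the MRR@10 and Top-1 columns are computed. The sketch below assumes the standard definitions (reciprocal rank of the first relevant document within the top 10, and whether the first-ranked document is relevant). The function names and the `(ranked_ids, relevant_ids)` input shape are illustrative, not taken from the benchmark harness.

```python
from typing import List, Sequence, Set, Tuple

# One evaluated query: the reranked document ids and the set of relevant ids.
Run = Tuple[Sequence[str], Set[str]]


def reciprocal_rank_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Return 1/rank of the first relevant document within the top k, else 0.0."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mrr_at_k(runs: List[Run], k: int = 10) -> float:
    """Mean reciprocal rank over all queries (0.0 for an empty run list)."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank_at_k(ranked, rel, k) for ranked, rel in runs) / len(runs)


def top1_accuracy(runs: List[Run]) -> float:
    """Fraction of queries whose first-ranked document is relevant."""
    if not runs:
        return 0.0
    hits = sum(1 for ranked, rel in runs if ranked and ranked[0] in rel)
    return hits / len(runs)


# Toy data only: reciprocal ranks are 1, 1/2 and 0.
example_runs: List[Run] = [
    (["a", "b", "c"], {"a"}),
    (["x", "y", "z"], {"y"}),
    (["p", "q"], {"not-retrieved"}),
]
```

Per-language columns such as Irish MRR@10 would follow by filtering the runs by query language before calling `mrr_at_k`.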
Best current cascade using this model
On the 80-candidate in-domain holdout, the strongest measured serving path in this cycle is:
l6-fast q8 -> top10 -> search-reranker-broad-v1 q8
Measured result:
- overall MRR@10 = 0.9433
- Irish MRR@10 = 0.8967
- English MRR@10 = 0.9900
- cascade query throughput = 15.20 qps
That cascade currently beats full broad reranking on the same 80-candidate holdout while being much faster.
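The cascade shape described here (fast model over the full pool, keep the top 10, strong model over the survivors) can be sketched as follows. This is a minimal illustration, not the serving code behind the measured numbers: the scorers are passed in as plain callables, and `cascade_rerank`, `Scorer`, and the `keep` parameter are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer takes a query and candidate texts and returns one score per candidate.
Scorer = Callable[[str, Sequence[str]], List[float]]


def rerank(query: str, docs: Sequence[str], scorer: Scorer) -> List[Tuple[str, float]]:
    """Score every candidate and sort by descending score (stable on ties)."""
    scores = scorer(query, docs)
    return sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)


def cascade_rerank(
    query: str,
    candidates: Sequence[str],
    fast_scorer: Scorer,
    strong_scorer: Scorer,
    keep: int = 10,
) -> List[str]:
    """Stage 1 prunes the pool with the fast model; stage 2 reorders the survivors."""
    stage1 = rerank(query, candidates, fast_scorer)
    head = [doc for doc, _ in stage1[:keep]]
    tail = [doc for doc, _ in stage1[keep:]]
    stage2 = rerank(query, head, strong_scorer)
    # Candidates below the cut keep their stage-1 order behind the reranked head.
    return [doc for doc, _ in stage2] + tail
```

In this arrangement the strong model only ever sees `keep` candidates per query, which is where the throughput gain over reranking all 80 candidates with the broad model would come from.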
Artifacts
- Raw HF checkpoint: ./
- ONNX dynamic q8 artifact: onnx/model_quantized.onnx
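A minimal sketch of scoring query/document pairs with the q8 artifact through `onnxruntime` and a `transformers` tokenizer. It rests on assumptions the card does not confirm: that the tokenizer files sit next to the raw checkpoint in the repo root, that the ONNX graph emits one relevance logit per pair as its first output, and that a local copy of the repo is available at `repo_dir`. Input names are read from the session rather than hard-coded.

```python
import os
from typing import List, Sequence


def order_by_score(docs: Sequence[str], scores: Sequence[float]) -> List[str]:
    """Return documents sorted by descending score."""
    ranked = sorted(zip(docs, scores), key=lambda pair: pair[1], reverse=True)
    return [doc for doc, _ in ranked]


def load_reranker(repo_dir: str, threads: int = 8):
    """Load the tokenizer and a CPU inference session from a local repo copy."""
    # Imported lazily so order_by_score works without these packages installed.
    import onnxruntime as ort
    from transformers import AutoTokenizer

    options = ort.SessionOptions()
    options.intra_op_num_threads = threads
    session = ort.InferenceSession(
        os.path.join(repo_dir, "onnx", "model_quantized.onnx"),
        sess_options=options,
        providers=["CPUExecutionProvider"],
    )
    # Assumption: tokenizer files are stored alongside the raw checkpoint.
    tokenizer = AutoTokenizer.from_pretrained(repo_dir)
    return tokenizer, session


def score_pairs(tokenizer, session, query: str, docs: Sequence[str], max_length: int = 128) -> List[float]:
    """Score (query, doc) pairs; higher means more relevant."""
    encoded = tokenizer(
        [query] * len(docs),
        list(docs),
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="np",
    )
    # Feed only the inputs the exported graph actually declares.
    wanted = {graph_input.name for graph_input in session.get_inputs()}
    feed = {name: array for name, array in encoded.items() if name in wanted}
    # Assumption: the first output holds one logit per pair.
    logits = session.run(None, feed)[0]
    return [float(value) for value in logits.reshape(-1)]
```

The default `max_length=128` and `threads=8` mirror one of the benchmark configurations listed in this card; other lengths trade quality against throughput as the tables show.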
Training provenance
- Base model: cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
- Training mix: bilingual Irish/English proxy mix plus Irish government office-holder supervision
- Training metadata: training_info.json
Caveats
- The q8 artifact is optimized for CPU inference, not for maximum compression ratio.
- This model is best treated as a fast prefilter or low-latency reranker, not the final highest-quality reranker on very large candidate sets.
Portfolio comparison
Updated 2026-03-20 from local reranker reports only.
Use this section as the side-by-side public temsa reranker view. Cells are intentionally left out or summarized when the local report set does not contain a trustworthy non-quantized benchmark for the same public path.
General bilingual rerankers
| Repo | Primary role | Non-quantized path | Quantized path | Extra trustworthy signals |
|---|---|---|---|---|
| temsa/search-reranker-broad-v1 | Broad final-stage reranker | ONNX fp32 l256: proxy 0.9490 / 2.08 qps | Sibling q8 l160: proxy 0.9458 / 8.41 qps | office 0.8056; hard-k10 0.9815; holdout-a03 0.8759 |
| temsa/search-reranker-broad-v1-qint8 | Broad CPU q8 sibling | See temsa/search-reranker-broad-v1 | ONNX q8 l160: proxy 0.9458 / 8.41 qps | office 0.8056; hard-k10 0.9815; holdout-a03 0.8759 |
| temsa/search-reranker-irishgov-l6-fast-v1 | Fast stage-1 / cascade prefilter | PT l192: proxy 0.9192 / 7.99 qps | ONNX q8 l128: proxy 0.9218 / 27.25 qps | office 1.0000; holdout-a03 0.9259 |
| temsa/search-reranker-irishgov-l6-fast-v2 | Fast K=10 policy serving release | Same raw checkpoint family as v1; q8 with corrected temporal + office policy is the recommended path | ONNX q8 l160/t10: corrected holdout-v5 1.0000 / 47.68 qps | office-holdout-v3 1.0000; office-valid 1.0000; legacy-holdout-v4 0.9375 |
| temsa/search-reranker-irishgov-l6-fast-v3 | Fastest public K=10 serving profile | Same raw checkpoint family as v2; the value is the shorter recommended serving length on corrected gates | ONNX q8 l128/t10: corrected holdout-v5 1.0000 / 48.27 qps | office-holdout-v3 1.0000; office-valid 1.0000; legacy-holdout-v4 0.9444 |
| temsa/search-reranker-irishgov-l6-fast-v4 | Current best fast K=10 reranker | Margin-MSE style continuation over v3; q8 per-channel is the recommended deployed artifact | ONNX q8 per-channel l128/t10: corrected holdout-v5 1.0000 / 50.78 qps | office-holdout-v3 1.0000; office-valid 1.0000; legacy-holdout-v4 0.9444 |
| temsa/search-reranker-irishgov-l6-fast-v5 | Fast K=10 serving release with shorter broad-policy route | Same raw checkpoint family as v4; the value is a shorter broad-policy serving profile while keeping the office route unchanged | ONNX q8 per-channel gov_broad_v1 l104/t10: corrected holdout-v5 1.0000 / 55.77 qps | fresh-holdout-v6 0.8917; office-holdout-v3 1.0000; office-valid 1.0000 |
| temsa/search-reranker-irishgov-l5-k10-v1 | Fast K=10 direct reranker | PT l160: proxy 0.8800 / 1.32 qps | ONNX q8 l140: proxy 0.8872 / 28.72 qps | office 1.0000; hard-k10 0.9815; holdout-a03 0.9630 |
| temsa/search-reranker-irishgov-l5-k10-v2 | Current K=10 successor | PT l128: office 1.0000, finephrase 0.9405 | ONNX q8 l140: proxy 0.8853 / 28.27 qps | office 1.0000; holdout-a03 1.0000 |
Policy rerankers
| Repo | Primary role | Non-quantized path | Quantized path | Extra trustworthy signals |
|---|---|---|---|---|
| temsa/search-reranker-broad-policy-v1 | Broad policy-tuned reranker | PT l224: policy-all 0.9270 / 7.94 qps | ONNX q8 l224: policy-all 0.9205 / 27.55 qps | office 0.9537; holdout-a04 0.9049 |
| temsa/search-reranker-broad-policy-v3 | Current broad policy successor | ONNX fp32 l224: policy-all 0.9259 / 15.75 qps | ONNX q8 reduce_range l224: policy-all 0.9268 / 30.12 qps | office 0.9676; holdout-v3 0.9286 |
| temsa/search-reranker-broad-policy-v4 | Current broad policy serving release | Same raw checkpoint family as v3; q8 gov_broad_v1 is the recommended path | ONNX q8 reduce_range + gov_broad_v1 l224: policy-all 0.9257 / 31.64 qps | office 0.9676; holdout-v3 0.9271; holdout-v4 0.9583 |
| temsa/search-reranker-broad-policy-v5 | Current broad policy serving release | Same raw checkpoint family as v4; q8 gov_broad_v1 is the recommended path | ONNX q8 reduce_range + gov_broad_v1 l224/t10: policy-all 0.9711 / 26.12 qps | office 1.0000; holdout-v3 1.0000; holdout-v4 1.0000 |
| temsa/search-reranker-broad-policy-v6 | Broad policy serving release with corrected temporal gate | Same raw checkpoint family as v5; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l176/t10: corrected holdout-v5 0.8657 / 25.74 qps | office-l160 1.0000; legacy-holdout-v4 0.9444; legacy-policy-all 0.9167 |
| temsa/search-reranker-broad-policy-v7 | Broad policy serving release with stronger latest-turn and topic-specific news routing | Same raw checkpoint family as v6; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l176/t10: fresh holdout-v6 0.9375 / 15.65 qps | office-l224 1.0000; office-holdout-v4 0.9659; corrected-holdout-v5 1.0000; legacy-policy-all 0.9196 |
| temsa/search-reranker-broad-policy-v8 | Broad policy serving-profile release with shorter broad and office routes | Same raw checkpoint family as v7; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l160/t10: fresh holdout-v7 0.9500 / 29.85 qps | office-l208 1.0000; office-holdout-v3 1.0000; office-holdout-v4 0.9659; fresh-holdout-v6 0.9500 |
| temsa/search-reranker-broad-policy-v9 | Broad policy serving release with stronger year-specific change-history routing | Same raw checkpoint family as v8; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l160/t10: gov.ie category-policy corrected 1.0000 / 26.01 qps | legacy-policy-all 0.9282; fresh-holdout-v6 0.9500; office-valid 1.0000; office-holdout-v4 0.9659 |
Dynamic q8 CPU at K=20
These rows use the current local K=20 benchmark harness:
- proxy_k20: policy=none
- broad_policy_k20: policy=gov_broad_v1
- office_holder_k20: policy=office_holder_v1
- fixed threads=10, batch_size=32
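The harness itself is not included in this card. As a rough sketch of how a queries-per-second figure could be measured under a fixed batch size, the helper below times a scoring callable over a list of (query, candidates) pairs. The `measure_qps` function and the `SUITE_POLICIES` mapping are illustrative names; only the suite names, policy names, and `batch_size=32` come from the list above. Thread count would be set on the inference runtime, not in this helper.

```python
import time
from typing import Callable, Dict, List, Sequence, Tuple

# Suite name -> serving policy, as listed for the K=20 harness.
SUITE_POLICIES: Dict[str, str] = {
    "proxy_k20": "none",
    "broad_policy_k20": "gov_broad_v1",
    "office_holder_k20": "office_holder_v1",
}

ScoreFn = Callable[[str, Sequence[str]], List[float]]


def measure_qps(
    score_fn: ScoreFn,
    queries: Sequence[Tuple[str, List[str]]],
    batch_size: int = 32,
) -> float:
    """Return queries per second for scoring every candidate of every query."""
    start = time.perf_counter()
    for query, docs in queries:
        # Candidates are scored in fixed-size batches; K=20 fits in one batch of 32.
        for offset in range(0, len(docs), batch_size):
            score_fn(query, docs[offset:offset + batch_size])
    elapsed = time.perf_counter() - start
    return len(queries) / elapsed if elapsed > 0 else float("inf")
```

Because throughput is counted per query rather than per document, the figure depends directly on the candidate count, which is one reason K=20 rows are not comparable with the K=10 or 80-candidate numbers elsewhere in this card.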
| Repo | Broad len | Office len | Proxy K=20 | Broad-policy K=20 | Office-holder K=20 | Notes |
|---|---|---|---|---|---|---|
| temsa/search-reranker-broad-v1 | 160 | 160 | 0.9458 / 6.93 qps | 0.9437 / 9.10 qps | 0.6073 / 8.23 qps | broad baseline |
| temsa/search-reranker-broad-v1-qint8 | 160 | 160 | 0.9458 / 6.65 qps | 0.9437 / 6.62 qps | 0.6073 / 5.91 qps | q8 sibling repo |
| temsa/search-reranker-broad-policy-v1 | 224 | 224 | 0.9414 / 7.20 qps | 0.9125 / 11.32 qps | 0.6966 / 6.49 qps | first policy model |
| temsa/search-reranker-broad-policy-v2 | 224 | 224 | 0.9414 / 6.44 qps | 0.9125 / 9.40 qps | 0.6966 / 7.75 qps | office-policy serving fix era |
| temsa/search-reranker-broad-policy-v3 | 224 | 224 | 0.9364 / 7.38 qps | 0.9250 / 12.20 qps | 0.7451 / 7.15 qps | custom CPU continuation |
| temsa/search-reranker-broad-policy-v4 | 224 | 224 | 0.9364 / 7.16 qps | 0.9250 / 11.40 qps | 0.7451 / 6.90 qps | gov_broad_v1 serving route |
| temsa/search-reranker-broad-policy-v5 | 224 | 224 | 0.9364 / 7.00 qps | 0.9250 / 12.35 qps | 0.7451 / 6.47 qps | stronger serving release |
| temsa/search-reranker-broad-policy-v6 | 176 | 160 | 0.9433 / 8.09 qps | 0.9250 / 9.97 qps | 0.7186 / 9.77 qps | corrected temporal gate |
| temsa/search-reranker-broad-policy-v7 | 176 | 224 | 0.9433 / 7.91 qps | 0.9250 / 12.11 qps | 0.7451 / 7.49 qps | stronger latest-turn routing |
| temsa/search-reranker-broad-policy-v8 | 160 | 208 | 0.9371 / 9.49 qps | 0.9250 / 11.21 qps | 0.7527 / 7.01 qps | current broad fallback |
| temsa/search-reranker-broad-policy-v9 | 160 | 208 | 0.9371 / 8.97 qps | 0.9250 / 11.56 qps | 0.7527 / 6.62 qps | current broad fallback + change-history fix |
| temsa/search-reranker-irishgov-l5-k10-v1 | 140 | 140 | 0.8872 / 15.56 qps | 0.8938 / 21.68 qps | 0.6751 / 21.01 qps | fast K=10 direct |
| temsa/search-reranker-irishgov-l5-k10-v2 | 140 | 140 | 0.8853 / 18.06 qps | 0.8854 / 18.87 qps | 0.6327 / 17.93 qps | fast K=10 successor |
| temsa/search-reranker-irishgov-l6-fast-v1 | 128 | 128 | 0.9286 / 18.70 qps | 0.9042 / 17.65 qps | 0.6934 / 20.36 qps | fast stage-1 v1 |
| temsa/search-reranker-irishgov-l6-fast-v2 | 128 | 128 | 0.9286 / 18.59 qps | 0.9042 / 19.67 qps | 0.6934 / 19.30 qps | fast stage-1 v2 |
| temsa/search-reranker-irishgov-l6-fast-v3 | 128 | 128 | 0.9286 / 18.48 qps | 0.9042 / 18.60 qps | 0.6934 / 19.30 qps | shorter serving profile |
| temsa/search-reranker-irishgov-l6-fast-v4 | 128 | 128 | 0.9184 / 17.66 qps | 0.9008 / 21.83 qps | 0.7197 / 17.31 qps | margin-MSE per-channel |
| temsa/search-reranker-irishgov-l6-fast-v5 | 104 | 128 | 0.8987 / 20.85 qps | 0.9133 / 20.29 qps | 0.7197 / 15.45 qps | current fast route |
Intentional gaps:
- search-reranker-broad-v1: the local reports include strong fp32 ONNX proxy and office data, but not a matching fp32 in-domain finephrase / hard-K10 / holdout-A03 set, so those are not claimed here.
- search-reranker-irishgov-l5-k10-v2: the local non-quantized reports are trustworthy for office / finephrase / hard-K10, but not for the same proxy runtime shape as the shipped q8 path.
- search-reranker-irishgov-l6-fast-v1: the local non-quantized reports cover proxy and office, but not the fresh holdout-A03 slice used for the q8 K=10 comparison.