search-reranker-irishgov-l6-fast-v1

Fast 6-layer bilingual reranker for Irish government search.

This model is the current fast-path reranker line from the openMed reranker cycle. It is derived from cross-encoder/mmarco-mMiniLMv2-L12-H384-v1, reduced to 6 layers, then fine-tuned on a bilingual Irish government-oriented mix.

Intended use

  • Best use: stage-1 prefilter before a stronger reranker, or direct reranking when the effective candidate set is about 10-20.
  • Strong current serving pattern: l6-fast -> top10 -> search-reranker-broad-v1.
  • Not a full replacement for the broad reranker on large 80-candidate pools when maximum final quality matters.
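
The serving pattern above can be sketched as a two-stage function. This is a minimal illustration, not the deployed code: `cascade_rerank`, the `Scorer` type, and the `keep` parameter are hypothetical names, and the two scorers stand in for l6-fast and search-reranker-broad-v1.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical scorer type: takes a query and candidate texts and returns one
# relevance score per candidate (higher is better).
Scorer = Callable[[str, Sequence[str]], List[float]]


def cascade_rerank(
    query: str,
    candidates: Sequence[str],
    stage1: Scorer,
    stage2: Scorer,
    keep: int = 10,
) -> List[Tuple[str, float]]:
    """Score all candidates with the fast model, then rescore only the top `keep`."""
    if not candidates:
        return []
    fast_scores = stage1(query, candidates)
    # Candidate indices ordered from best to worst according to the fast stage.
    order = sorted(range(len(candidates)), key=lambda i: fast_scores[i], reverse=True)
    shortlist = [candidates[i] for i in order[:keep]]
    # The stronger, slower model only sees the shortlist
    # (for keep=10 on an 80-candidate pool: 10 pairs instead of 80).
    strong_scores = stage2(query, shortlist)
    # Final order comes from the second stage alone.
    return sorted(zip(shortlist, strong_scores), key=lambda pair: pair[1], reverse=True)
```

The final ranking uses only the second-stage scores, so the fast model only needs to keep the relevant document inside the shortlist, not rank it first.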

Why publish this line

The broad reranker remains stronger on large candidate pools, but it is too expensive in the critical path when reranking large candidate sets on CPU.

This 6-layer line is valuable because:

  • it is much faster on CPU q8 (for example, 18.70 qps versus 6.93 qps for search-reranker-broad-v1 on the proxy K=20 benchmark)
  • it remains strong on the bilingual proxy and office-holder slices
  • it is the current best stage-1 model for the winning cascade

Key q8 benchmarks

| Suite | Config | Overall MRR@10 | Irish MRR@10 | English MRR@10 | Top-1 | QPS |
| --- | --- | --- | --- | --- | --- | --- |
| Public bilingual proxy | 128, threads=8 | 0.9218 | 0.8837 | 0.9600 | 0.8600 | 26.52 |
| Public bilingual proxy | 192, threads=8 | 0.9238 | 0.8977 | 0.9500 | 0.8650 | 16.52 |
| Office-holder eval | 128, threads=16 | 1.0000 | n/a | n/a | 1.0000 | 37.44 |
| 80-candidate in-domain holdout | 128, threads=16 | 0.8327 | 0.7054 | 0.9600 | 0.7400 | 5.19 |
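
For readers unfamiliar with the metrics, the sketch below shows how MRR@10 and a top-1 rate are conventionally computed. It assumes the standard definitions and that the Top-1 column is top-1 accuracy; this card does not document the evaluation harness, and the function names here are illustrative.

```python
from typing import Iterable, Sequence, Set, Tuple


def reciprocal_rank_at_k(ranked_ids: Sequence[str], relevant_ids: Set[str], k: int = 10) -> float:
    """Return 1/rank of the first relevant item within the top k, or 0.0 if none."""
    for rank, doc_id in enumerate(ranked_ids[:k], start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mrr_at_k(runs: Iterable[Tuple[Sequence[str], Set[str]]], k: int = 10) -> float:
    """Mean reciprocal rank over (ranked_ids, relevant_ids) pairs, one pair per query."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(reciprocal_rank_at_k(ranked, rel, k) for ranked, rel in runs) / len(runs)


def top1_accuracy(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Fraction of queries whose first-ranked item is relevant."""
    runs = list(runs)
    if not runs:
        return 0.0
    hits = sum(1 for ranked, rel in runs if ranked and ranked[0] in rel)
    return hits / len(runs)
```

Under these definitions, an MRR@10 of 1.0000 means the first relevant document was ranked first for every query, which is consistent with the matching Top-1 value in the office-holder row.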

Best current cascade using this model

On the 80-candidate in-domain holdout, the strongest measured serving path in this cycle is:

  • l6-fast q8 -> top10 -> search-reranker-broad-v1 q8

Measured result:

  • overall MRR@10 = 0.9433
  • Irish MRR@10 = 0.8967
  • English MRR@10 = 0.9900
  • cascade query throughput = 15.20 qps

That cascade currently beats full broad reranking on the same 80-candidate holdout while being much faster.

Artifacts

  • Raw HF checkpoint: ./
  • ONNX dynamic q8 artifact: onnx/model_quantized.onnx
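
A minimal sketch of loading the q8 artifact for CPU scoring follows. It assumes the repository root contains tokenizer files that `AutoTokenizer` can load and that the ONNX graph emits one logit per query-document pair; neither is confirmed by this card. The helper names `load_q8_reranker` and `score_pairs` are illustrative.

```python
from typing import List, Sequence

REPO_ID = "temsa/search-reranker-irishgov-l6-fast-v1"
ONNX_FILE = "onnx/model_quantized.onnx"


def load_q8_reranker(threads: int = 8):
    """Download the tokenizer and the q8 ONNX graph; return (tokenizer, session)."""
    # Imported lazily so score_pairs can be used without these packages installed.
    import onnxruntime as ort
    from huggingface_hub import hf_hub_download
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)  # assumes tokenizer files exist
    options = ort.SessionOptions()
    options.intra_op_num_threads = threads
    session = ort.InferenceSession(
        hf_hub_download(REPO_ID, ONNX_FILE),
        sess_options=options,
        providers=["CPUExecutionProvider"],
    )
    return tokenizer, session


def score_pairs(tokenizer, session, query: str, docs: Sequence[str], max_length: int = 128) -> List[float]:
    """Return one relevance score per document for a single query."""
    encoded = tokenizer(
        [query] * len(docs),
        list(docs),
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="np",
    )
    # Feed only the inputs the exported graph declares (some exports omit token_type_ids).
    wanted = {graph_input.name for graph_input in session.get_inputs()}
    feeds = {name: value for name, value in encoded.items() if name in wanted}
    logits = session.run(None, feeds)[0]
    # Assumption: output shape is (n_docs, 1), so flattening gives one score per document.
    return [float(value) for value in logits.reshape(-1)]
```

Typical use would be `tokenizer, session = load_q8_reranker(threads=8)` followed by `score_pairs(tokenizer, session, query, docs)`, with `max_length` set to match one of the benchmarked lengths (128 or 192).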

Training provenance

  • Base model: cross-encoder/mmarco-mMiniLMv2-L12-H384-v1
  • Training mix: bilingual Irish/English proxy mix plus Irish government office-holder supervision
  • Training metadata: training_info.json

Caveats

  • The q8 artifact is optimized for CPU inference, not for maximum compression ratio.
  • This model is best treated as a fast prefilter or low-latency reranker, not the final highest-quality reranker on very large candidate sets.

Portfolio comparison

Updated 2026-03-20 from local reranker reports only.

This section gives a side-by-side view of the public temsa rerankers. Cells are intentionally left out or summarized when the local report set does not contain a trustworthy non-quantized benchmark for the same public path.

General bilingual rerankers

| Repo | Primary role | Non-quantized path | Quantized path | Extra trustworthy signals |
| --- | --- | --- | --- | --- |
| temsa/search-reranker-broad-v1 | Broad final-stage reranker | ONNX fp32 l256: proxy 0.9490 / 2.08 qps | Sibling q8 l160: proxy 0.9458 / 8.41 qps | office 0.8056; hard-k10 0.9815; holdout-a03 0.8759 |
| temsa/search-reranker-broad-v1-qint8 | Broad CPU q8 sibling | See temsa/search-reranker-broad-v1 | ONNX q8 l160: proxy 0.9458 / 8.41 qps | office 0.8056; hard-k10 0.9815; holdout-a03 0.8759 |
| temsa/search-reranker-irishgov-l6-fast-v1 | Fast stage-1 / cascade prefilter | PT l192: proxy 0.9192 / 7.99 qps | ONNX q8 l128: proxy 0.9218 / 27.25 qps | office 1.0000; holdout-a03 0.9259 |
| temsa/search-reranker-irishgov-l6-fast-v2 | Fast K=10 policy serving release | Same raw checkpoint family as v1; q8 with corrected temporal + office policy is the recommended path | ONNX q8 l160/t10: corrected holdout-v5 1.0000 / 47.68 qps | office-holdout-v3 1.0000; office-valid 1.0000; legacy-holdout-v4 0.9375 |
| temsa/search-reranker-irishgov-l6-fast-v3 | Fastest public K=10 serving profile | Same raw checkpoint family as v2; the value is the shorter recommended serving length on corrected gates | ONNX q8 l128/t10: corrected holdout-v5 1.0000 / 48.27 qps | office-holdout-v3 1.0000; office-valid 1.0000; legacy-holdout-v4 0.9444 |
| temsa/search-reranker-irishgov-l6-fast-v4 | Current best fast K=10 reranker | Margin-MSE style continuation over v3; q8 per-channel is the recommended deployed artifact | ONNX q8 per-channel l128/t10: corrected holdout-v5 1.0000 / 50.78 qps | office-holdout-v3 1.0000; office-valid 1.0000; legacy-holdout-v4 0.9444 |
| temsa/search-reranker-irishgov-l6-fast-v5 | Fast K=10 serving release with shorter broad-policy route | Same raw checkpoint family as v4; the value is a shorter broad-policy serving profile while keeping the office route unchanged | ONNX q8 per-channel gov_broad_v1 l104/t10: corrected holdout-v5 1.0000 / 55.77 qps | fresh-holdout-v6 0.8917; office-holdout-v3 1.0000; office-valid 1.0000 |
| temsa/search-reranker-irishgov-l5-k10-v1 | Fast K=10 direct reranker | PT l160: proxy 0.8800 / 1.32 qps | ONNX q8 l140: proxy 0.8872 / 28.72 qps | office 1.0000; hard-k10 0.9815; holdout-a03 0.9630 |
| temsa/search-reranker-irishgov-l5-k10-v2 | Current K=10 successor | PT l128: office 1.0000, finephrase 0.9405 | ONNX q8 l140: proxy 0.8853 / 28.27 qps | office 1.0000; holdout-a03 1.0000 |

Policy rerankers

| Repo | Primary role | Non-quantized path | Quantized path | Extra trustworthy signals |
| --- | --- | --- | --- | --- |
| temsa/search-reranker-broad-policy-v1 | Broad policy-tuned reranker | PT l224: policy-all 0.9270 / 7.94 qps | ONNX q8 l224: policy-all 0.9205 / 27.55 qps | office 0.9537; holdout-a04 0.9049 |
| temsa/search-reranker-broad-policy-v3 | Current broad policy successor | ONNX fp32 l224: policy-all 0.9259 / 15.75 qps | ONNX q8 reduce_range l224: policy-all 0.9268 / 30.12 qps | office 0.9676; holdout-v3 0.9286 |
| temsa/search-reranker-broad-policy-v4 | Current broad policy serving release | Same raw checkpoint family as v3; q8 gov_broad_v1 is the recommended path | ONNX q8 reduce_range + gov_broad_v1 l224: policy-all 0.9257 / 31.64 qps | office 0.9676; holdout-v3 0.9271; holdout-v4 0.9583 |
| temsa/search-reranker-broad-policy-v5 | Current broad policy serving release | Same raw checkpoint family as v4; q8 gov_broad_v1 is the recommended path | ONNX q8 reduce_range + gov_broad_v1 l224/t10: policy-all 0.9711 / 26.12 qps | office 1.0000; holdout-v3 1.0000; holdout-v4 1.0000 |
| temsa/search-reranker-broad-policy-v6 | Broad policy serving release with corrected temporal gate | Same raw checkpoint family as v5; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l176/t10: corrected holdout-v5 0.8657 / 25.74 qps | office-l160 1.0000; legacy-holdout-v4 0.9444; legacy-policy-all 0.9167 |
| temsa/search-reranker-broad-policy-v7 | Broad policy serving release with stronger latest-turn and topic-specific news routing | Same raw checkpoint family as v6; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l176/t10: fresh holdout-v6 0.9375 / 15.65 qps | office-l224 1.0000; office-holdout-v4 0.9659; corrected-holdout-v5 1.0000; legacy-policy-all 0.9196 |
| temsa/search-reranker-broad-policy-v8 | Broad policy serving-profile release with shorter broad and office routes | Same raw checkpoint family as v7; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l160/t10: fresh holdout-v7 0.9500 / 29.85 qps | office-l208 1.0000; office-holdout-v3 1.0000; office-holdout-v4 0.9659; fresh-holdout-v6 0.9500 |
| temsa/search-reranker-broad-policy-v9 | Broad policy serving release with stronger year-specific change-history routing | Same raw checkpoint family as v8; q8 gov_broad_v1 + office_holder_v1 is the recommended path | ONNX q8 gov_broad_v1 l160/t10: gov.ie category-policy corrected 1.0000 / 26.01 qps | legacy-policy-all 0.9282; fresh-holdout-v6 0.9500; office-valid 1.0000; office-holdout-v4 0.9659 |

Dynamic q8 CPU at K=20

These rows use the current local K=20 benchmark harness:

  • proxy_k20: policy=none
  • broad_policy_k20: policy=gov_broad_v1
  • office_holder_k20: policy=office_holder_v1
  • fixed threads=10, batch_size=32

| Repo | Broad len | Office len | Proxy K=20 | Broad-policy K=20 | Office-holder K=20 | Notes |
| --- | --- | --- | --- | --- | --- | --- |
| temsa/search-reranker-broad-v1 | 160 | 160 | 0.9458 / 6.93 qps | 0.9437 / 9.10 qps | 0.6073 / 8.23 qps | broad baseline |
| temsa/search-reranker-broad-v1-qint8 | 160 | 160 | 0.9458 / 6.65 qps | 0.9437 / 6.62 qps | 0.6073 / 5.91 qps | q8 sibling repo |
| temsa/search-reranker-broad-policy-v1 | 224 | 224 | 0.9414 / 7.20 qps | 0.9125 / 11.32 qps | 0.6966 / 6.49 qps | first policy model |
| temsa/search-reranker-broad-policy-v2 | 224 | 224 | 0.9414 / 6.44 qps | 0.9125 / 9.40 qps | 0.6966 / 7.75 qps | office-policy serving fix era |
| temsa/search-reranker-broad-policy-v3 | 224 | 224 | 0.9364 / 7.38 qps | 0.9250 / 12.20 qps | 0.7451 / 7.15 qps | custom CPU continuation |
| temsa/search-reranker-broad-policy-v4 | 224 | 224 | 0.9364 / 7.16 qps | 0.9250 / 11.40 qps | 0.7451 / 6.90 qps | gov_broad_v1 serving route |
| temsa/search-reranker-broad-policy-v5 | 224 | 224 | 0.9364 / 7.00 qps | 0.9250 / 12.35 qps | 0.7451 / 6.47 qps | stronger serving release |
| temsa/search-reranker-broad-policy-v6 | 176 | 160 | 0.9433 / 8.09 qps | 0.9250 / 9.97 qps | 0.7186 / 9.77 qps | corrected temporal gate |
| temsa/search-reranker-broad-policy-v7 | 176 | 224 | 0.9433 / 7.91 qps | 0.9250 / 12.11 qps | 0.7451 / 7.49 qps | stronger latest-turn routing |
| temsa/search-reranker-broad-policy-v8 | 160 | 208 | 0.9371 / 9.49 qps | 0.9250 / 11.21 qps | 0.7527 / 7.01 qps | current broad fallback |
| temsa/search-reranker-broad-policy-v9 | 160 | 208 | 0.9371 / 8.97 qps | 0.9250 / 11.56 qps | 0.7527 / 6.62 qps | current broad fallback + change-history fix |
| temsa/search-reranker-irishgov-l5-k10-v1 | 140 | 140 | 0.8872 / 15.56 qps | 0.8938 / 21.68 qps | 0.6751 / 21.01 qps | fast K=10 direct |
| temsa/search-reranker-irishgov-l5-k10-v2 | 140 | 140 | 0.8853 / 18.06 qps | 0.8854 / 18.87 qps | 0.6327 / 17.93 qps | fast K=10 successor |
| temsa/search-reranker-irishgov-l6-fast-v1 | 128 | 128 | 0.9286 / 18.70 qps | 0.9042 / 17.65 qps | 0.6934 / 20.36 qps | fast stage-1 v1 |
| temsa/search-reranker-irishgov-l6-fast-v2 | 128 | 128 | 0.9286 / 18.59 qps | 0.9042 / 19.67 qps | 0.6934 / 19.30 qps | fast stage-1 v2 |
| temsa/search-reranker-irishgov-l6-fast-v3 | 128 | 128 | 0.9286 / 18.48 qps | 0.9042 / 18.60 qps | 0.6934 / 19.30 qps | shorter serving profile |
| temsa/search-reranker-irishgov-l6-fast-v4 | 128 | 128 | 0.9184 / 17.66 qps | 0.9008 / 21.83 qps | 0.7197 / 17.31 qps | margin-MSE per-channel |
| temsa/search-reranker-irishgov-l6-fast-v5 | 104 | 128 | 0.8987 / 20.85 qps | 0.9133 / 20.29 qps | 0.7197 / 15.45 qps | current fast route |
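
Each cell above pairs a quality score with a throughput figure in queries per second. The local harness is not published with this card, so the following is only one plausible way to measure qps with a fixed batch size; `measure_qps` and `score_batch` are hypothetical names, and thread count would be set on the inference session rather than here.

```python
import time
from typing import Callable, Sequence, Tuple


def measure_qps(
    score_batch: Callable[[str, Sequence[str]], object],
    queries: Sequence[Tuple[str, Sequence[str]]],
    batch_size: int = 32,
) -> float:
    """Return queries per second when each query's candidates are scored in batches."""
    start = time.perf_counter()
    for query, docs in queries:
        # Score the candidate list in chunks of at most `batch_size` pairs.
        for offset in range(0, len(docs), batch_size):
            score_batch(query, docs[offset:offset + batch_size])
    elapsed = time.perf_counter() - start
    # Guard against a zero-length interval on very fast stub scorers.
    return len(queries) / elapsed if elapsed > 0 else float("inf")
```

With K=20 and batch_size=32, every query fits in a single batch, so this measurement reflects per-query latency more than batching efficiency.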

Intentional gaps:

  • search-reranker-broad-v1: the local reports include strong fp32 ONNX proxy and office data, but not a matching fp32 in-domain finephrase / hard-K10 / holdout-A03 set, so those are not claimed here.
  • search-reranker-irishgov-l5-k10-v2: the local non-quantized reports are trustworthy for office / finephrase / hard-K10, but not for the same proxy runtime shape as the shipped q8 path.
  • search-reranker-irishgov-l6-fast-v1: the local non-quantized reports cover proxy and office, but not the fresh holdout-A03 slice used for the q8 K=10 comparison.