CrossEncoder based on BAAI/bge-reranker-v2-m3

This is a Cross Encoder model fine-tuned from BAAI/bge-reranker-v2-m3 using the sentence-transformers library. It computes a score for each pair of texts; these scores can be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: BAAI/bge-reranker-v2-m3
  • Maximum Sequence Length: 1024 tokens
  • Number of Output Labels: 1 label

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of texts
pairs = [
    ['Margraviate of the country of the Botanical Garden of the place Josef Victor Rohon was educated is an instance of?', "New York City. The Great Irish Famine brought a large influx of Irish immigrants. Over 200,000 were living in New York by 1860, upwards of a quarter of the city's population. There was also extensive immigration from the German provinces, where revolutions had disrupted societies, and Germans comprised another 25% of New York's population by 1860."],
    ['What is the equivalent of the country having signed an agreement with Nasser in 1954 for the agency appointing the members of the public company accounting oversight board?', 'Monographs in Systematic Botany. Monographs in Systematic Botany also known as Monographs in Systematic Botany from the Missouri Botanical Garden is a series of monographs relating to the study of systematic botany. It is published by the Missouri Botanical Garden Press.'],
    ["What county is Charlotte Ray's birth city located?", 'Charlotte Ray. Charlotte Ray is a beauty queen from Camden, New Jersey who competed in the Miss USA and Miss World pageants.'],
    ['When did the name Black Death officially take root in the country where the author of Remains of Elmet is from?', 'Black Death. Gasquet (1908) claimed that the Latin name atra mors (Black Death) for the 14th-century epidemic first appeared in modern times in 1631 in a book on Danish history by J.I. Pontanus: "Vulgo & ab effectu atram mortem vocatibant. ("Commonly and from its effects, they called it the black death"). The name spread through Scandinavia and then Germany, gradually becoming attached to the mid 14th-century epidemic as a proper name. In England, it was not until 1823 that the medieval epidemic was first called the Black Death.'],
    ['When was the armistice signed between the Central powers and the country whose capitol was home to the man after whom Korolyov was named?', "Modern history. Another action in 1917 that is of note was the armistice signed between Russia and the Central Powers at Brest-Litovsk. As a condition for peace, the treaty by the Central Powers conceded huge portions of the former Russian Empire to Imperial Germany and the Ottoman Empire, greatly upsetting nationalists and conservatives. The Bolsheviks made peace with the German Empire and the Central Powers, as they had promised the Russian people prior to the Revolution. Vladimir Lenin's decision has been attributed to his sponsorship by the foreign office of Wilhelm II, German Emperor, offered by the latter in hopes that with a revolution, Russia would withdraw from World War I. This suspicion was bolstered by the German Foreign Ministry's sponsorship of Lenin's return to Petrograd. The Western Allies expressed their dismay at the Bolsheviks, upset at:"],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Margraviate of the country of the Botanical Garden of the place Josef Victor Rohon was educated is an instance of?',
    [
        "New York City. The Great Irish Famine brought a large influx of Irish immigrants. Over 200,000 were living in New York by 1860, upwards of a quarter of the city's population. There was also extensive immigration from the German provinces, where revolutions had disrupted societies, and Germans comprised another 25% of New York's population by 1860.",
        'Monographs in Systematic Botany. Monographs in Systematic Botany also known as Monographs in Systematic Botany from the Missouri Botanical Garden is a series of monographs relating to the study of systematic botany. It is published by the Missouri Botanical Garden Press.',
        'Charlotte Ray. Charlotte Ray is a beauty queen from Camden, New Jersey who competed in the Miss USA and Miss World pageants.',
        'Black Death. Gasquet (1908) claimed that the Latin name atra mors (Black Death) for the 14th-century epidemic first appeared in modern times in 1631 in a book on Danish history by J.I. Pontanus: "Vulgo & ab effectu atram mortem vocatibant. ("Commonly and from its effects, they called it the black death"). The name spread through Scandinavia and then Germany, gradually becoming attached to the mid 14th-century epidemic as a proper name. In England, it was not until 1823 that the medieval epidemic was first called the Black Death.',
        "Modern history. Another action in 1917 that is of note was the armistice signed between Russia and the Central Powers at Brest-Litovsk. As a condition for peace, the treaty by the Central Powers conceded huge portions of the former Russian Empire to Imperial Germany and the Ottoman Empire, greatly upsetting nationalists and conservatives. The Bolsheviks made peace with the German Empire and the Central Powers, as they had promised the Russian people prior to the Revolution. Vladimir Lenin's decision has been attributed to his sponsorship by the foreign office of Wilhelm II, German Emperor, offered by the latter in hopes that with a revolution, Russia would withdraw from World War I. This suspicion was bolstered by the German Foreign Ministry's sponsorship of Lenin's return to Petrograd. The Western Allies expressed their dismay at the Bolsheviks, upset at:",
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
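
The list returned by model.rank is ordered from the highest score to the lowest. As a minimal sketch of that ordering step, assuming the scores have already been computed, the helper below sorts document indices by descending score and optionally keeps only the best few. The name rank_by_score and the example scores are hypothetical and are not part of the library or of this model's outputs.

```python
from typing import Dict, List, Optional, Sequence, Union


def rank_by_score(
    scores: Sequence[float], top_k: Optional[int] = None
) -> List[Dict[str, Union[int, float]]]:
    """Order document indices by descending score (hypothetical helper)."""
    # Sort indices so that the highest-scoring document comes first.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]
    # Mirror the {'corpus_id': ..., 'score': ...} shape shown above.
    return [{"corpus_id": i, "score": float(scores[i])} for i in order]


# Made-up scores for five documents; real values would come from model.predict.
example_scores = [0.02, 0.01, 0.97, 0.40, 0.15]
ranked = rank_by_score(example_scores, top_k=3)
print(ranked)
```

Each corpus_id is the position of the document in the input list, so the original text can be looked up by index.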

Evaluation

Metrics

Cross Encoder Binary Classification

Metric              validation  train_subset
accuracy            0.9748      1.0
accuracy_threshold  0.055       0.0001
f1                  0.9744      1.0
f1_threshold        0.055       0.0001
precision           0.9913      1.0
recall              0.958       1.0
average_precision   0.9975      1.0
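
The threshold-based metrics above can be reproduced from scores and binary labels. The sketch below is an illustration, not the evaluator's own code: it assumes that a score at or above the threshold counts as a positive prediction, and the scores and labels are made up. It also checks that the reported validation F1 is consistent with the reported precision and recall.

```python
from typing import Dict, Sequence


def classification_metrics(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Dict[str, float]:
    """Accuracy, precision, recall and F1 at a fixed score threshold."""
    # Assumption: scores at or above the threshold are predicted positive.
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / len(labels)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up scores and labels, evaluated at the validation threshold of 0.055.
metrics = classification_metrics(
    scores=[0.90, 0.80, 0.03, 0.10, 0.01, 0.60],
    labels=[1, 1, 1, 0, 0, 0],
    threshold=0.055,
)
print(metrics)

# Consistency check on the reported validation numbers: F1 = 2PR / (P + R).
reported_f1 = 2 * 0.9913 * 0.958 / (0.9913 + 0.958)
print(round(reported_f1, 4))
```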

Training Details

Training Dataset

Unnamed Dataset

  • Size: 232 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 232 samples:
               sentence_0              sentence_1               label
    type       string                  string                   float
    details    min: 39 characters      min: 116 characters      min: 0.0
               mean: 95.96 characters  mean: 517.13 characters  mean: 0.5
               max: 173 characters     max: 1664 characters     max: 1.0
  • Samples:
    1. sentence_0: Margraviate of the country of the Botanical Garden of the place Josef Victor Rohon was educated is an instance of?
       sentence_1: New York City. The Great Irish Famine brought a large influx of Irish immigrants. Over 200,000 were living in New York by 1860, upwards of a quarter of the city's population. There was also extensive immigration from the German provinces, where revolutions had disrupted societies, and Germans comprised another 25% of New York's population by 1860.
       label: 0.0
    2. sentence_0: What is the equivalent of the country having signed an agreement with Nasser in 1954 for the agency appointing the members of the public company accounting oversight board?
       sentence_1: Monographs in Systematic Botany. Monographs in Systematic Botany also known as Monographs in Systematic Botany from the Missouri Botanical Garden is a series of monographs relating to the study of systematic botany. It is published by the Missouri Botanical Garden Press.
       label: 0.0
    3. sentence_0: What county is Charlotte Ray's birth city located?
       sentence_1: Charlotte Ray. Charlotte Ray is a beauty queen from Camden, New Jersey who competed in the Miss USA and Miss World pageants.
       label: 1.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
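
With an identity activation and no pos_weight, a loss of this kind is applied directly to the raw logit that the model outputs for each pair. As a rough sketch in plain Python, assuming the standard binary cross-entropy-with-logits formula (this is an illustration, not the library's own implementation), the per-pair loss and its batch mean can be written as:

```python
import math
from typing import Sequence


def bce_with_logits(logit: float, label: float) -> float:
    """Binary cross-entropy on a raw logit, in a numerically stable form.

    Equivalent to -[y * log(sigmoid(z)) + (1 - y) * log(1 - sigmoid(z))].
    """
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mean_bce_with_logits(logits: Sequence[float], labels: Sequence[float]) -> float:
    """Average the per-pair loss over a batch."""
    return sum(bce_with_logits(z, y) for z, y in zip(logits, labels)) / len(logits)


# Made-up logits for a batch of four pairs (the batch size used in training).
batch_loss = mean_bce_with_logits([2.0, -3.0, 0.5, -1.0], [1.0, 0.0, 1.0, 0.0])
print(batch_loss)
```

A confident correct prediction (large positive logit with label 1.0, or large negative logit with label 0.0) contributes a loss near zero, while a logit of 0 contributes log 2 regardless of the label.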
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}
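
To make the schedule settings concrete: with lr_scheduler_type linear, learning_rate 5e-05 and no warmup, the learning rate falls in a straight line from its initial value to zero over the run. The sketch below assumes the usual linear-decay formula and takes the total of 174 optimizer steps from the last row of the training logs; the function name linear_lr is hypothetical.

```python
def linear_lr(
    step: int,
    total_steps: int = 174,
    base_lr: float = 5e-05,
    warmup_steps: int = 0,
) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr (unused here, since warmup_steps is 0).
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * remaining


for step in (0, 58, 116, 174):
    print(step, linear_lr(step))
```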

Training Logs

Epoch   Step  validation_average_precision  train_subset_average_precision
1.0     58    0.9975                        -
1.7241  100   0.9942                        1.0000
2.0     116   0.9968                        -
3.0     174   0.9975                        -
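
The Epoch and Step columns follow from the dataset size and batch size. As a quick check, assuming a single device (gradient_accumulation_steps is 1 in the hyperparameters above), 232 samples at 4 per batch give 58 steps per epoch, so 3 epochs take 174 steps and step 100 falls at epoch 1.7241:

```python
import math

num_samples = 232   # training set size
batch_size = 4      # per_device_train_batch_size (single device assumed)
num_epochs = 3      # num_train_epochs

steps_per_epoch = math.ceil(num_samples / batch_size)
total_steps = steps_per_epoch * num_epochs
epoch_at_step_100 = round(100 / steps_per_epoch, 4)

print(steps_per_epoch, total_steps, epoch_at_step_100)
```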

Framework Versions

  • Python: 3.11.6
  • Sentence Transformers: 5.2.0
  • Transformers: 4.44.2
  • PyTorch: 2.9.1+cu128
  • Accelerate: 1.12.0
  • Datasets: 4.0.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
Model size: 0.6B params (Safetensors, tensor type F32)