CrossEncoder based on cross-encoder/ms-marco-MiniLM-L6-v2

This is a Cross Encoder model finetuned from cross-encoder/ms-marco-MiniLM-L6-v2 using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

Model Sources

Full Model Architecture

CrossEncoder(
  (0): Transformer({'transformer_task': 'sequence-classification', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'logits'}}, 'module_output_name': 'scores', 'architecture': 'BertForSequenceClassification'})
)

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("jmroth/nlp-reranker-finetuned-optim")
# Get scores for pairs of inputs
pairs = [
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', 'At very high concentrations (100 times atmospheric concentration, or greater), carbon dioxide can be toxic to animal life, so raising the concentration to 10,000 ppm (1%) or higher for several hours will eliminate pests such as whiteflies and spider mites in a greenhouse.'],
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', 'Plants can grow as much as 50 percent faster in concentrations of 1,000 ppm CO 2 when compared with ambient conditions, though this assumes no change in climate and no limitation on other nutrients.'],
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', 'Higher carbon dioxide concentrations will favourably affect plant growth and demand for water.'],
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', 'Use of fertilizers are beneficial in providing nutrients to plants although they have some negative environmental effects.'],
    ['Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.', 'Studies have shown that higher CO2 levels lead to reduced plant uptake of nitrogen (and a smaller number showing the same for trace elements such as zinc) resulting in crops with lower nutritional value.'],
]
scores = model.predict(pairs)
print(scores)
# [0.9202 0.8442 0.8343 0.3074 0.9641]

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.',
    [
        'At very high concentrations (100 times atmospheric concentration, or greater), carbon dioxide can be toxic to animal life, so raising the concentration to 10,000 ppm (1%) or higher for several hours will eliminate pests such as whiteflies and spider mites in a greenhouse.',
        'Plants can grow as much as 50 percent faster in concentrations of 1,000 ppm CO 2 when compared with ambient conditions, though this assumes no change in climate and no limitation on other nutrients.',
        'Higher carbon dioxide concentrations will favourably affect plant growth and demand for water.',
        'Use of fertilizers are beneficial in providing nutrients to plants although they have some negative environmental effects.',
        'Studies have shown that higher CO2 levels lead to reduced plant uptake of nitrogen (and a smaller number showing the same for trace elements such as zinc) resulting in crops with lower nutritional value.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
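
The list returned by model.rank is, as far as the library documents it, ordered from highest to lowest score. As a minimal sketch of that ordering step, the snippet below sorts the example scores printed above without loading the model. The helper name rank_from_scores is hypothetical and not part of the library.

```python
def rank_from_scores(scores):
    """Return [{'corpus_id': i, 'score': s}, ...] sorted by descending score."""
    # Sort candidate indices by their score, best first
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": scores[i]} for i in order]

# Scores copied from the predict() example above
example_scores = [0.9202, 0.8442, 0.8343, 0.3074, 0.9641]
ranking = rank_from_scores(example_scores)
print([r["corpus_id"] for r in ranking])  # [4, 0, 1, 2, 3]
```

Here corpus_id is the position of the candidate in the input list, so the fifth passage (index 4) comes out on top for this query.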

Evaluation

Metrics

Cross Encoder Reranking

Metric     Value
map        0.2953
mrr@10     0.4032
ndcg@10    0.3377
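
For readers unfamiliar with these metrics, here is a minimal sketch of how average precision, reciprocal rank@10 and NDCG@10 are commonly computed for one query from binary relevance labels listed in ranked order. The function names and the toy input are illustrative assumptions. The evaluator that produced the numbers above may differ in details such as tie handling or how relevant documents missing from the candidate list are counted.

```python
import math

def average_precision(rels):
    """Average precision over a ranked list of 0/1 relevance labels."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0

def reciprocal_rank(rels, k=10):
    """1 / rank of the first relevant item within the top k, else 0."""
    for rank, rel in enumerate(rels[:k], start=1):
        if rel:
            return 1.0 / rank
    return 0.0

def ndcg(rels, k=10):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    def dcg(labels):
        return sum(rel / math.log2(rank + 1)
                   for rank, rel in enumerate(labels, start=1))
    ideal = dcg(sorted(rels, reverse=True)[:k])
    return dcg(rels[:k]) / ideal if ideal else 0.0

# Toy ranking: relevant passages at positions 2 and 4
ranked_labels = [0, 1, 0, 1, 0]
print(average_precision(ranked_labels))  # 0.5
print(reciprocal_rank(ranked_labels))    # 0.5
print(round(ndcg(ranked_labels), 4))
```

Averaging each per-query value over all evaluation queries gives corpus-level figures of the kind reported in the table.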

Training Details

Training Dataset

Unnamed Dataset

  • Size: 16,402 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
                sentence1       sentence2       label
    type        string          string          float
    min         9 tokens        4 tokens        0.0
    mean        26.48 tokens    33.8 tokens     0.24
    max         54 tokens       475 tokens      1.0
  • Samples:
    1. sentence1: Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.
       sentence2: At very high concentrations (100 times atmospheric concentration, or greater), carbon dioxide can be toxic to animal life, so raising the concentration to 10,000 ppm (1%) or higher for several hours will eliminate pests such as whiteflies and spider mites in a greenhouse.
       label: 1.0
    2. sentence1: Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.
       sentence2: Plants can grow as much as 50 percent faster in concentrations of 1,000 ppm CO 2 when compared with ambient conditions, though this assumes no change in climate and no limitation on other nutrients.
       label: 1.0
    3. sentence1: Not only is there no scientific evidence that CO2 is a pollutant, higher CO2 concentrations actually help ecosystems support more plant and animal life.
       sentence2: Higher carbon dioxide concentrations will favourably affect plant growth and demand for water.
       label: 1.0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": 2.9791362285614014
    }
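
As a rough illustration of what pos_weight does, the sketch below writes out a positively weighted binary cross-entropy on raw logits, which is the usual form of this loss when the activation is the identity. It is plain Python rather than the library's implementation, and the function name is hypothetical.

```python
import math

def weighted_bce_with_logits(logit, label, pos_weight=1.0):
    """-[pos_weight * y * log(sigmoid(x)) + (1 - y) * log(1 - sigmoid(x))]"""
    def softplus(z):
        # Numerically stable log(1 + exp(z))
        return max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    # log(sigmoid(x)) = -softplus(-x) and log(1 - sigmoid(x)) = -softplus(x)
    return pos_weight * label * softplus(-logit) + (1.0 - label) * softplus(logit)

POS_WEIGHT = 2.9791362285614014  # value listed in the loss parameters

# At logit 0 the model is undecided, so only the weighting differs
loss_pos = weighted_bce_with_logits(0.0, 1.0, POS_WEIGHT)
loss_neg = weighted_bce_with_logits(0.0, 0.0, POS_WEIGHT)
print(loss_pos / loss_neg)
```

With this weight, an equally wrong prediction costs about three times as much on a positive pair as on a negative one. That is roughly in line with the label mean of 0.24 reported above, which suggests about one positive for every three negatives, though the card does not state how the weight was chosen.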
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 16
  • learning_rate: 1.4319807550234704e-06
  • weight_decay: 0.01
  • num_train_epochs: 1
  • warmup_steps: 0.1
  • fp16: True
  • load_best_model_at_end: True
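
To make the learning-rate settings concrete, here is a sketch of a linear schedule with warmup using the values listed above. It assumes that the fractional warmup_steps: 0.1 is interpreted as 10% of the total steps, and it takes the total of 1026 optimizer steps from the training logs. The function linear_lr is a hypothetical stand-in, not the scheduler code used in training.

```python
import math

PEAK_LR = 1.4319807550234704e-06  # learning_rate from the list above
TOTAL_STEPS = 1026                # final step in the training logs
WARMUP_FRACTION = 0.1             # assumption: warmup_steps < 1 is a ratio

def linear_lr(step, peak=PEAK_LR, total=TOTAL_STEPS,
              warmup_fraction=WARMUP_FRACTION):
    """Ramp linearly from 0 to peak, then decay linearly back to 0."""
    warmup = math.ceil(total * warmup_fraction)
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))

for step in (0, 50, 103, 500, 1026):
    print(step, linear_lr(step))
```

Under these assumptions the rate peaks at step 103 and reaches zero at the last step, so most of the single epoch runs below the already small peak value.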

All Hyperparameters

  • do_predict: False
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1.4319807550234704e-06
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: None
  • warmup_ratio: None
  • warmup_steps: 0.1
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • enable_jit_checkpoint: False
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • use_cpu: False
  • seed: 42
  • data_seed: None
  • bf16: False
  • fp16: True
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: -1
  • ddp_backend: None
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch_fused
  • optim_args: None
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • auto_find_batch_size: False
  • full_determinism: False
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • use_cache: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss claims-rerank-dev_ndcg@10
0.0097 10 2.7478 -
0.0195 20 3.5633 -
0.0292 30 3.1951 -
0.0390 40 4.3874 -
0.0487 50 4.1538 -
0.0585 60 2.9735 -
0.0682 70 2.8458 -
0.0780 80 3.1712 -
0.0877 90 2.1984 -
0.0975 100 1.9342 -
0.1072 110 2.7962 -
0.1170 120 2.4899 -
0.1267 130 2.9049 -
0.1365 140 2.8282 -
0.1462 150 2.3681 -
0.1559 160 2.1871 -
0.1657 170 2.6121 -
0.1754 180 2.9096 -
0.1852 190 2.0653 -
0.1949 200 2.2589 -
0.2047 210 2.0209 -
0.2144 220 2.4166 -
0.2242 230 2.4957 -
0.2339 240 2.2457 -
0.2437 250 2.3880 -
0.2534 260 1.9401 -
0.2632 270 2.4819 -
0.2729 280 1.9734 -
0.2827 290 1.9620 -
0.2924 300 1.6221 -
0.3021 310 2.1036 -
0.3119 320 2.3999 -
0.3216 330 2.1864 -
0.3314 340 2.2174 -
0.3411 350 1.5773 -
0.3509 360 1.8879 -
0.3606 370 2.0436 -
0.3704 380 1.6877 -
0.3801 390 1.9070 -
0.3899 400 2.1434 -
0.3996 410 2.0888 -
0.4094 420 2.0225 -
0.4191 430 1.7573 -
0.4288 440 1.7710 -
0.4386 450 1.7125 -
0.4483 460 1.8041 -
0.4581 470 1.9516 -
0.4678 480 1.7080 -
0.4776 490 1.6173 -
0.4873 500 1.9432 -
0.4971 510 1.6012 -
0.5068 520 1.9219 -
0.5166 530 1.7415 -
0.5263 540 1.4378 -
0.5361 550 1.5332 -
0.5458 560 1.3974 -
0.5556 570 1.4664 -
0.5653 580 1.5879 -
0.5750 590 1.5318 -
0.5848 600 1.2995 -
0.5945 610 1.4182 -
0.6043 620 1.1907 -
0.6140 630 1.3132 -
0.6238 640 2.3528 -
0.6335 650 1.4836 -
0.6433 660 2.0659 -
0.6530 670 1.6562 -
0.6628 680 1.4280 -
0.6725 690 1.8156 -
0.6823 700 1.3682 -
0.6920 710 1.4824 -
0.7018 720 1.6727 -
0.7115 730 1.5780 -
0.7212 740 1.2100 -
0.7310 750 1.8286 -
0.7407 760 1.4813 -
0.7505 770 1.4779 -
0.7602 780 1.4555 -
0.7700 790 1.9817 -
0.7797 800 1.6229 -
0.7895 810 1.5417 -
0.7992 820 1.4371 -
0.8090 830 1.5260 -
0.8187 840 1.4955 -
0.8285 850 1.5744 -
0.8382 860 1.3485 -
0.8480 870 2.0583 -
0.8577 880 1.5225 -
0.8674 890 1.1889 -
0.8772 900 1.7914 -
0.8869 910 1.7549 -
0.8967 920 1.3538 -
0.9064 930 1.7872 -
0.9162 940 1.7946 -
0.9259 950 1.4399 -
0.9357 960 1.3736 -
0.9454 970 1.7688 -
0.9552 980 1.2053 -
0.9649 990 1.5866 -
0.9747 1000 1.7944 -
0.9844 1010 1.6542 -
0.9942 1020 1.2554 -
1.0 1026 - 0.3377
  • The bold row denotes the saved checkpoint.
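
The step and epoch columns follow from the dataset size and batch size. A quick check, assuming a single device (the card lists gradient_accumulation_steps: 1 and dataloader_drop_last: False, but does not state the device count):

```python
import math

train_samples = 16_402  # size of the training dataset
batch_size = 16         # per_device_train_batch_size

# The last, partial batch still counts as a step when drop_last is False
steps_per_epoch = math.ceil(train_samples / batch_size)
print(steps_per_epoch)                 # 1026
print(round(10 / steps_per_epoch, 4))  # 0.0097, the epoch value logged at step 10
```

This matches the final logged step of 1026 at epoch 1.0.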

Training Time

  • Training: 52.2 seconds

Framework Versions

  • Python: 3.12.13
  • Sentence Transformers: 5.4.1
  • Transformers: 5.0.0
  • PyTorch: 2.10.0+cu128
  • Accelerate: 1.13.0
  • Datasets: 4.0.0
  • Tokenizers: 0.22.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
Model size: 22.7M params (Safetensors, tensor type F32)