CrossEncoder based on jhu-clsp/ettin-encoder-68m

This is a Cross Encoder model finetuned from jhu-clsp/ettin-encoder-68m on the ms_marco dataset using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: jhu-clsp/ettin-encoder-68m
  • Maximum Sequence Length: 7999 tokens
  • Number of Output Labels: 1 label
  • Training Dataset: ms_marco
  • Language: en

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("bansalaman18/reranker-msmarco-v1.1-ettin-encoder-68m-listnet")
# Get scores for pairs of texts
pairs = [
    ['what is the name of the vaccine for MMR', 'Measles, mumps and rubella (MMR) vaccines. The MMR vaccine also comes in combination with chickenpox (MMRV) for 18 month old children and contains small amounts of each of the viruses at a reduced strength and a small amount of the antibiotic neomycin. MMR is given at four years of age to children who did not get their second MMR vaccine at 18 months of age. The four year old MMR dose ends in December 2015. All people born during or since 1966 should check their immunisation status to ensure they have had two doses of a measles containing vaccine.'],
    ['what is the name of the vaccine for MMR', 'The MMR vaccine protects children against all three diseases and is given at 12 months of age. A second dose using MMRV vaccine is given at 18 months of age to also protect children from chickenpox. MMR is given at four years of age to children who did not get their second MMR vaccine at 18 months of age. The four year old MMR dose ends in December 2015. All people born during or since 1966 should check their immunisation status to ensure they have had two doses of a measles containing vaccine.'],
    ['what is the name of the vaccine for MMR', 'Measles vaccine is a vaccine that is very effective at preventing measles. After one dose 85% of children nine months of age and 95% over twelve months of age are immune. Nearly all of those who do not develop immunity after a single dose develop it after a second dose. Measles mumps rubella vaccine (MMR-II); MMR vaccine is a live attenuated viral vaccine used to induce immunity against measles, mumps and rubella.'],
    ['what is the name of the vaccine for MMR', 'The name of the measles vaccination is MMR. This is a three-in-one vaccination to protect against measles, mumps and rubella. The measles vaccine is a living, weakened form of the natural  measles virus. To make certain vaccines, viruses are weakened by  a process called cell culture adaptatio … n.. Cell culture  adaptation changes the natural measles virus so that it behaves  differently once it enters the body.'],
    ['what is the name of the vaccine for MMR', 'When the first dose of measles, mumps, rubella, and varicella vaccines is administered at ages 48 months and older, use of MMRV vaccine generally is preferred over separate injections of MMR and varicella vaccines. However, the overall risk of febrile seizures is very low for both options (about 8 out of every 10,000 children vaccinated with MMRV vaccine when they are 12-23 months old, and about 4 out of every 10,000 children vaccinated with the MMR and varicella vaccines at the same visit when they are 12-23 months old).'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'what is the name of the vaccine for MMR',
    [
        'Measles, mumps and rubella (MMR) vaccines. The MMR vaccine also comes in combination with chickenpox (MMRV) for 18 month old children and contains small amounts of each of the viruses at a reduced strength and a small amount of the antibiotic neomycin. MMR is given at four years of age to children who did not get their second MMR vaccine at 18 months of age. The four year old MMR dose ends in December 2015. All people born during or since 1966 should check their immunisation status to ensure they have had two doses of a measles containing vaccine.',
        'The MMR vaccine protects children against all three diseases and is given at 12 months of age. A second dose using MMRV vaccine is given at 18 months of age to also protect children from chickenpox. MMR is given at four years of age to children who did not get their second MMR vaccine at 18 months of age. The four year old MMR dose ends in December 2015. All people born during or since 1966 should check their immunisation status to ensure they have had two doses of a measles containing vaccine.',
        'Measles vaccine is a vaccine that is very effective at preventing measles. After one dose 85% of children nine months of age and 95% over twelve months of age are immune. Nearly all of those who do not develop immunity after a single dose develop it after a second dose. Measles mumps rubella vaccine (MMR-II); MMR vaccine is a live attenuated viral vaccine used to induce immunity against measles, mumps and rubella.',
        'The name of the measles vaccination is MMR. This is a three-in-one vaccination to protect against measles, mumps and rubella. The measles vaccine is a living, weakened form of the natural  measles virus. To make certain vaccines, viruses are weakened by  a process called cell culture adaptatio … n.. Cell culture  adaptation changes the natural measles virus so that it behaves  differently once it enters the body.',
        'When the first dose of measles, mumps, rubella, and varicella vaccines is administered at ages 48 months and older, use of MMRV vaccine generally is preferred over separate injections of MMR and varicella vaccines. However, the overall risk of febrile seizures is very low for both options (about 8 out of every 10,000 children vaccinated with MMRV vaccine when they are 12-23 months old, and about 4 out of every 10,000 children vaccinated with the MMR and varicella vaccines at the same visit when they are 12-23 months old).',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
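
As a rough illustration of how the two calls relate, the output of `rank` can be thought of as the `predict` scores sorted from highest to lowest. The sketch below uses made-up scores rather than real model output, and `rank_from_scores` is a hypothetical helper written for this example, not a function of the library.

```python
def rank_from_scores(scores):
    """Sort document indices by score, highest first (hypothetical helper)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [{"corpus_id": i, "score": scores[i]} for i in order]


# Made-up scores for five documents; real values would come from model.predict(pairs).
example_scores = [0.2, 1.5, -0.3, 0.9, 0.1]
ranking = rank_from_scores(example_scores)
print([entry["corpus_id"] for entry in ranking])
# [1, 3, 0, 4, 2]
```

Each entry keeps the index of the document in the original list, so the ranked texts can be looked up afterwards.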

Training Details

Training Dataset

ms_marco

  • Dataset: ms_marco at a47ee7a
  • Size: 78,704 training samples
  • Columns: query, docs, and labels
  • Approximate statistics based on the first 1000 samples:
    • query (string): min 11 characters, mean 34.2 characters, max 99 characters
    • docs (list): min 3 elements, mean 6.50 elements, max 10 elements
    • labels (list): min 3 elements, mean 6.50 elements, max 10 elements
  • Samples:
    query docs labels
    is leukemia genetic ["1 Certain genetic disorders, such as Down syndrome, are associated with an increased risk of leukemia. 2 Exposure to certain chemicals. 3 Exposure to certain chemicals, such as benzene — which is found in gasoline and is used by the chemical industry — also is linked to an increased risk of some kinds of leukemia. 4 Smoking. 1 Previous cancer treatment. 2 People who've had certain types of chemotherapy and radiation therapy for other cancers have an increased risk of developing certain types of leukemia. 3 Genetic disorders. 4 Genetic abnormalities seem to play a role in the development", 'Causes of Leukemia. Although researchers have studied the many cellular changes associated with leukemia, it is unknown why these changes occur. It is likely that certain risk factors are involved. Many factors (e.g., age, genetics) are unmodifiable (beyond control). It is now known that all cancers, including leukemia, begin as a mutation in the genetic material—the DNA (deoxyribonucleic aci... [1, 0, 0, 0, 0, ...]
    what does stratified seed mean ['Definition: Stratification is a means of simulating the chilling and warming that seeds would endure if left outdoors in their native climate, for the winter. Some seeds will stay dormant until triggered by a certain amount of time in cold temperature or warm, damp conditions.', 'Seeds of different species are stratified at different temperatures and for different periods of time. Some need periods of cold temperatures, between 34 and 41 degrees Fahrenheit, while some need warm stratification at 68 to 84 degrees. The ginkgo is a deciduous gymnosperm, more closely related to the pine than an oak tree. The ginkgo or maidenhair tree (Ginkgo biloba) is a living fossil. The earliest known fossil leaves of the ginkgo are 270 million years old.', 'In the wild, seed dormancy is usually overcome by the seed spending time in the ground through a winter period and having its hard seed coat softened up by frost and weathering action. By doing so the seed is undergoing a natural form of stratific... [1, 0, 0, 0, 0, ...]
    what is a calprotectin stool test tell ['Calprotectin is a stool (fecal) test that is used to detect inflammation in the intestines. Intestinal inflammation is associated with, for example, some bacterial infections and, in people with inflammatory bowel disease (IBD) , it is associated with disease activity and severity. The test may be ordered along with other stool tests, such as a stool culture to detect a bacterial infection, a stool white blood cell test, and/or a fecal occult blood test (FOBT) .', 'Faecal calprotectin is a substance that is released into the intestines in excess when there is any inflammation there. Its presence can mean a person has an inflammatory bowel disease such as Crohn’s disease or ulcerative colitis. These conditions can cause very similar symptoms to irritable bowel syndrome. ', "Faecal calprotectin is a biochemical measurement of calprotectin in the stool. Elevated faecal calprotectin indicates the migration of neutrophils to the intestinal mucosa, which occurs during intestinal inflammati... [1, 0, 0, 0, 0, ...]
  • Loss: ListNetLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "mini_batch_size": 16
    }
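
To make the listwise objective more concrete, here is a minimal sketch of the textbook ListNet top-one loss from the paper cited in the Citation section: a cross-entropy between the softmax of the labels and the softmax of the predicted scores for one query. It is written in plain Python for readability and is not the library's implementation; the `softmax` and `listnet_loss` helpers and the example scores are assumptions made for illustration, and `mini_batch_size` is not modelled. The scores are used unchanged, which is what an identity activation implies.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_loss(scores, labels):
    """Cross-entropy between the top-one distributions of labels and scores."""
    target = softmax([float(label) for label in labels])
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# One query with five documents; only the first is labelled relevant.
labels = [1, 0, 0, 0, 0]
good = listnet_loss([3.0, 0.0, 0.0, 0.0, 0.0], labels)  # relevant doc scored highest
bad = listnet_loss([0.0, 0.0, 0.0, 0.0, 3.0], labels)   # irrelevant doc scored highest
print(good < bad)
# True
```

Under this formulation, scoring every document identically yields a loss of log(n) for a list of n documents, which gives a simple reference point when reading loss values.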
    

Evaluation Dataset

ms_marco

  • Dataset: ms_marco at a47ee7a
  • Size: 1,000 evaluation samples
  • Columns: query, docs, and labels
  • Approximate statistics based on the first 1000 samples:
    • query (string): min 9 characters, mean 33.96 characters, max 110 characters
    • docs (list): min 3 elements, mean 6.50 elements, max 10 elements
    • labels (list): min 3 elements, mean 6.50 elements, max 10 elements
  • Samples:
    query docs labels
    what is the name of the vaccine for MMR ['Measles, mumps and rubella (MMR) vaccines. The MMR vaccine also comes in combination with chickenpox (MMRV) for 18 month old children and contains small amounts of each of the viruses at a reduced strength and a small amount of the antibiotic neomycin. MMR is given at four years of age to children who did not get their second MMR vaccine at 18 months of age. The four year old MMR dose ends in December 2015. All people born during or since 1966 should check their immunisation status to ensure they have had two doses of a measles containing vaccine.', 'The MMR vaccine protects children against all three diseases and is given at 12 months of age. A second dose using MMRV vaccine is given at 18 months of age to also protect children from chickenpox. MMR is given at four years of age to children who did not get their second MMR vaccine at 18 months of age. The four year old MMR dose ends in December 2015. All people born during or since 1966 should check their immunisation status to ensur... [1, 0, 0, 0, 0, ...]
    what is the smallest shark in the world ['Tweet. The smallest shark in the world is the Dwarf Lanternshark; a dogfish that lives in the Caribbean Sea. They grow to a maximum of 17 centimeters and eat small fish and shrimp. The next smallest shark in the world is the Pygmy Shark that grows up to 25 centimeters long. These sharks are very closely related.', 'The dwarf lanternshark (Etmopterus perryi) is a little-known species of dogfish shark in the family Etmopteridae and possibly the smallest shark in the world, reaching a maximum known length of 21.2 cm (8.3 in). Perhaps the smallest living shark species, male dwarf lanternsharks mature at a length of 16–17.5 cm (6.3–6.9 in) and females from a length of 15.5 cm (6.1 in) with 19–20 cm (7.5–7.9 in) long pregnant individuals known.', 'The Pale Catshark at 8.27 is the smallest shark in the world. The Pale Catshark is a rare cat shark of the Scyliorhinidae family and only specimen found on the Makassar Strait. ', "The world's biggest school of sharks was observed in Galapagos, i... [1, 0, 0, 0, 0, ...]
    how much does nicu cost per day ['According to a 2010 article in Managed Care Magazine, the average cost for infants in NICU is around $3,000 per day. This average does not include surgeries or helicopter transports.', '1 The website Ncbi.nlm.nih.gov provides information regarding the NICU; the cost according to them can reach $1,800 to $2,500 per night of care. 2 ASPE.gov is another site that provides information regarding NICU costs.', "My son was in the NICU for 11 days and the cost was about $80,000--if I remember right, that also includes the cost of the c-section and my hospital stay. He was not in an incubator, and I don't know how much more the charge for an incubator would be--probably minimal.", "MrsDeLaVara:Our LO 'cost' about $350,000 for 62 days in the hospital. She was in the high level care facility for about half of the time, which cost more. That includes my delivery too since it was at the same hospital. Her NICU doctors billed $1500 a day for the high level and about $700 for the step down nursery... [1, 0, 0, 0, 0, ...]
  • Loss: ListNetLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "mini_batch_size": 16
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 5
  • seed: 12
  • bf16: True
  • load_best_model_at_end: True
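
The sketch below collects the non-default values in a plain dictionary and works out how many optimizer steps they imply for the 78,704 training samples. It assumes a single device (the card does not state the device count) and uses the gradient accumulation value of 1 from the full hyperparameter list; the variable names are illustrative only.

```python
import math

# Non-default hyperparameters, copied from the list above.
non_default_args = {
    "eval_strategy": "steps",
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "learning_rate": 2e-05,
    "num_train_epochs": 5,
    "seed": 12,
    "bf16": True,
    "load_best_model_at_end": True,
}

train_samples = 78_704           # size of the training dataset
gradient_accumulation_steps = 1  # from the full hyperparameter list
num_devices = 1                  # assumption: not stated in the card

effective_batch = (
    non_default_args["per_device_train_batch_size"]
    * gradient_accumulation_steps
    * num_devices
)
steps_per_epoch = math.ceil(train_samples / effective_batch)
total_steps = steps_per_epoch * non_default_args["num_train_epochs"]
print(steps_per_epoch, total_steps)
# 4919 24595
```

These numbers agree with the training log, where step 100 is reported at epoch 0.0203 (100 / 4919).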

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss
0.0002 1 2.4083 -
0.0203 100 2.0981 2.0954
0.0407 200 2.0854 2.0919
0.0610 300 2.0847 2.0912
0.0813 400 2.0849 2.0870
0.1016 500 2.0767 2.0875
0.1220 600 2.0827 2.0849
0.1423 700 2.0727 2.0850
0.1626 800 2.0715 2.0846
0.1830 900 2.074 2.0839
0.2033 1000 2.0749 2.0831
0.2236 1100 2.0714 2.0832
0.2440 1200 2.0775 2.0825
0.2643 1300 2.0691 2.0830
0.2846 1400 2.0648 2.0820
0.3049 1500 2.0746 2.0821
0.3253 1600 2.074 2.0821
0.3456 1700 2.077 2.0834
0.3659 1800 2.085 2.0812
0.3863 1900 2.073 2.0817
0.4066 2000 2.0781 2.0810
0.4269 2100 2.0805 2.0819
0.4472 2200 2.0735 2.0813
0.4676 2300 2.067 2.0816
0.4879 2400 2.0705 2.0813
0.5082 2500 2.0738 2.0813
0.5286 2600 2.0711 2.0811
0.5489 2700 2.0672 2.0808
0.5692 2800 2.0721 2.0807
0.5896 2900 2.0757 2.0805
0.6099 3000 2.0611 2.0808
0.6302 3100 2.0705 2.0800
0.6505 3200 2.0614 2.0802
0.6709 3300 2.0613 2.0806
0.6912 3400 2.0691 2.0806
0.7115 3500 2.0681 2.0802
0.7319 3600 2.0714 2.0798
0.7522 3700 2.0661 2.0800
0.7725 3800 2.0736 2.0808
0.7928 3900 2.071 2.0806
0.8132 4000 2.0749 2.0796
0.8335 4100 2.0672 2.0799
0.8538 4200 2.0699 2.0801
0.8742 4300 2.0803 2.0799
0.8945 4400 2.0716 2.0803
0.9148 4500 2.0754 2.0797
0.9351 4600 2.057 2.0798
0.9555 4700 2.06 2.0797
0.9758 4800 2.0673 2.0804
0.9961 4900 2.0687 2.0799
1.0165 5000 2.0666 2.0818
1.0368 5100 2.0648 2.0817
1.0571 5200 2.0606 2.0817
1.0775 5300 2.0606 2.0838
1.0978 5400 2.0645 2.0801
1.1181 5500 2.0628 2.0834
1.1384 5600 2.061 2.0811
1.1588 5700 2.0659 2.0807
1.1791 5800 2.0729 2.0813
1.1994 5900 2.0564 2.0808
1.2198 6000 2.0681 2.0805
1.2401 6100 2.0611 2.0800
1.2604 6200 2.0616 2.0805
1.2807 6300 2.0557 2.0807
1.3011 6400 2.0585 2.0802
1.3214 6500 2.0541 2.0807
1.3417 6600 2.0645 2.0813
1.3621 6700 2.0612 2.0804
1.3824 6800 2.0741 2.0804
1.4027 6900 2.0628 2.0802
1.4231 7000 2.0563 2.0808
1.4434 7100 2.0689 2.0804
1.4637 7200 2.0711 2.0812
1.4840 7300 2.0547 2.0802
1.5044 7400 2.0541 2.0810
1.5247 7500 2.0539 2.0811
1.5450 7600 2.0716 2.0803
1.5654 7700 2.0618 2.0797
1.5857 7800 2.0564 2.0801
1.6060 7900 2.0678 2.0801
1.6263 8000 2.0683 2.0803
1.6467 8100 2.07 2.0800
1.6670 8200 2.0678 2.0803
1.6873 8300 2.0646 2.0796
1.7077 8400 2.062 2.0802
1.7280 8500 2.0698 2.0804
1.7483 8600 2.0717 2.0809
1.7687 8700 2.0723 2.0803
1.7890 8800 2.0603 2.0809
1.8093 8900 2.0597 2.0808
1.8296 9000 2.0596 2.0811
1.8500 9100 2.0719 2.0812
1.8703 9200 2.0647 2.0804
1.8906 9300 2.0675 2.0802
1.9110 9400 2.0676 2.0806
1.9313 9500 2.0649 2.0808
1.9516 9600 2.0676 2.0815
1.9719 9700 2.0696 2.0803
1.9923 9800 2.0679 2.0808
2.0126 9900 2.0557 2.0840
2.0329 10000 2.0491 2.0830
2.0533 10100 2.0559 2.0871
2.0736 10200 2.0549 2.0854
2.0939 10300 2.0488 2.0856
2.1143 10400 2.0371 2.0879
2.1346 10500 2.0427 2.0855
2.1549 10600 2.0397 2.0842
2.1752 10700 2.0496 2.0870
2.1956 10800 2.0496 2.0860
2.2159 10900 2.0485 2.0863
2.2362 11000 2.0577 2.0845
2.2566 11100 2.0532 2.0851
2.2769 11200 2.052 2.0876
2.2972 11300 2.05 2.0873
2.3175 11400 2.0505 2.0845
2.3379 11500 2.0535 2.0860
2.3582 11600 2.0524 2.0861
2.3785 11700 2.0532 2.0858
2.3989 11800 2.0566 2.0856
2.4192 11900 2.0445 2.0862
2.4395 12000 2.0461 2.0851
2.4598 12100 2.0471 2.0852
2.4802 12200 2.0421 2.0867
2.5005 12300 2.0424 2.0854
2.5208 12400 2.0469 2.0853
2.5412 12500 2.0449 2.0855
2.5615 12600 2.0504 2.0873
2.5818 12700 2.0548 2.0856
2.6022 12800 2.0529 2.0850
2.6225 12900 2.0454 2.0858
2.6428 13000 2.0468 2.0856
2.6631 13100 2.0543 2.0869
2.6835 13200 2.0528 2.0855
2.7038 13300 2.0503 2.0863
2.7241 13400 2.0527 2.0867
2.7445 13500 2.0481 2.0852
2.7648 13600 2.0435 2.0877
2.7851 13700 2.0502 2.0875
2.8054 13800 2.052 2.0904
2.8258 13900 2.0514 2.0868
2.8461 14000 2.0476 2.0862
2.8664 14100 2.0527 2.0869
2.8868 14200 2.0503 2.0867
2.9071 14300 2.0441 2.0859
2.9274 14400 2.0474 2.0863
2.9478 14500 2.0447 2.0857
2.9681 14600 2.0515 2.0893
2.9884 14700 2.0463 2.0873
3.0087 14800 2.0458 2.0881
3.0291 14900 2.0301 2.0907
3.0494 15000 2.0343 2.0907
3.0697 15100 2.0265 2.0907
3.0901 15200 2.0338 2.0941
3.1104 15300 2.037 2.0943
3.1307 15400 2.0355 2.0954
3.1510 15500 2.0295 2.0935
3.1714 15600 2.0244 2.0936
3.1917 15700 2.0288 2.0925
3.2120 15800 2.0371 2.0915
3.2324 15900 2.0247 2.0934
3.2527 16000 2.0224 2.0929
3.2730 16100 2.0211 2.0920
3.2934 16200 2.0347 2.0936
3.3137 16300 2.0301 2.0941
3.3340 16400 2.0359 2.0955
3.3543 16500 2.0327 2.0935
3.3747 16600 2.0277 2.0935
3.3950 16700 2.0329 2.0931
3.4153 16800 2.0309 2.0958
3.4357 16900 2.0217 2.0940
3.4560 17000 2.0293 2.0942
3.4763 17100 2.0398 2.0927
3.4966 17200 2.0385 2.0951
3.5170 17300 2.0318 2.0924
3.5373 17400 2.0425 2.0968
3.5576 17500 2.0277 2.0936
3.5780 17600 2.0273 2.0942
3.5983 17700 2.0405 2.0926
3.6186 17800 2.03 2.0957
3.6390 17900 2.0198 2.0939
3.6593 18000 2.0397 2.0954
3.6796 18100 2.0378 2.0927
3.6999 18200 2.0262 2.0926
3.7203 18300 2.0321 2.0971
3.7406 18400 2.0203 2.0933
3.7609 18500 2.0473 2.0932
3.7813 18600 2.0375 2.0928
3.8016 18700 2.03 2.0955
3.8219 18800 2.0294 2.0939
3.8422 18900 2.0264 2.0930
3.8626 19000 2.0245 2.0929
3.8829 19100 2.0385 2.0925
3.9032 19200 2.0276 2.0936
3.9236 19300 2.0421 2.0928
3.9439 19400 2.0274 2.0923
3.9642 19500 2.0303 2.0940
3.9845 19600 2.0276 2.0947
4.0049 19700 2.0284 2.0941
4.0252 19800 2.0202 2.0969
4.0455 19900 2.0191 2.0955
4.0659 20000 2.0192 2.0992
4.0862 20100 2.0175 2.0964
4.1065 20200 2.021 2.0966
4.1269 20300 2.02 2.0980
4.1472 20400 2.0118 2.0983
4.1675 20500 2.02 2.0980
4.1878 20600 2.0239 2.0968
4.2082 20700 2.0148 2.0978
4.2285 20800 2.0296 2.0960
4.2488 20900 2.0078 2.0981
4.2692 21000 2.0182 2.0972
4.2895 21100 2.0184 2.0979
4.3098 21200 2.0242 2.0978
4.3301 21300 2.023 2.0969
4.3505 21400 2.0218 2.0977
4.3708 21500 2.0199 2.0984
4.3911 21600 2.0193 2.0979
4.4115 21700 2.0195 2.0983
4.4318 21800 2.0107 2.0990
4.4521 21900 2.0194 2.0973
4.4725 22000 2.022 2.0981
4.4928 22100 2.017 2.0979
4.5131 22200 2.0265 2.0991
4.5334 22300 2.0188 2.0988
4.5538 22400 2.0312 2.0979
4.5741 22500 2.0177 2.0978
4.5944 22600 2.0284 2.0982
4.6148 22700 2.0179 2.0975
4.6351 22800 2.0227 2.0960
4.6554 22900 2.0258 2.0974
4.6757 23000 2.0226 2.0974
4.6961 23100 2.0219 2.0970
4.7164 23200 2.0156 2.0974
4.7367 23300 2.0221 2.0981
4.7571 23400 2.0238 2.0982
4.7774 23500 2.0148 2.0968
4.7977 23600 2.0225 2.0983
4.8181 23700 2.0196 2.0986
4.8384 23800 2.0192 2.0987
4.8587 23900 2.0278 2.0978
4.8790 24000 2.0177 2.0982
4.8994 24100 2.0139 2.0976
4.9197 24200 2.0265 2.0984
4.9400 24300 2.0124 2.0980
4.9604 24400 2.0119 2.0978
4.9807 24500 2.0204 2.0980
  • The bold row denotes the saved checkpoint.
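
For readers who want to inspect the log programmatically, the following sketch parses a few rows copied from the table above and picks the one with the lowest validation loss, which is the default selection criterion when `load_best_model_at_end` is enabled without a custom metric. The excerpt is only five rows, so the result describes the excerpt and does not identify the saved checkpoint.

```python
# Five rows copied from the training log: epoch, step, training loss, validation loss.
log_excerpt = """\
0.0203 100 2.0981 2.0954
0.8132 4000 2.0749 2.0796
2.0126 9900 2.0557 2.0840
3.0494 15000 2.0343 2.0907
4.9807 24500 2.0204 2.0980
"""

rows = []
for line in log_excerpt.splitlines():
    epoch, step, train_loss, val_loss = line.split()
    rows.append(
        {
            "epoch": float(epoch),
            "step": int(step),
            "train_loss": float(train_loss),
            "val_loss": float(val_loss),
        }
    )

# Lowest validation loss within the excerpt.
best = min(rows, key=lambda row: row["val_loss"])
# Gap between validation and training loss at the last logged step.
gap_last = rows[-1]["val_loss"] - rows[-1]["train_loss"]
print(best["step"], round(gap_last, 4))
# 4000 0.0776
```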

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 5.0.0
  • Transformers: 4.51.0
  • PyTorch: 2.9.1+cu126
  • Accelerate: 1.8.1
  • Datasets: 3.6.0
  • Tokenizers: 0.21.4-dev.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ListNetLoss

@inproceedings{cao2007learning,
    title={Learning to Rank: From Pairwise Approach to Listwise Approach},
    author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
    booktitle={Proceedings of the 24th international conference on Machine learning},
    pages={129--136},
    year={2007}
}

Model size: 68.4M params (Safetensors, tensor type F32)