---
tags:
- sentence-transformers
- cross-encoder
- reranker
- generated_from_trainer
- dataset_size:7419
- loss:BinaryCrossEntropyLoss
base_model: cross-encoder/nli-deberta-v3-base
pipeline_tag: text-ranking
library_name: sentence-transformers
metrics:
- accuracy
- accuracy_threshold
- f1
- f1_threshold
- precision
- recall
- average_precision
model-index:
- name: CrossEncoder based on cross-encoder/nli-deberta-v3-base
  results:
  - task:
      type: cross-encoder-classification
      name: Cross Encoder Classification
    dataset:
      name: ce val
      type: ce-val
    metrics:
    - type: accuracy
      value: 0.6036363636363636
      name: Accuracy
    - type: accuracy_threshold
      value: 0.5116937160491943
      name: Accuracy Threshold
    - type: f1
      value: 0.6751269035532994
      name: F1
    - type: f1_threshold
      value: 0.4685322642326355
      name: F1 Threshold
    - type: precision
      value: 0.5188556566970091
      name: Precision
    - type: recall
      value: 0.9661016949152542
      name: Recall
    - type: average_precision
      value: 0.5805485796313634
      name: Average Precision
---

# CrossEncoder based on cross-encoder/nli-deberta-v3-base

This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/nli-deberta-v3-base](https://huggingface.co/cross-encoder/nli-deberta-v3-base) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

## Model Details

### Model Description

- **Model Type:** Cross Encoder
- **Base model:** [cross-encoder/nli-deberta-v3-base](https://huggingface.co/cross-encoder/nli-deberta-v3-base)
- **Maximum Sequence Length:** 256 tokens
- **Number of Output Labels:** 1 label
- **Supported Modality:** Text

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Cross Encoder Documentation](https://www.sbert.net/docs/cross_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/huggingface/sentence-transformers)
- **Hugging Face:** [Cross Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=cross-encoder)

### Full Model Architecture

```
CrossEncoder(
  (0): Transformer({'transformer_task': 'sequence-classification', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'logits'}}, 'module_output_name': 'scores', 'architecture': 'DebertaV2ForSequenceClassification'})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("cross_encoder_model_id")
# Get scores for pairs of inputs
pairs = [
    ['The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.', 'Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.'],
    ['[S]unspot activity on the surface of our star has dropped to a new low.', 'It has a regular activity cycle of starspots.'],
    ['More money is dedicated within the Department of Homeland Security to climate change than what\'s spent combating "Islamist terrorists radicalizing over the Internet in the United States of America."', 'Homeland security is officially defined by the National Strategy for Homeland Security as "a concerted national effort to prevent terrorist attacks within the United States, reduce America\'s vulnerability to terrorism, and minimize the damage and recover from attacks that do occur".'],
    ['Worst-case global heating scenarios may need to be revised upwards in light of a better understanding of the role of clouds, scientists have said.', 'Results from the CERES and other NASA missions, such as the Earth Radiation Budget Experiment (ERBE), could lead to a better understanding of the role of clouds and the energy cycle in global climate change.'],
    ['Prof Adam Scaife, a climate modelling expert at the UK’s Met Office, said the evidence for a link to shrinking Arctic ice was now good: ‘The consensus points towards that being a real effect.’”', 'Some models of modern climate exhibit Arctic amplification without changes in snow and ice cover.'],
]
scores = model.predict(pairs)
print(scores)
# [0.5664 0.4765 0.5621 0.5187 0.4973]

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher.',
    [
        'Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates.',
        'It has a regular activity cycle of starspots.',
        'Homeland security is officially defined by the National Strategy for Homeland Security as "a concerted national effort to prevent terrorist attacks within the United States, reduce America\'s vulnerability to terrorism, and minimize the damage and recover from attacks that do occur".',
        'Results from the CERES and other NASA missions, such as the Earth Radiation Budget Experiment (ERBE), could lead to a better understanding of the role of clouds and the energy cycle in global climate change.',
        'Some models of modern climate exhibit Arctic amplification without changes in snow and ice cover.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
```

## Evaluation

### Metrics

#### Cross Encoder Classification

* Dataset: `ce-val`
* Evaluated with [CrossEncoderClassificationEvaluator](https://sbert.net/docs/package_reference/cross_encoder/evaluation.html#sentence_transformers.cross_encoder.evaluation.CrossEncoderClassificationEvaluator)

| Metric                | Value      |
|:----------------------|:-----------|
| accuracy              | 0.6036     |
| accuracy_threshold    | 0.5117     |
| f1                    | 0.6751     |
| f1_threshold          | 0.4685     |
| precision             | 0.5189     |
| recall                | 0.9661     |
| **average_precision** | **0.5805** |

## Training Details

### Training Dataset

#### Unnamed Dataset

* Size: 7,419 training samples
* Columns: sentence_0, sentence_1, and label
* Approximate statistics based on the first 1000 samples:

  |         | sentence_0 | sentence_1 | label |
  |:--------|:-----------|:-----------|:------|
  | type    | string     | string     | float |
  | details |            |            |       |

* Samples:

  | sentence_0 | sentence_1 | label |
  |:-----------|:-----------|:------|
  | The last time the planet was even four degrees warmer, Peter Brannen points out in The Ends of the World, his new history of the planet’s major extinction events, the oceans were hundreds of feet higher. | Almost all scientists acknowledge that the rate of species loss is greater now than at any time in human history, with extinctions occurring at rates hundreds of times higher than background extinction rates. | 0.0 |
  | [S]unspot activity on the surface of our star has dropped to a new low. | It has a regular activity cycle of starspots. | 1.0 |
  | More money is dedicated within the Department of Homeland Security to climate change than what's spent combating "Islamist terrorists radicalizing over the Internet in the United States of America." | Homeland security is officially defined by the National Strategy for Homeland Security as "a concerted national effort to prevent terrorist attacks within the United States, reduce America's vulnerability to terrorism, and minimize the damage and recover from attacks that do occur". | 1.0 |

* Loss: [BinaryCrossEntropyLoss](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:

  ```json
  {
      "activation_fn": "torch.nn.modules.linear.Identity",
      "pos_weight": null
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `num_train_epochs`: 1
- `fp16`: True

#### All Hyperparameters
- `do_predict`: False
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: None
- `warmup_ratio`: None
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `enable_jit_checkpoint`: False
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `use_cpu`: False
- `seed`: 42
- `data_seed`: None
- `bf16`: False
- `fp16`: True
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: -1
- `ddp_backend`: None
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `parallelism_config`: None
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `group_by_length`: False
- `length_column_name`: length
- `project`: huggingface
- `trackio_space_id`: trackio
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `hub_revision`: None
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `auto_find_batch_size`: False
- `full_determinism`: False
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_num_input_tokens_seen`: no
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `liger_kernel_config`: None
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: True
- `use_cache`: False
- `prompts`: None
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}
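As a rough sketch of what the hyperparameters above imply, the snippet below estimates the number of optimizer steps in the single training epoch and the learning rate under a linear schedule with zero warmup. It assumes training on a single device, which this card does not state. `steps_per_epoch` and `linear_lr` are hypothetical helper names written for illustration, not functions from any library.

```python
import math

# Values copied from this model card
TRAIN_SAMPLES = 7419   # training dataset size
BATCH_SIZE = 16        # per_device_train_batch_size
GRAD_ACCUM = 1         # gradient_accumulation_steps
EPOCHS = 1             # num_train_epochs
BASE_LR = 5e-05        # learning_rate
WARMUP_STEPS = 0       # warmup_steps

# Assumption: the card does not say how many devices were used
NUM_DEVICES = 1


def steps_per_epoch(samples, batch_size, grad_accum, num_devices):
    """Optimizer steps per epoch when the last partial batch is kept
    (dataloader_drop_last is False)."""
    batches = math.ceil(samples / (batch_size * num_devices))
    return math.ceil(batches / grad_accum)


def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


total_steps = steps_per_epoch(TRAIN_SAMPLES, BATCH_SIZE, GRAD_ACCUM, NUM_DEVICES) * EPOCHS
print("total optimizer steps:", total_steps)
for step in (0, total_steps // 2, total_steps):
    print(step, linear_lr(step, total_steps, BASE_LR, WARMUP_STEPS))
```

Under the single-device assumption this gives 464 optimizer steps (7,419 / 16, rounded up), with the learning rate falling linearly from 5e-05 to zero over the run.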
### Training Logs

| Epoch | Step | ce-val_average_precision |
|:-----:|:----:|:------------------------:|
| -1    | -1   | 0.5805                   |

### Training Time

- **Training**: 2.2 minutes

### Framework Versions

- Python: 3.12.13
- Sentence Transformers: 5.4.1
- Transformers: 5.0.0
- PyTorch: 2.10.0+cu128
- Accelerate: 1.13.0
- Datasets: 4.0.0
- Tokenizers: 0.22.2

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
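The decision thresholds reported in the Evaluation section can be applied to raw scores to obtain binary labels. The sketch below does this for the five example scores printed in the usage snippet, and also checks that the reported F1 is consistent with the reported precision and recall. `classify` and `f1_from` are hypothetical helpers, and treating a score equal to the threshold as positive is an assumption, not something this card specifies.

```python
# Scores printed by the usage example in this card
scores = [0.5664, 0.4765, 0.5621, 0.5187, 0.4973]

# Thresholds reported on the ce-val set (rounded as in the metrics table)
ACCURACY_THRESHOLD = 0.5117
F1_THRESHOLD = 0.4685


def classify(values, threshold):
    """Label a pair 1 when its score reaches the threshold, else 0.
    The >= convention is an assumption made for this sketch."""
    return [1 if v >= threshold else 0 for v in values]


def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


print("accuracy-optimal labels:", classify(scores, ACCURACY_THRESHOLD))
print("F1-optimal labels:      ", classify(scores, F1_THRESHOLD))
print("F1 from reported P and R:", round(f1_from(0.5189, 0.9661), 4))
```

The F1-optimal threshold is lower than the accuracy-optimal one, so it labels more pairs as positive. That is in line with the reported recall (0.9661) being much higher than the reported precision (0.5189).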