Text Ranking
sentence-transformers
Safetensors
deberta-v2
cross-encoder
reranker
Generated from Trainer
dataset_size:7419
loss:BinaryCrossEntropyLoss
Eval Results (legacy)
text-embeddings-inference
How to use ColeH0415/comp90042-crossencoder-factcheck with sentence-transformers:
```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("ColeH0415/comp90042-crossencoder-factcheck")

query = "Which planet is known as the Red Planet?"
passages = [
    "Venus is often called Earth's twin because of its similar size and proximity.",
    "Mars, known for its reddish appearance, is often referred to as the Red Planet.",
    "Jupiter, the largest planet in our solar system, has a prominent red spot.",
    "Saturn, famous for its rings, is sometimes mistaken for the Red Planet.",
]

# Score each (query, passage) pair; higher scores mean a stronger match.
scores = model.predict([(query, passage) for passage in passages])
print(scores)
```
CE fine-tuned epoch 1/3 best_val=0.5636
- README.md +52 -48
- config.json +24 -15
- config_sentence_transformers.json +1 -1
- model.safetensors +2 -2
- tokenizer.json +0 -0
- tokenizer_config.json +12 -9
README.md
CHANGED
|
@@ -6,7 +6,7 @@ tags:
|
|
| 6 |
- generated_from_trainer
|
| 7 |
- dataset_size:7419
|
| 8 |
- loss:BinaryCrossEntropyLoss
|
| 9 |
-
base_model: cross-encoder/
|
| 10 |
pipeline_tag: text-ranking
|
| 11 |
library_name: sentence-transformers
|
| 12 |
metrics:
|
|
@@ -18,7 +18,7 @@ metrics:
|
|
| 18 |
- recall
|
| 19 |
- average_precision
|
| 20 |
model-index:
|
| 21 |
-
- name: CrossEncoder based on cross-encoder/
|
| 22 |
results:
|
| 23 |
- task:
|
| 24 |
type: cross-encoder-classification
|
|
@@ -28,38 +28,38 @@ model-index:
|
|
| 28 |
type: ce-val
|
| 29 |
metrics:
|
| 30 |
- type: accuracy
|
| 31 |
-
value: 0.
|
| 32 |
name: Accuracy
|
| 33 |
- type: accuracy_threshold
|
| 34 |
-
value: 0.
|
| 35 |
name: Accuracy Threshold
|
| 36 |
- type: f1
|
| 37 |
-
value: 0.
|
| 38 |
name: F1
|
| 39 |
- type: f1_threshold
|
| 40 |
-
value:
|
| 41 |
name: F1 Threshold
|
| 42 |
- type: precision
|
| 43 |
-
value: 0.
|
| 44 |
name: Precision
|
| 45 |
- type: recall
|
| 46 |
-
value: 0.
|
| 47 |
name: Recall
|
| 48 |
- type: average_precision
|
| 49 |
-
value: 0.
|
| 50 |
name: Average Precision
|
| 51 |
---
|
| 52 |
|
| 53 |
-
# CrossEncoder based on cross-encoder/
|
| 54 |
|
| 55 |
-
This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/
|
| 56 |
|
| 57 |
## Model Details
|
| 58 |
|
| 59 |
### Model Description
|
| 60 |
- **Model Type:** Cross Encoder
|
| 61 |
-
- **Base model:** [cross-encoder/
|
| 62 |
-
- **Maximum Sequence Length:**
|
| 63 |
- **Number of Output Labels:** 1 label
|
| 64 |
- **Supported Modality:** Text
|
| 65 |
<!-- - **Training Dataset:** Unknown -->
|
|
@@ -77,7 +77,7 @@ This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.h
|
|
| 77 |
|
| 78 |
```
|
| 79 |
CrossEncoder(
|
| 80 |
-
(0): Transformer({'transformer_task': 'sequence-classification', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'logits'}}, 'module_output_name': 'scores', 'architecture': '
|
| 81 |
)
|
| 82 |
```
|
| 83 |
|
|
@@ -99,25 +99,25 @@ from sentence_transformers import CrossEncoder
|
|
| 99 |
model = CrossEncoder("cross_encoder_model_id")
|
| 100 |
# Get scores for pairs of inputs
|
| 101 |
pairs = [
|
| 102 |
-
['
|
| 103 |
-
['
|
| 104 |
-
['
|
| 105 |
-
['
|
| 106 |
-
['
|
| 107 |
]
|
| 108 |
scores = model.predict(pairs)
|
| 109 |
print(scores)
|
| 110 |
-
# [
|
| 111 |
|
| 112 |
# Or rank different texts based on similarity to a single text
|
| 113 |
ranks = model.rank(
|
| 114 |
-
'
|
| 115 |
[
|
| 116 |
-
'
|
| 117 |
-
|
| 118 |
-
|
| 119 |
-
'
|
| 120 |
-
'
|
| 121 |
]
|
| 122 |
)
|
| 123 |
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
|
|
@@ -158,13 +158,13 @@ You can finetune this model on your own dataset.
|
|
| 158 |
|
| 159 |
| Metric | Value |
|
| 160 |
|:----------------------|:-----------|
|
| 161 |
-
| accuracy | 0.
|
| 162 |
-
| accuracy_threshold | 0.
|
| 163 |
-
| f1 | 0.
|
| 164 |
-
| f1_threshold |
|
| 165 |
-
| precision | 0.
|
| 166 |
-
| recall | 0.
|
| 167 |
-
| **average_precision** | **0.
|
| 168 |
|
| 169 |
<!--
|
| 170 |
## Bias, Risks and Limitations
|
|
@@ -190,13 +190,13 @@ You can finetune this model on your own dataset.
|
|
| 190 |
| | sentence_0 | sentence_1 | label |
|
| 191 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
|
| 192 |
| type | string | string | float |
|
| 193 |
-
| details | <ul><li>min: 7 tokens</li><li>mean:
|
| 194 |
* Samples:
|
| 195 |
-
| sentence_0
|
| 196 |
-
|:---------------------------------------------------------------------------------------------------------------------------------------------------
|
| 197 |
-
| <code>
|
| 198 |
-
| <code>
|
| 199 |
-
| <code>
|
| 200 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 201 |
```json
|
| 202 |
{
|
|
@@ -208,17 +208,18 @@ You can finetune this model on your own dataset.
|
|
| 208 |
### Training Hyperparameters
|
| 209 |
#### Non-Default Hyperparameters
|
| 210 |
|
| 211 |
-
- `per_device_train_batch_size`:
|
| 212 |
-
- `per_device_eval_batch_size`:
|
| 213 |
- `num_train_epochs`: 1
|
|
|
|
| 214 |
|
| 215 |
#### All Hyperparameters
|
| 216 |
<details><summary>Click to expand</summary>
|
| 217 |
|
| 218 |
- `do_predict`: False
|
| 219 |
- `prediction_loss_only`: True
|
| 220 |
-
- `per_device_train_batch_size`:
|
| 221 |
-
- `per_device_eval_batch_size`:
|
| 222 |
- `gradient_accumulation_steps`: 1
|
| 223 |
- `eval_accumulation_steps`: None
|
| 224 |
- `torch_empty_cache_steps`: None
|
|
@@ -246,7 +247,7 @@ You can finetune this model on your own dataset.
|
|
| 246 |
- `seed`: 42
|
| 247 |
- `data_seed`: None
|
| 248 |
- `bf16`: False
|
| 249 |
-
- `fp16`:
|
| 250 |
- `bf16_full_eval`: False
|
| 251 |
- `fp16_full_eval`: False
|
| 252 |
- `tf32`: None
|
|
@@ -315,13 +316,16 @@ You can finetune this model on your own dataset.
|
|
| 315 |
</details>
|
| 316 |
|
| 317 |
### Training Logs
|
| 318 |
-
| Epoch
|
| 319 |
-
|:-----:|:----:|:------------------------:|
|
| 320 |
-
|
|
| 321 |
|
| 322 |
|
| 323 |
### Training Time
|
| 324 |
-
- **Training**:
|
| 325 |
|
| 326 |
### Framework Versions
|
| 327 |
- Python: 3.12.13
|
|
|
|
| 6 |
- generated_from_trainer
|
| 7 |
- dataset_size:7419
|
| 8 |
- loss:BinaryCrossEntropyLoss
|
| 9 |
+
base_model: cross-encoder/nli-deberta-v3-base
|
| 10 |
pipeline_tag: text-ranking
|
| 11 |
library_name: sentence-transformers
|
| 12 |
metrics:
|
|
|
|
| 18 |
- recall
|
| 19 |
- average_precision
|
| 20 |
model-index:
|
| 21 |
+
- name: CrossEncoder based on cross-encoder/nli-deberta-v3-base
|
| 22 |
results:
|
| 23 |
- task:
|
| 24 |
type: cross-encoder-classification
|
|
|
|
| 28 |
type: ce-val
|
| 29 |
metrics:
|
| 30 |
- type: accuracy
|
| 31 |
+
value: 0.5636363636363636
|
| 32 |
name: Accuracy
|
| 33 |
- type: accuracy_threshold
|
| 34 |
+
value: 0.512444794178009
|
| 35 |
name: Accuracy Threshold
|
| 36 |
- type: f1
|
| 37 |
+
value: 0.6711297071129707
|
| 38 |
name: F1
|
| 39 |
- type: f1_threshold
|
| 40 |
+
value: 0.4493926167488098
|
| 41 |
name: F1 Threshold
|
| 42 |
- type: precision
|
| 43 |
+
value: 0.5127877237851662
|
| 44 |
name: Precision
|
| 45 |
- type: recall
|
| 46 |
+
value: 0.9709443099273608
|
| 47 |
name: Recall
|
| 48 |
- type: average_precision
|
| 49 |
+
value: 0.565654586279988
|
| 50 |
name: Average Precision
|
| 51 |
---
|
| 52 |
|
| 53 |
+
# CrossEncoder based on cross-encoder/nli-deberta-v3-base
|
| 54 |
|
| 55 |
+
This is a [Cross Encoder](https://www.sbert.net/docs/cross_encoder/usage/usage.html) model finetuned from [cross-encoder/nli-deberta-v3-base](https://huggingface.co/cross-encoder/nli-deberta-v3-base) using the [sentence-transformers](https://www.SBERT.net) library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.
|
| 56 |
|
| 57 |
## Model Details
|
| 58 |
|
| 59 |
### Model Description
|
| 60 |
- **Model Type:** Cross Encoder
|
| 61 |
+
- **Base model:** [cross-encoder/nli-deberta-v3-base](https://huggingface.co/cross-encoder/nli-deberta-v3-base) <!-- at revision 6c749ce3425cd33b46d187e45b92bbf96ee12ec7 -->
|
| 62 |
+
- **Maximum Sequence Length:** 256 tokens
|
| 63 |
- **Number of Output Labels:** 1 label
|
| 64 |
- **Supported Modality:** Text
|
| 65 |
<!-- - **Training Dataset:** Unknown -->
|
|
|
|
| 77 |
|
| 78 |
```
|
| 79 |
CrossEncoder(
|
| 80 |
+
(0): Transformer({'transformer_task': 'sequence-classification', 'modality_config': {'text': {'method': 'forward', 'method_output_name': 'logits'}}, 'module_output_name': 'scores', 'architecture': 'DebertaV2ForSequenceClassification'})
|
| 81 |
)
|
| 82 |
```
|
| 83 |
|
|
|
|
| 99 |
model = CrossEncoder("cross_encoder_model_id")
|
| 100 |
# Get scores for pairs of inputs
|
| 101 |
pairs = [
|
| 102 |
+
['An independent inquiry found CRU is a small research unit with limited resources and their rigour and honesty are not in doubt.', 'The media and other scientific organisations were criticised for having "sometimes neglected" to reflect the uncertainties, doubts and assumptions of the work done by the CRU.'],
|
| 103 |
+
['As president, Obama will immediately close the Mississippi River Gulf Outlet, which experts say funneled floodwater into New Orleans.', 'Levees along the MRGO and the Intracoastal Waterway were breached in approximately 20 places, directly flooding most of St. Bernard Parish and New Orleans East.'],
|
| 104 |
+
['If we double atmospheric carbon dioxide[…] we’d only raise global surface temperatures by about a degree Celsius.', 'Not only do increasing carbon dioxide concentrations lead to increases in global surface temperature, but increasing global temperatures also cause increasing concentrations of carbon dioxide.'],
|
| 105 |
+
['But as that upper layer warms up, the oxygen-rich waters are less likely to mix down into cooler layers of the ocean because the warm waters are less dense and do not sink as readily.', 'Water that is saltier or cooler will be denser, and will sink in relation to the surrounding water.'],
|
| 106 |
+
['Less than half of published scientists endorse global warming.', 'Scientists Reach 100% Consensus on Anthropogenic Global Warming.'],
|
| 107 |
]
|
| 108 |
scores = model.predict(pairs)
|
| 109 |
print(scores)
|
| 110 |
+
# [0.5286 0.4566 0.49 0.5111 0.6522]
|
| 111 |
|
| 112 |
# Or rank different texts based on similarity to a single text
|
| 113 |
ranks = model.rank(
|
| 114 |
+
'An independent inquiry found CRU is a small research unit with limited resources and their rigour and honesty are not in doubt.',
|
| 115 |
[
|
| 116 |
+
'The media and other scientific organisations were criticised for having "sometimes neglected" to reflect the uncertainties, doubts and assumptions of the work done by the CRU.',
|
| 117 |
+
'Levees along the MRGO and the Intracoastal Waterway were breached in approximately 20 places, directly flooding most of St. Bernard Parish and New Orleans East.',
|
| 118 |
+
'Not only do increasing carbon dioxide concentrations lead to increases in global surface temperature, but increasing global temperatures also cause increasing concentrations of carbon dioxide.',
|
| 119 |
+
'Water that is saltier or cooler will be denser, and will sink in relation to the surrounding water.',
|
| 120 |
+
'Scientists Reach 100% Consensus on Anthropogenic Global Warming.',
|
| 121 |
]
|
| 122 |
)
|
| 123 |
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
|
|
|
|
| 158 |
|
| 159 |
| Metric | Value |
|
| 160 |
|:----------------------|:-----------|
|
| 161 |
+
| accuracy | 0.5636 |
|
| 162 |
+
| accuracy_threshold | 0.5124 |
|
| 163 |
+
| f1 | 0.6711 |
|
| 164 |
+
| f1_threshold | 0.4494 |
|
| 165 |
+
| precision | 0.5128 |
|
| 166 |
+
| recall | 0.9709 |
|
| 167 |
+
| **average_precision** | **0.5657** |
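
The reported F1 can be cross-checked against the reported precision and recall, and the two thresholds can be applied to the example scores printed in the usage snippet of this README. The following is a minimal sketch, not part of the model card: the helper `label_scores` is hypothetical, and treating a pair as positive when `score >= threshold` is an assumption about how the thresholds are meant to be used.

```python
# Full-precision values copied from the model-index metadata in this diff.
precision = 0.5127877237851662
recall = 0.9709443099273608
reported_f1 = 0.6711297071129707

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)
assert abs(f1 - reported_f1) < 1e-9

ACCURACY_THRESHOLD = 0.512444794178009
F1_THRESHOLD = 0.4493926167488098


def label_scores(scores, threshold):
    """Hypothetical helper: label a pair 1 when its score reaches the threshold."""
    return [int(score >= threshold) for score in scores]


# Example scores as printed in the README usage snippet.
example_scores = [0.5286, 0.4566, 0.49, 0.5111, 0.6522]

print(label_scores(example_scores, ACCURACY_THRESHOLD))  # [1, 0, 0, 0, 1]
print(label_scores(example_scores, F1_THRESHOLD))        # [1, 1, 1, 1, 1]
```

At the lower F1 threshold all five example pairs are labelled positive, which is in line with the high recall (0.9709) and low precision (0.5128) reported in the table.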
|
| 168 |
|
| 169 |
<!--
|
| 170 |
## Bias, Risks and Limitations
|
|
|
|
| 190 |
| | sentence_0 | sentence_1 | label |
|
| 191 |
|:--------|:----------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------|:---------------------------------------------------------------|
|
| 192 |
| type | string | string | float |
|
| 193 |
+
| details | <ul><li>min: 7 tokens</li><li>mean: 26.73 tokens</li><li>max: 66 tokens</li></ul> | <ul><li>min: 6 tokens</li><li>mean: 31.55 tokens</li><li>max: 133 tokens</li></ul> | <ul><li>min: 0.0</li><li>mean: 0.53</li><li>max: 1.0</li></ul> |
|
| 194 |
* Samples:
|
| 195 |
+
| sentence_0 | sentence_1 | label |
|
| 196 |
+
|:---------------------------------------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------|
|
| 197 |
+
| <code>An independent inquiry found CRU is a small research unit with limited resources and their rigour and honesty are not in doubt.</code> | <code>The media and other scientific organisations were criticised for having "sometimes neglected" to reflect the uncertainties, doubts and assumptions of the work done by the CRU.</code> | <code>0.0</code> |
|
| 198 |
+
| <code>As president, Obama will immediately close the Mississippi River Gulf Outlet, which experts say funneled floodwater into New Orleans.</code> | <code>Levees along the MRGO and the Intracoastal Waterway were breached in approximately 20 places, directly flooding most of St. Bernard Parish and New Orleans East.</code> | <code>1.0</code> |
|
| 199 |
+
| <code>If we double atmospheric carbon dioxide[…] we’d only raise global surface temperatures by about a degree Celsius.</code> | <code>Not only do increasing carbon dioxide concentrations lead to increases in global surface temperature, but increasing global temperatures also cause increasing concentrations of carbon dioxide.</code> | <code>0.0</code> |
|
| 200 |
* Loss: [<code>BinaryCrossEntropyLoss</code>](https://sbert.net/docs/package_reference/cross_encoder/losses.html#binarycrossentropyloss) with these parameters:
|
| 201 |
```json
|
| 202 |
{
|
|
|
|
| 208 |
### Training Hyperparameters
|
| 209 |
#### Non-Default Hyperparameters
|
| 210 |
|
| 211 |
+
- `per_device_train_batch_size`: 4
|
| 212 |
+
- `per_device_eval_batch_size`: 4
|
| 213 |
- `num_train_epochs`: 1
|
| 214 |
+
- `fp16`: True
|
| 215 |
|
| 216 |
#### All Hyperparameters
|
| 217 |
<details><summary>Click to expand</summary>
|
| 218 |
|
| 219 |
- `do_predict`: False
|
| 220 |
- `prediction_loss_only`: True
|
| 221 |
+
- `per_device_train_batch_size`: 4
|
| 222 |
+
- `per_device_eval_batch_size`: 4
|
| 223 |
- `gradient_accumulation_steps`: 1
|
| 224 |
- `eval_accumulation_steps`: None
|
| 225 |
- `torch_empty_cache_steps`: None
|
|
|
|
| 247 |
- `seed`: 42
|
| 248 |
- `data_seed`: None
|
| 249 |
- `bf16`: False
|
| 250 |
+
- `fp16`: True
|
| 251 |
- `bf16_full_eval`: False
|
| 252 |
- `fp16_full_eval`: False
|
| 253 |
- `tf32`: None
|
|
|
|
| 316 |
</details>
|
| 317 |
|
| 318 |
### Training Logs
|
| 319 |
+
| Epoch | Step | Training Loss | ce-val_average_precision |
|
| 320 |
+
|:------:|:----:|:-------------:|:------------------------:|
|
| 321 |
+
| 0.2695 | 500 | 0.7103 | - |
|
| 322 |
+
| 0.5391 | 1000 | 0.6983 | - |
|
| 323 |
+
| 0.8086 | 1500 | 0.6982 | - |
|
| 324 |
+
| -1 | -1 | - | 0.5657 |
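
The fractional epochs in the log agree with the dataset size and batch size given elsewhere in the card. A short check, assuming a single training device and that the final partial batch is kept (both are assumptions; the card only states `gradient_accumulation_steps`: 1):

```python
import math

dataset_size = 7419  # from the `dataset_size:7419` tag
batch_size = 4       # `per_device_train_batch_size`

# Assumes one device and that the last, smaller batch is not dropped.
steps_per_epoch = math.ceil(dataset_size / batch_size)
print(steps_per_epoch)  # 1855

# Convert the logged step counts into fractions of one epoch.
epoch_fractions = {step: round(step / steps_per_epoch, 4) for step in (500, 1000, 1500)}
print(epoch_fractions)  # {500: 0.2695, 1000: 0.5391, 1500: 0.8086}
```

The computed fractions match the `Epoch` column of the log rows at steps 500, 1000 and 1500.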
|
| 325 |
|
| 326 |
|
| 327 |
### Training Time
|
| 328 |
+
- **Training**: 7.1 minutes
|
| 329 |
|
| 330 |
### Framework Versions
|
| 331 |
- Python: 3.12.13
|
config.json
CHANGED
|
@@ -1,36 +1,45 @@
|
|
| 1 |
{
|
| 2 |
-
"add_cross_attention": false,
|
| 3 |
"architectures": [
|
| 4 |
-
"
|
| 5 |
],
|
| 6 |
"attention_probs_dropout_prob": 0.1,
|
| 7 |
-
"bos_token_id":
|
| 8 |
-
"classifier_dropout": null,
|
| 9 |
"dtype": "float32",
|
| 10 |
-
"eos_token_id":
|
| 11 |
-
"gradient_checkpointing": false,
|
| 12 |
"hidden_act": "gelu",
|
| 13 |
"hidden_dropout_prob": 0.1,
|
| 14 |
-
"hidden_size":
|
| 15 |
"id2label": {
|
| 16 |
"0": "LABEL_0"
|
| 17 |
},
|
| 18 |
"initializer_range": 0.02,
|
| 19 |
-
"intermediate_size":
|
| 20 |
-
"is_decoder": false,
|
| 21 |
"label2id": {
|
| 22 |
"LABEL_0": 0
|
| 23 |
},
|
| 24 |
-
"layer_norm_eps": 1e-
|
|
|
|
| 25 |
"max_position_embeddings": 512,
|
| 26 |
-
"
|
| 27 |
"num_attention_heads": 12,
|
| 28 |
-
"num_hidden_layers":
|
| 29 |
"pad_token_id": 0,
|
| 30 |
-
"
|
| 31 |
"tie_word_embeddings": true,
|
| 32 |
"transformers_version": "5.0.0",
|
| 33 |
-
"type_vocab_size":
|
| 34 |
"use_cache": false,
|
| 35 |
-
"vocab_size":
|
| 36 |
}
|
|
|
|
| 1 |
{
|
|
|
|
| 2 |
"architectures": [
|
| 3 |
+
"DebertaV2ForSequenceClassification"
|
| 4 |
],
|
| 5 |
"attention_probs_dropout_prob": 0.1,
|
| 6 |
+
"bos_token_id": 1,
|
|
|
|
| 7 |
"dtype": "float32",
|
| 8 |
+
"eos_token_id": 2,
|
|
|
|
| 9 |
"hidden_act": "gelu",
|
| 10 |
"hidden_dropout_prob": 0.1,
|
| 11 |
+
"hidden_size": 768,
|
| 12 |
"id2label": {
|
| 13 |
"0": "LABEL_0"
|
| 14 |
},
|
| 15 |
"initializer_range": 0.02,
|
| 16 |
+
"intermediate_size": 3072,
|
|
|
|
| 17 |
"label2id": {
|
| 18 |
"LABEL_0": 0
|
| 19 |
},
|
| 20 |
+
"layer_norm_eps": 1e-07,
|
| 21 |
+
"legacy": true,
|
| 22 |
"max_position_embeddings": 512,
|
| 23 |
+
"max_relative_positions": -1,
|
| 24 |
+
"model_type": "deberta-v2",
|
| 25 |
+
"norm_rel_ebd": "layer_norm",
|
| 26 |
"num_attention_heads": 12,
|
| 27 |
+
"num_hidden_layers": 12,
|
| 28 |
"pad_token_id": 0,
|
| 29 |
+
"pooler_dropout": 0,
|
| 30 |
+
"pooler_hidden_act": "gelu",
|
| 31 |
+
"pooler_hidden_size": 768,
|
| 32 |
+
"pos_att_type": [
|
| 33 |
+
"p2c",
|
| 34 |
+
"c2p"
|
| 35 |
+
],
|
| 36 |
+
"position_biased_input": false,
|
| 37 |
+
"position_buckets": 256,
|
| 38 |
+
"relative_attention": true,
|
| 39 |
+
"share_att_key": true,
|
| 40 |
"tie_word_embeddings": true,
|
| 41 |
"transformers_version": "5.0.0",
|
| 42 |
+
"type_vocab_size": 0,
|
| 43 |
"use_cache": false,
|
| 44 |
+
"vocab_size": 128100
|
| 45 |
}
|
config_sentence_transformers.json
CHANGED
|
@@ -4,7 +4,7 @@
|
|
| 4 |
"sentence_transformers": "5.4.1",
|
| 5 |
"transformers": "5.0.0"
|
| 6 |
},
|
| 7 |
-
"activation_fn": "torch.nn.modules.
|
| 8 |
"default_prompt_name": null,
|
| 9 |
"model_type": "CrossEncoder",
|
| 10 |
"prompts": {}
|
|
|
|
| 4 |
"sentence_transformers": "5.4.1",
|
| 5 |
"transformers": "5.0.0"
|
| 6 |
},
|
| 7 |
+
"activation_fn": "torch.nn.modules.activation.Sigmoid",
|
| 8 |
"default_prompt_name": null,
|
| 9 |
"model_type": "CrossEncoder",
|
| 10 |
"prompts": {}
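
The `activation_fn` entry indicates that raw logits are passed through a sigmoid, so the scores returned at inference time lie in (0, 1). As a reference point when reading the README's training log, a predictor that outputs 0.5 for every pair has a binary cross-entropy of ln 2 ≈ 0.6931, close to the logged training losses (0.7103, 0.6983, 0.6982). A plain-Python sketch; the function names are illustrative and are not sentence-transformers APIs:

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-logit))


def binary_cross_entropy(prob: float, label: float) -> float:
    """Binary cross-entropy for a single predicted probability and a 0/1 label."""
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


# A logit of 0 corresponds to a score of exactly 0.5 (an uninformative prediction).
chance_score = sigmoid(0.0)
chance_loss = binary_cross_entropy(chance_score, 1.0)

print(chance_score)           # 0.5
print(round(chance_loss, 4))  # 0.6931
```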
|
model.safetensors
CHANGED
|
@@ -1,3 +1,3 @@
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
-
oid sha256:
|
| 3 |
-
size
|
|
|
|
| 1 |
version https://git-lfs.github.com/spec/v1
|
| 2 |
+
oid sha256:922a849772df9d09c87547738e9d2e009866a84a654b460b49743b396fc81a74
|
| 3 |
+
size 737716172
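
The LFS pointer size allows a rough parameter count. This is a back-of-the-envelope estimate only: it assumes every tensor is stored as 4-byte float32 (the config lists `"dtype": "float32"`) and ignores the safetensors header, so it slightly overestimates the true count.

```python
size_bytes = 737_716_172  # `size` from the LFS pointer
bytes_per_param = 4       # assumption: all weights stored as float32

approx_params = size_bytes // bytes_per_param

# Word-embedding matrix implied by config.json: vocab_size x hidden_size.
embedding_params = 128_100 * 768

print(f"{approx_params / 1e6:.1f}M parameters (upper bound)")  # 184.4M
print(f"{embedding_params / 1e6:.1f}M in word embeddings")     # 98.4M
```

Under these assumptions, a little over half of the stored weights belong to the word-embedding matrix.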
|
tokenizer.json
CHANGED
|
The diff for this file is too large to render.
|
|
|
tokenizer_config.json
CHANGED
|
@@ -1,18 +1,21 @@
|
|
| 1 |
{
|
|
|
|
| 2 |
"backend": "tokenizers",
|
| 3 |
-
"
|
|
|
|
| 4 |
"cls_token": "[CLS]",
|
| 5 |
-
"
|
| 6 |
-
"
|
| 7 |
"is_local": false,
|
| 8 |
"mask_token": "[MASK]",
|
| 9 |
-
"model_max_length":
|
| 10 |
"model_specific_special_tokens": {},
|
| 11 |
-
"never_split": null,
|
| 12 |
"pad_token": "[PAD]",
|
| 13 |
"sep_token": "[SEP]",
|
| 14 |
-
"
|
| 15 |
-
"
|
| 16 |
-
"tokenizer_class": "
|
| 17 |
-
"
|
| 18 |
}
|
|
|
|
| 1 |
{
|
| 2 |
+
"add_prefix_space": true,
|
| 3 |
"backend": "tokenizers",
|
| 4 |
+
"bos_token": "[CLS]",
|
| 5 |
+
"clean_up_tokenization_spaces": false,
|
| 6 |
"cls_token": "[CLS]",
|
| 7 |
+
"do_lower_case": false,
|
| 8 |
+
"eos_token": "[SEP]",
|
| 9 |
"is_local": false,
|
| 10 |
"mask_token": "[MASK]",
|
| 11 |
+
"model_max_length": 256,
|
| 12 |
"model_specific_special_tokens": {},
|
|
|
|
| 13 |
"pad_token": "[PAD]",
|
| 14 |
"sep_token": "[SEP]",
|
| 15 |
+
"sp_model_kwargs": {},
|
| 16 |
+
"split_by_punct": false,
|
| 17 |
+
"tokenizer_class": "DebertaV2Tokenizer",
|
| 18 |
+
"unk_id": 3,
|
| 19 |
+
"unk_token": "[UNK]",
|
| 20 |
+
"vocab_type": "spm"
|
| 21 |
}
|