This is a sentence-transformers model finetuned from nomic-ai/modernbert-embed-base on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: ModernBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
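The Pooling and Normalize modules listed above amount to a masked mean over token vectors followed by L2 normalization. The following is a minimal NumPy sketch of that computation, not the library's own code; the function name `mean_pool_and_normalize` and the random token vectors are illustrative assumptions.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Masked mean over tokens followed by L2 normalization.

    token_embeddings: (batch, seq_len, dim) float array
    attention_mask:   (batch, seq_len) array of 0/1, where 1 marks a real token
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)      # padding contributes zero
    counts = np.clip(mask.sum(axis=1), 1e-9, None)      # avoid division by zero
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy input: 2 sequences, 4 token positions, 768-dim token vectors (random stand-ins)
rng = np.random.default_rng(0)
tokens = rng.normal(size=(2, 4, 768))
mask = np.array([[1, 1, 1, 0], [1, 1, 0, 0]])
sentence_vectors = mean_pool_and_normalize(tokens, mask)
print(sentence_vectors.shape)
```

Because the output vectors have unit length, their dot product equals their cosine similarity.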
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("legaltextai/modernbert-embed-base-legaltextai-matryoshka-legaldataset")
# Run inference
sentences = [
'In the context of supporting factual positions in a legal motion, what are the two primary ways a party can assert that a fact cannot be genuinely disputed according to the procedures outlined in section (c)(1)?',
'(c) Procedures.\n\n(1) Supporting Factual Positions. A party asserting that a fact cannot be or is\xa0genuinely disputed must support the assertion by:\n\n(A) citing to particular parts of materials in the record, including depositions,\xa0documents, electronically stored information, affidavits or declarations,\xa0stipulations (including those made for purposes of the motion only), admissions,\xa0interrogatory answers, or other materials; or\n\n(B) showing that the materials cited do not establish the absence or presence of a\xa0genuine dispute, or that an adverse party cannot produce admissible evidence to\xa0support the fact.\n\n(2) Objection That a Fact Is Not Supported by Admissible Evidence. A party may\xa0object that the material cited to support or dispute a fact cannot be presented in a\xa0form that would be admissible in evidence.\n\n(3) Materials Not Cited. The court need consider only the cited materials, but it\xa0may consider other materials in the record.\n\n(4) Affidavits or Declarations. An affidavit or declaration used to support or oppose\xa0a motion must be made on personal knowledge, set out facts that would be admissible\xa0in evidence, and show that the affiant or declarant is competent to testify on the\xa0matters stated.\n\n(d) When Facts are Unavailable to the Nonmovant. If a nonmovant shows by\xa0affidavit or declaration that, for specified reasons, it cannot present facts essential to\xa0justify its opposition, the court may:\n\n(1) defer considering the motion or deny it;\n\n(2) allow time to obtain affidavits or declarations or to take discovery; or\n\n(3) issue any other appropriate order.\n\n(e) Failing to Properly Support or Address a Fact. 
If a party fails to properly support an assertion of fact or fails to properly address another party’s assertion of fact as required by Rule 56(c), the court may:\n\n(1) give an opportunity to properly support or address the fact;\n\n(2) consider the fact undisputed for purposes of the motion;\n\n(3) grant summary judgment if the motion and supporting materials — including\xa0the facts considered undisputed — show that the movant is entitled to it; or\n\n(4) issue any other appropriate order.\n\n(f) Judgment Independent of the Motion. After giving notice and a reasonable time to respond, the court may:\n\n(1) grant summary judgment for a nonmovant;\n\n(2) grant the motion on grounds not raised by a party; or\n\n(3) consider summary judgment on its own after identifying for the parties material\xa0facts that may not be genuinely in dispute.\n\n(g) Failing to Grant All the Requested Relief.\xa0If the court does not grant all the relief requested by the motion, it may enter an order stating any material fact — including an\xa0item of damages or other relief — that is not genuinely in dispute and treating the fact as\xa0established in the case.\n\n(h) Affidavit or Declaration Submitted in Bad Faith.\xa0If satisfied that an affidavit or declaration under this rule is submitted in bad faith or solely for delay, the court — after\xa0notice and a reasonable time to respond — may order the submitting party to pay the\xa0other party the reasonable expenses, including attorney’s fees, it incurred as a result. An\xa0offending party or attorney may also be held in contempt or subjected to other appropriate sanctions.\n\n\xa0\n\n\xa0\n\n\xa0\n\n11.1.3\n\nAdickes v. S.H. Kress & Co.\n\n\xa0\n\nSupreme Court of the United States\n\n398 U.S. 144, 26 L. Ed. 2d 142, 90 S. Ct. 1598, 1970 U.S. LEXIS 31, SCDB 1969-101\n\nNo. 79\n\n1970-06-01\n\n[ … ]\n\nCERTIORARI TO THE UNITED STATES COURT OF APPEALS FOR THE SECOND CIRCUIT.\n\n[ … ]\n\nMR. 
JUSTICE HARLAN delivered the opinion of the Court.\n\nPetitioner, Sandra Adickes, a white school teacher from New York, brought this suit in the United States District Court for the Southern District of New York against respondent S. H. Kress & Co. ("Kress") to recover damages under 42 U. S. C. § 1983[1] for an alleged violation of her constitutional rights under the Equal Protection Clause of the Fourteenth Amendment.',
"I will not be requiring you to read these materials. Nor will you be tested on them. After discussions with a number of colleagues, I decided that I will present an optional lecture or two on sexual assault.\n\n\xa0\n\n\xa0\n\n\xa0\n\n\xa0\n\n13.1\n\nIntroduction\n\n\xa0\n\nTo a greater degree than any of the other crimes we study in this class, the very definition of rape has been a subject of dispute and reform in recent years. Perhaps that is because the basic result element that rape law criminalizes—sexual intercourse—is not, unlike death or battery, itself considered bad. When someone intentionally kills another, there is usually little question (except in cases of self-defense) that the result is bad and that a crime may have occurred. Unlike most intentional killing, intentional sex is not inherently wrong. Indeed, in some situations, much of the evidence of rape may rest in the perceptions and interpretations of the involved parties. \n\nThe traditional elements of rape law are: 1) sexual intercourse; 2) with force; 3) and lack of consent. Because the sexual intercourse element of rape can be difficult to distinguish from lawful, intentional behavior, rape law has struggled to create a regime that balances the punishment of wrongdoers with the protection of the rights of the accused. Originally, rape law established strict rules governing punishable behavior that were under-inclusive and strongly protected accused men: for example, a claim of rape had to include the use of physical force by the accused and physical resistance by the victim. Additionally, there was a spousal exception to rape, so that husbands could not be criminally liable for rape of their wives. \n\nAs the cases in this section demonstrate, however, rape law reform in the past several decades has dramatically affected these requirements. Namely, feminist legal reformers have challenged and in many jurisdictions weakened or eliminated the force requirement. 
That has shifted more legal focus onto the question of whether there was consent. Consider what problems consent itself may have as a central element of rape law. As you read the cases and essays in this section, consider how different formulations of rape law balance several very serious considerations of our criminal system: punishing wrongdoers; differentiating between levels of blameworthiness; and protecting the rights of defendants. What evidentiary or normative roles did the traditional rape requirements play? What are the risks of limiting or removing them? How should our system balance the risks of over-inclusivity and under-inclusivity? What social and intimate relationships between men and women do the various possible rape rules promote and change? And as always, how do these questions implicate the justifications of punishment such as retribution and deterrence?\n\n\xa0\n\n\xa0\n\n\xa0\n\n\xa0\n\n13.1.1\n\nExcerpt from Criminal Law: Cases, Controversies and Problems (West Academic Publishing 2019) by Joseph E. Kennedy (used with permission).\n\n\xa0\n\nhttps://app.box.com/s/ixs8jw1d0oi45q68xvpk3vl69m2p6y71\n\n\xa0\n\n\xa0\n\n\xa0\n\n\xa0\n\n13.2\n\nStatutes\n\n\xa0\n\nConsider some of these questions while you are reviewing these statutes.\n\nHow do the statutes define sex, if at all? \n\nHow do they define force, if at all? \n\nWhat is the mens rea required? \n\nHow do you think they balance the rights of the accused with the harm to be avoided? \n\nAs a defense attorney, which one would you find most defendant-friendly? \n\nAs a prosecutor, which one would you find most prosecution-friendly?\n\n\xa0\n\n\xa0\n\n\xa0\n\n\xa0\n\n13.2.1\n\nForce v. Non-Consent: An Ongoing Struggle to Define Rape\n\n\xa0\n\nAfter reading the passage from Rusk v. State, below, compare and contrast the MPC's section from 1962 with the proposed section governing sexual assault.\n\n\xa0\n\nPassages taken from the Dissent of\xa0Rusk v. State, 43 Md. App. 
476, 406 A.2d 624 (1979),\xa0rev'd,\xa0289 Md. 230, 424 A.2d 720 (1981)):\n\nUnfortunately, courts,[ … ] often tend to confuse these two elements force and lack of consent and to think of them as one. They are not. They mean, and require, different things. [ … ]What seems to cause the confusion what, indeed, has become a common denominator of both elements is the notion that the victim must actively resist the attack upon her.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
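Since the model was trained with a Matryoshka objective, the leading dimensions of each embedding can be used on their own. The sketch below shows the idea with NumPy on random stand-in vectors: keep the first `dim` values and renormalize. The helper name `truncate_and_renormalize` is hypothetical; recent Sentence Transformers releases also accept a `truncate_dim` argument when constructing `SentenceTransformer`, which should have a similar effect.

```python
import numpy as np

def truncate_and_renormalize(embeddings, dim):
    """Keep the first `dim` values of each embedding and rescale to unit length."""
    truncated = np.asarray(embeddings)[:, :dim]
    norms = np.linalg.norm(truncated, axis=1, keepdims=True)
    return truncated / np.clip(norms, 1e-12, None)

# Random stand-ins for the (3, 768) output of model.encode(sentences)
rng = np.random.default_rng(0)
full = rng.normal(size=(3, 768))
full /= np.linalg.norm(full, axis=1, keepdims=True)

small = truncate_and_renormalize(full, 256)
cosine = small @ small.T  # cosine similarity, since rows are unit length
print(small.shape, cosine.shape)
```

Smaller dimensions reduce storage and search cost; the evaluation table in this card reports how retrieval quality changes at each size.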
Datasets: dim_768, dim_512, dim_256, dim_128 and dim_64
Evaluated with InformationRetrievalEvaluator

| Metric | dim_768 | dim_512 | dim_256 | dim_128 | dim_64 |
|---|---|---|---|---|---|
| cosine_accuracy@1 | 0.5637 | 0.5644 | 0.5514 | 0.5158 | 0.4461 |
| cosine_accuracy@3 | 0.7532 | 0.748 | 0.7351 | 0.6869 | 0.6086 |
| cosine_accuracy@5 | 0.8338 | 0.8327 | 0.8229 | 0.7782 | 0.6926 |
| cosine_accuracy@10 | 0.9065 | 0.9069 | 0.8994 | 0.8681 | 0.7905 |
| cosine_precision@1 | 0.5637 | 0.5644 | 0.5514 | 0.5158 | 0.4461 |
| cosine_precision@3 | 0.4369 | 0.4347 | 0.4263 | 0.3982 | 0.3508 |
| cosine_precision@5 | 0.3139 | 0.313 | 0.3084 | 0.2902 | 0.2581 |
| cosine_precision@10 | 0.1772 | 0.1773 | 0.1754 | 0.1692 | 0.1544 |
| cosine_recall@1 | 0.1728 | 0.1731 | 0.1691 | 0.158 | 0.1369 |
| cosine_recall@3 | 0.3954 | 0.3937 | 0.3853 | 0.3604 | 0.3186 |
| cosine_recall@5 | 0.4741 | 0.4728 | 0.4654 | 0.4383 | 0.3895 |
| cosine_recall@10 | 0.5349 | 0.5347 | 0.5289 | 0.5108 | 0.4661 |
| cosine_ndcg@10 | 0.5188 | 0.5183 | 0.51 | 0.4848 | 0.4332 |
| cosine_mrr@10 | 0.6564 | 0.6556 | 0.6436 | 0.606 | 0.5331 |
| cosine_map@100 | 0.4055 | 0.4051 | 0.3979 | 0.3769 | 0.3365 |
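To make the metric names concrete, here is a small pure-Python sketch of how accuracy@k, precision@k, recall@k and MRR@k can be computed for one query. The document IDs and helper names are made up for illustration, and this is not the evaluator's implementation. Note that precision@3 in the table exceeds 1/3, which is only possible if at least some queries have more than one relevant passage; that is also consistent with recall@1 being well below accuracy@1.

```python
def accuracy_at_k(ranked, relevant, k):
    """1.0 if any relevant document appears in the top k, else 0.0."""
    return float(any(doc in relevant for doc in ranked[:k]))

def precision_at_k(ranked, relevant, k):
    """Fraction of the top k results that are relevant."""
    return sum(doc in relevant for doc in ranked[:k]) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant documents found in the top k."""
    return sum(doc in relevant for doc in ranked[:k]) / len(relevant)

def mrr_at_k(ranked, relevant, k):
    """Reciprocal rank of the first relevant document within the top k."""
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical ranking for one query with three relevant passages
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d2", "d4", "d8"}

print(accuracy_at_k(ranked, relevant, 1))   # 0.0
print(accuracy_at_k(ranked, relevant, 3))   # 1.0
print(precision_at_k(ranked, relevant, 3))  # one of the top three is relevant
print(recall_at_k(ranked, relevant, 5))     # two of three relevant passages found
print(mrr_at_k(ranked, relevant, 10))       # 0.5
```

The reported figures are these per-query values averaged over all evaluation queries.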
Columns: anchor and positive

| | anchor | positive |
|---|---|---|
| type | string | string |
| details | | |
| anchor | positive |
|---|---|
| What reasons did the District provide for placing Mr. Kennedy on paid administrative leave after the October 26 game, and how did they justify their concerns regarding his postgame prayers? | The letter also admitted that, during Mr. Kennedy’s recent October 16 postgame prayer, his students were otherwise engaged and not praying with him, and that his prayer was “fleeting.” Id., at 90, 93. Still, the District explained that a “reasonable observer” could think government endorsement of religion had occurred when a “District employee, on the field only by virtue of his employment with the District, still on duty” engaged in “overtly religious conduct.” Id., at 91, 93. The District thus made clear that the only option it would offer Mr. Kennedy was to allow him to pray after a game in a “private location” behind closed doors and “not observable to students or the public.” Id., at 93–94. |
| Why is it considered an abuse of discretion for a district court to require the S.E.C. to establish the "truth" of the allegations against a settling party as a condition for approving consent decrees? | [ … ] |
| Describe the sequence of events that led to Officer McClendon asking Jamison for consent to search his vehicle. What were the key points of contention between Officer McClendon's and Jamison's accounts of this interaction? | Officer McClendon pulled behind Jamison and flashed his blue lights. Jamison immediately pulled over to the right shoulder.[27] |
MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
768,
512,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1,
1
],
"n_dims_per_step": -1
}
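The configuration above can be read as: apply the inner ranking loss to the first 768, 512, 256, 128 and 64 dimensions of each embedding and add the results with equal weights. The NumPy sketch below illustrates that idea on random data. It is not the library's implementation; the helper names are hypothetical, and the cosine similarity with a scale of 20 is an assumption about the inner loss's defaults.

```python
import numpy as np

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch ranking loss: each anchor should score its own positive highest.

    Assumption: scaled cosine similarity followed by softmax cross-entropy,
    with the other positives in the batch acting as negatives.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the inner loss over truncated embedding prefixes."""
    return sum(
        w * mnr_loss(anchors[:, :d], positives[:, :d])
        for d, w in zip(dims, weights)
    )

# Random stand-ins for a batch of 8 anchor/positive embedding pairs
rng = np.random.default_rng(0)
anchors = rng.normal(size=(8, 768))
positives = anchors + 0.1 * rng.normal(size=(8, 768))

dims = [768, 512, 256, 128, 64]
weights = [1, 1, 1, 1, 1]
total = matryoshka_loss(anchors, positives, dims, weights)
print(total)
```

Training every prefix against the same objective is what lets the shorter embeddings remain usable for retrieval.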
Non-default hyperparameters:
eval_strategy: epoch
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
gradient_accumulation_steps: 32
learning_rate: 2e-05
num_train_epochs: 4
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: True
tf32: True
load_best_model_at_end: True
optim: adamw_torch_fused
batch_sampler: no_duplicates

All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: epoch
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 32
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 4
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: True
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

| Epoch | Step | Training Loss | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|---|---|---|---|---|---|---|---|
| 0.1238 | 10 | 59.6933 | - | - | - | - | - |
| 0.2477 | 20 | 20.2066 | - | - | - | - | - |
| 0.3715 | 30 | 10.2468 | - | - | - | - | - |
| 0.4954 | 40 | 7.7729 | - | - | - | - | - |
| 0.6192 | 50 | 6.5815 | - | - | - | - | - |
| 0.7430 | 60 | 5.8646 | - | - | - | - | - |
| 0.8669 | 70 | 5.0228 | - | - | - | - | - |
| 0.9907 | 80 | 4.8557 | - | - | - | - | - |
| 1.0 | 81 | - | 0.5013 | 0.4986 | 0.4888 | 0.4586 | 0.3932 |
| 1.1115 | 90 | 3.0385 | - | - | - | - | - |
| 1.2353 | 100 | 2.9601 | - | - | - | - | - |
| 1.3591 | 110 | 2.8391 | - | - | - | - | - |
| 1.4830 | 120 | 2.9631 | - | - | - | - | - |
| 1.6068 | 130 | 2.6344 | - | - | - | - | - |
| 1.7307 | 140 | 2.4715 | - | - | - | - | - |
| 1.8545 | 150 | 2.7462 | - | - | - | - | - |
| 1.9783 | 160 | 2.5805 | - | - | - | - | - |
| 2.0 | 162 | - | 0.5162 | 0.5142 | 0.5040 | 0.4778 | 0.4242 |
| 2.0991 | 170 | 2.0474 | - | - | - | - | - |
| 2.2229 | 180 | 1.9431 | - | - | - | - | - |
| 2.3467 | 190 | 2.0218 | - | - | - | - | - |
| 2.4706 | 200 | 1.8881 | - | - | - | - | - |
| 2.5944 | 210 | 1.6105 | - | - | - | - | - |
| 2.7183 | 220 | 1.9675 | - | - | - | - | - |
| 2.8421 | 230 | 1.6917 | - | - | - | - | - |
| 2.9659 | 240 | 1.8939 | - | - | - | - | - |
| 3.0 | 243 | - | 0.5188 | 0.5175 | 0.5097 | 0.4840 | 0.4303 |
| 3.0867 | 250 | 1.8625 | - | - | - | - | - |
| 3.2105 | 260 | 1.7864 | - | - | - | - | - |
| 3.3344 | 270 | 1.6404 | - | - | - | - | - |
| 3.4582 | 280 | 1.6378 | - | - | - | - | - |
| 3.5820 | 290 | 1.8484 | - | - | - | - | - |
| 3.7059 | 300 | 1.7864 | - | - | - | - | - |
| 3.8297 | 310 | 1.5436 | - | - | - | - | - |
| 3.9536 | 320 | 1.3438 | 0.5188 | 0.5183 | 0.51 | 0.4848 | 0.4332 |
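The training settings combine a per-device batch size of 16 with 32 gradient accumulation steps, so each optimizer step covers 16 × 32 = 512 pairs per device. The sketch below shows how a cosine schedule with a 0.1 warmup ratio could shape the learning rate over the run. It is an illustration of the common warmup-then-cosine formula, not the trainer's code; the total of 4 × 81 steps is an assumption read off the log above, and `lr_at_step` is a hypothetical helper.

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Assumption: 81 optimizer steps per epoch (epoch 1.0 is logged at step 81), 4 epochs
TOTAL_STEPS = 4 * 81

for step in (0, 10, 33, 162, 324):
    print(step, f"{lr_at_step(step, TOTAL_STEPS):.2e}")
```

Under these assumptions the learning rate peaks at 2e-05 once warmup ends and falls smoothly toward zero by the final step.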
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
answerdotai/ModernBERT-base