This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False, 'architecture': 'XLMRobertaModel'})
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
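The module list above can be read as a three-stage pipeline: the XLM-RoBERTa transformer produces one 1024-dimensional vector per token, the pooling layer keeps only the CLS (first) token, and Normalize() scales the result to unit length. Below is a minimal NumPy sketch of the last two stages. It uses random numbers in place of real transformer output, and the function name `cls_pool_and_normalize` is hypothetical, not part of the library.

```python
import numpy as np


def cls_pool_and_normalize(token_embeddings: np.ndarray) -> np.ndarray:
    """Map (batch, seq_len, hidden) token vectors to (batch, hidden) unit vectors."""
    # CLS pooling: keep only the first token's vector for each sentence.
    cls_vectors = token_embeddings[:, 0, :]
    # L2 normalization: divide each vector by its Euclidean length.
    norms = np.linalg.norm(cls_vectors, axis=1, keepdims=True)
    return cls_vectors / np.clip(norms, 1e-12, None)


# Stand-in for transformer output: 3 sentences, 256 tokens, 1024 dimensions.
rng = np.random.default_rng(0)
fake_token_embeddings = rng.normal(size=(3, 256, 1024))

sentence_vectors = cls_pool_and_normalize(fake_token_embeddings)
print(sentence_vectors.shape)
print(np.linalg.norm(sentence_vectors, axis=1))
```

Because every output vector has unit length, the dot product of two embeddings equals their cosine similarity.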
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
'What does a guided walk through Hawaii’s hidden gems and vibrant street art look like?',
"but you definitely need to visit the kakaako market for this reason. now, number nine is probably the biggest hidden gem. well, maybe it's not hidden anymore, but it's a gem on this list, which is to visit makua beach, but not just to have a beach day. we're going to swim with dolphins, if possible, because a little bit of information about makua beach is that it's on the northwest part of the island, which is a lot less visited and frequented by tourists. you usually need a rental car to get out to this area. it's about an hour drive from waikiki, so keep that in mind. but this beach just visually is so amazing. this is like what you'd expect a beach to be in oahu, hawaii that is less touched. it's less tainted by tourists. so i'm sorry if i'm doing my part to contribute to that, but i always try and, my little spiel here, encourage, you know, responsible tourism, clean up, pack up where you leave. don't be super loud and disrupt local people. keep to yourself and be",
"might be like walking around and exploring. and it is like the ultimate vibe spot. it's become a little bit of a party spot. so some people might drink and like listen to music locals, not just like tourists. so it's cool to kind of get a little bit of a peek into that sort of lifestyle, but also here people go cliff jumping into the water and it's a lot of fun, but be very careful because the waves can be slightly treacherous if you know you're not being careful because you know you're in the ocean. it's not like a still pond. so do that at your own risk and be safe, of course. but if not, you can just watch other people do it, have some fun, take in the vibes and stay for a magical sunset that will leave you hopefully having the perfect day and perfect trip on oahu, hawaii at china walls. now, number eight, we haven't talked about food in a long time. so let's move on to something where we can fill our bellies and really indulge and go crazy with, which is to visit the kakaako farmer's market. now, this isn't like your typical farmer's market where you just buy some produce from the local people. you can certainly do that here, but it's like a massive food festival because there are two parking lots that are full of vendors making all kinds of food. there's even some like art and",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.6652, 0.4653],
# [0.6652, 1.0000, 0.5162],
# [0.4653, 0.5162, 1.0000]])
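For semantic search, the first row of that matrix is the part that matters: it scores the query (sentence 0) against each passage. The sketch below ranks the two passages using the scores printed above, copied into a NumPy array so that it runs without downloading the model. The helper `rank_passages` is a hypothetical name used only for illustration.

```python
import numpy as np

# Similarity scores copied from the printed output above.
similarities = np.array([
    [1.0000, 0.6652, 0.4653],
    [0.6652, 1.0000, 0.5162],
    [0.4653, 0.5162, 1.0000],
])


def rank_passages(sim_matrix: np.ndarray, query_index: int = 0) -> list[int]:
    """Return the other row indices, sorted by similarity to the query (best first)."""
    scores = sim_matrix[query_index]
    candidates = [i for i in range(len(scores)) if i != query_index]
    return sorted(candidates, key=lambda i: scores[i], reverse=True)


ranking = rank_passages(similarities)
print(ranking)  # [1, 2]
```

Here the first passage (index 1, score 0.6652) ranks above the second (index 2, score 0.4653).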
Columns: anchor, positive, and negative

| | anchor | positive | negative |
|---|---|---|---|
| type | string | string | string |

Samples:

| anchor | positive | negative |
|---|---|---|
| How to use Hawaii’s local transportation? | faces the sunset and it is a fun place to watch it. it really is. yeah. i got to say waikiki gets a bad rap for sometimes being a bit too crowded, but sunset, the vibe is great. there's often live music. everybody's just enjoying watching that sun go down. so it's a great place to start your vacation. you can even hop in one of the catamarans that has beach loading and go out sailing and watch the sunset that way. that's a great way to start. there you go. gosh, we're good. oh my gosh. you're lucky you found this channel. day number two. and guess what? you are up early, my friend, because you have jet lag. real bad. real bad. especially if you have kids with you. wow. good luck. so we are going to take advantage of that today and you are going to head up to the north shore to go snorkeling at kualima cove or just enjoy a nice beach day up there. kualima cove is at turtle bay resort, which is like five-star luxury resort that's been remodeled, but all beaches in hawaii are public. and ... | 50% of visitors go to oahu for their first time. yes. then after that, we're also including island hopping because we're doing a 10-day itinerary for this. so we recommend going to maui. maui's a great second location. get a different feel for the hawaiian islands. yes. get this. 68% of people come back to hawaii for a second trip. so don't stress about what island to go to for your second time. just enjoy this video. enjoy the itinerary. get a feel for the islands. yes. day one. you land in honolulu airport. you hop in a rental car or take a shuttle because waikiki is only 15 to 20 minutes away. we recommend staying in waikiki because it's centrally located. after that long flight, you're probably craving a relaxing cocktail in sunset. so go to the moana surf rider. they have a beachside bar, which is one of our favorites. grab a cocktail. watch the sun go down because waikiki faces the sunset and it is a fun place to watch it. it really is. yeah. i got to say waikiki gets a bad rap f... |
| How to use Hawaii’s local transportation? | so are you going to oahu, hawaii and you're either too cheap or broke to get a rental car for your time there? there's no judgment in that question because that was my mentality and you definitely can have an amazing time in oahu, hawaii even for like a week or so without any rental car whatsoever. i was there for a month and mostly got by without a rental car so i'm definitely very qualified to give you an amazing seven-day one-week itinerary for oahu, hawaii with no rental car. that being said let's start off with day number zero or arrival day. so unless you're a very strong swimmer or taking a very fast speedboat you'll be arriving at the daniel k. inouye international airport. probably a little bit jet lagged. it's definitely a little bit tired so let's just get to your hotel and check in. i do recommend staying in waikiki which people might kind of roll their eyes or scoff at because it's very touristic. it's like a little bit of a concrete jungle in the middle of hawaii built on... | back to hawaii for a second trip. so don't stress about what island to go to for your second time. just enjoy this video. enjoy the itinerary. get a feel for the islands. yes. day one. you land in honolulu airport. you hop in a rental car or take a shuttle because waikiki is only 15 to 20 minutes away. we recommend staying in waikiki because it's centrally located. after that long flight, you're probably craving a relaxing cocktail in sunset. so go to the moana surf rider. they have a beachside bar, which is one of our favorites. grab a cocktail. watch the sun go down because waikiki faces the sunset and it is a fun place to watch it. it really is. yeah. i got to say waikiki gets a bad |
| How to use Hawaii’s local transportation? | the hui local bus which is about 20 to 30 minutes or you can walk which is about 50 minutes which hey do you want to walk there and burn out your legs a little bit? it's up to you, depends on how much energy you have to spare. either way the hike isn't so difficult. it costs about five dollars to enter, walking that is, which you'll be doing because you don't have a car of course, and i believe nowadays you have to either reserve a time slot or something like that. it's also very busy so the earlier you go the better. the hike itself is really cool, it's not too difficult like i said, you go through a little bit of a tunnel and at the end you will get some amazing views of waikiki, waikiki beach, and some other parts of the island which you're probably super excited to explore which you'll definitely do on other days besides today. so once you finish hiking diamond head, return back down the way you came, it's an out and back trail, and head to the nearby neighborhood called kaimuki. n... | hawaii vacation guide. we wanted to make a first time to hawaii itinerary video, but we didn't want to make a boring one, which is actually like kind of difficult when you think about it. so we didn't really want to do like a step by step by step by step and give you all the details because it would probably be like an hour long. so what we decided to do was think of this as like your amuse-bouche, right? it is like it's like warming up your taste buds. then if you head to the link in |

MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim",
    "gather_across_devices": false,
    "directions": [
        "query_to_doc"
    ],
    "partition_mode": "joint",
    "hardness_mode": null,
    "hardness_strength": 0.0
}
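To make the `scale` and `similarity_fct` settings concrete, here is a simplified NumPy sketch of a multiple-negatives ranking objective in the query-to-document direction. Each anchor is scored against every positive and every negative in the batch using cosine similarity, the scores are multiplied by the scale (20.0), and cross-entropy is taken with the anchor's own positive as the correct class. This is an illustration under stated assumptions, not the library's implementation: it ignores `gather_across_devices`, `partition_mode`, and the hardness options, and `mnrl_loss` is a hypothetical name.

```python
import numpy as np


def mnrl_loss(anchors, positives, negatives, scale=20.0):
    """Simplified multiple-negatives ranking loss (query_to_doc direction only)."""
    def normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    a = normalize(anchors)
    # Candidates: all in-batch positives first, then all explicit negatives.
    candidates = normalize(np.concatenate([positives, negatives], axis=0))
    # Scaled cosine similarity of every anchor against every candidate.
    logits = scale * (a @ candidates.T)
    # Numerically stable log-softmax over the candidates.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The correct candidate for anchor i is positive i.
    idx = np.arange(len(a))
    return float(-log_probs[idx, idx].mean())


rng = np.random.default_rng(42)
anchors = rng.normal(size=(4, 16))
positives = anchors + 0.05 * rng.normal(size=(4, 16))  # close to their anchors
negatives = rng.normal(size=(4, 16))                   # unrelated vectors

aligned = mnrl_loss(anchors, positives, negatives)
shuffled = mnrl_loss(anchors, np.roll(positives, 1, axis=0), negatives)
print(f"aligned: {aligned:.4f}, shuffled: {shuffled:.4f}")
```

The loss is low when each anchor is closest to its own positive and rises sharply when the pairing is shuffled, which is the signal that pulls matching query-passage pairs together during fine-tuning.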
Non-default hyperparameters:

- num_train_epochs: 4
- learning_rate: 2e-05
- warmup_steps: 90
- gradient_accumulation_steps: 4
- fp16: True
- dataloader_drop_last: True

All hyperparameters:

- per_device_train_batch_size: 8
- num_train_epochs: 4
- max_steps: -1
- learning_rate: 2e-05
- lr_scheduler_type: linear
- lr_scheduler_kwargs: None
- warmup_steps: 90
- optim: adamw_torch_fused
- optim_args: None
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- optim_target_modules: None
- gradient_accumulation_steps: 4
- average_tokens_across_devices: True
- max_grad_norm: 1.0
- label_smoothing_factor: 0.0
- bf16: False
- fp16: True
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- use_liger_kernel: False
- liger_kernel_config: None
- use_cache: False
- neftune_noise_alpha: None
- torch_empty_cache_steps: None
- auto_find_batch_size: False
- log_on_each_node: True
- logging_nan_inf_filter: True
- include_num_input_tokens_seen: no
- log_level: passive
- log_level_replica: warning
- disable_tqdm: False
- project: huggingface
- trackio_space_id: trackio
- eval_strategy: no
- per_device_eval_batch_size: 8
- prediction_loss_only: True
- eval_on_start: False
- eval_do_concat_batches: True
- eval_use_gather_object: False
- eval_accumulation_steps: None
- include_for_metrics: []
- batch_eval_metrics: False
- save_only_model: False
- save_on_each_node: False
- enable_jit_checkpoint: False
- push_to_hub: False
- hub_private_repo: None
- hub_model_id: None
- hub_strategy: every_save
- hub_always_push: False
- hub_revision: None
- load_best_model_at_end: False
- ignore_data_skip: False
- restore_callback_states_from_checkpoint: False
- full_determinism: False
- seed: 42
- data_seed: None
- use_cpu: False
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- parallelism_config: None
- dataloader_drop_last: True
- dataloader_num_workers: 0
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- dataloader_prefetch_factor: None
- remove_unused_columns: True
- label_names: None
- train_sampling_strategy: random
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- ddp_backend: None
- ddp_timeout: 1800
- fsdp: []
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- deepspeed: None
- debug: []
- skip_memory_metrics: True
- do_predict: False
- resume_from_checkpoint: None
- warmup_ratio: None
- local_rank: -1
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}

| Epoch | Step | Training Loss |
|---|---|---|
| 0.2203 | 50 | 0.9769 |
| 0.4405 | 100 | 0.6070 |
| 0.6608 | 150 | 0.4486 |
| 0.8811 | 200 | 0.4155 |
| 1.1013 | 250 | 0.3437 |
| 1.3216 | 300 | 0.2431 |
| 1.5419 | 350 | 0.2671 |
| 1.7621 | 400 | 0.2587 |
| 1.9824 | 450 | 0.2341 |
| 2.2026 | 500 | 0.1873 |
| 2.4229 | 550 | 0.1768 |
| 2.6432 | 600 | 0.1477 |
| 2.8634 | 650 | 0.1711 |
| 3.0837 | 700 | 0.1344 |
| 3.3040 | 750 | 0.1316 |
| 3.5242 | 800 | 0.1326 |
| 3.7445 | 850 | 0.1211 |
| 3.9648 | 900 | 0.1052 |
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{oord2019representationlearningcontrastivepredictive,
title={Representation Learning with Contrastive Predictive Coding},
author={Aaron van den Oord and Yazhe Li and Oriol Vinyals},
year={2019},
eprint={1807.03748},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/1807.03748},
}