SentenceTransformer based on google/embeddinggemma-300m

This is a sentence-transformers model fine-tuned from google/embeddinggemma-300m on the json dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: google/embeddinggemma-300m
Maximum Sequence Length: 2048 tokens
Output Dimensionality: 768 dimensions
Similarity Function: Cosine Similarity
Training Dataset:
- json
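
As a minimal, dependency-free sketch of the similarity function listed above: cosine similarity compares the direction of two vectors and ignores their length. The toy 3-dimensional vectors below stand in for the model's 768-dimensional embeddings, and the function names are illustrative, not part of any library.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Toy vectors (hypothetical values, not real model output)
query = [1.0, 2.0, 2.0]
same_direction = [2.0, 4.0, 4.0]   # parallel to query, twice as long
orthogonal = [2.0, -1.0, 0.0]      # dot product with query is zero

print(cosine_similarity(query, same_direction))
print(cosine_similarity(query, orthogonal))
```

For vectors that are already unit length, the denominator is 1 and the score is just the dot product.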

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 2048, 'do_lower_case': False}) with Transformer model: Gemma3TextModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Dense({'in_features': 768, 'out_features': 3072, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (3): Dense({'in_features': 3072, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
  (4): Normalize()
)
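
The module list above can be read as a pipeline: mean pooling over token vectors, two bias-free linear layers (768 to 3072, then 3072 to 768) with identity activation, and L2 normalization. The following is a toy sketch of that flow in plain Python, with tiny hypothetical dimensions (2 and 4 instead of 768 and 3072) and made-up weights. It illustrates the data flow only and is not the library's implementation.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, ignoring padding positions (mask == 0)."""
    kept = [vec for vec, m in zip(token_embeddings, attention_mask) if m]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def linear_no_bias(x, weight):
    """y = W x, with W stored as out_features rows of in_features values."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]

def l2_normalize(x):
    """Scale x to unit length (a zero vector is returned unchanged)."""
    norm = math.sqrt(sum(v * v for v in x))
    return [v / norm for v in x] if norm > 0 else list(x)

def embed(token_embeddings, attention_mask, w_up, w_down):
    pooled = mean_pool(token_embeddings, attention_mask)   # module (1)
    hidden = linear_no_bias(pooled, w_up)                  # module (2)
    projected = linear_no_bias(hidden, w_down)             # module (3)
    return l2_normalize(projected)                         # module (4)

# Three token vectors of width 2; the last one is padding.
tokens = [[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 0]
w_up = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]   # 4 x 2, hypothetical
w_down = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]      # 2 x 4, hypothetical

vector = embed(tokens, mask, w_up, w_down)
print(vector)
```

Because both Dense layers use an identity activation and no bias, their composition is mathematically a single linear map; the final vector always has unit length.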

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("congvm/embeddinggemma-300M-triplet-vn-10000-20250924")
# Run inference
sentences = [
    'task: sentence similarity | query: Đất tiềm năng sinh lời cao trong khu tam giác vàng Củ Chi',
    'task: sentence similarity | query: Tiêu đề: cần tiền kinh doanh ra gấp đất bình mỹ gardern\n\n\nMô tả: - diện tích 80m2 (5mx16m), mặt đường nhựa 7,5m.\n- cách võ văn bích 200m, vành đai 3 chỉ 1,5km.\n- khu tam giác vàng của củ chi.\n- tiềm năng sinh lời cao.\n- hình chân thực, chính chủ.',
    'task: sentence similarity | query: Tiêu đề: chùa hà - lô góc - mặt ngõ kinh doanh - ô tô đỗ cửa - 17 tỷ\n\n\nMô tả: chùa hà - lô góc - mặt ngõ kinh doanh - ô tô đỗ cửa - 17 tỷ\n\n* diện tích 38m2 - mặt tiền 4.6m\n\n* vị trí: trung tâm quận cầu giấy, ngõ thông ô tô vào nhà. mặt ngõ kinh doanh sầm uất.\n\n* thiết kế 5 tầng chắc chắn, lô góc thoáng sáng:\n- t1: khách, bếp, wc\n- t2,3,4: mỗi tầng 2 phòng ngủ, wc\n- t5: phòng thờ, sân phơi\n\n* sổ đỏ vuông a4\nliên hệ duy:',
]
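# Encode all texts; the result is a NumPy array with one 768-dimensional row per input
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Compute the pairwise similarity scores (cosine similarity for this model)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])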
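
Every text in the example above, the short search phrase and the two listings alike, starts with the same prefix that was used as the training prompt. The helpers below are a small sketch of how one might apply that prefix consistently and turn one row of similarity scores into a ranking. The function names and the example scores are hypothetical, not part of the library or real model output.

```python
# Prefix taken from the usage example and the `prompts` training setting
PROMPT = "task: sentence similarity | query: "

def add_prompt(text, prompt=PROMPT):
    """Prefix `text` with the prompt unless it is already present."""
    return text if text.startswith(prompt) else prompt + text

def rank_candidates(scores):
    """Return candidate indices ordered from most to least similar."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical similarity scores of one query against three listings
example_scores = [0.31, 0.78, 0.12]
print(rank_candidates(example_scores))  # [1, 0, 2]

print(add_prompt("Căn hộ gần Big C và trường học Quận 2"))
```

With the real model, the prefixed texts would be passed to `model.encode(...)` and a row of `model.similarity(...)` would take the place of `example_scores`.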

Training Details

Training Dataset

json

Dataset: json
Size: 690,222 training samples
Columns: query, answer, and negative

Approximate statistics based on the first 1000 samples:

	query	answer	negative
type	string	string	string
min tokens	20	61	65
mean tokens	28.1	231.73	238.34
max tokens	40	694	906

Samples:

query	answer	negative
`task: sentence similarity`	query: Cho thuê nhà trọ trệt gác trần gần trường học Tân Hưng Q7 giá 4.5 triệu	`task: sentence similarity`
`task: sentence similarity`	query: Cho thuê nhà trọ trệt gác trần gần trường học Tân Hưng Q7 giá 4.5 triệu	`task: sentence similarity`
`task: sentence similarity`	query: Cho thuê nhà trọ trệt gác trần gần trường học Tân Hưng Q7 giá 4.5 triệu	`task: sentence similarity`

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
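
To make the two parameters concrete, here is a minimal plain-Python sketch of how a multiple-negatives ranking objective is commonly computed for (query, answer, negative) triplets: each query is scored against every answer and every negative in the batch using cosine similarity, the scores are multiplied by `scale`, and a softmax cross-entropy rewards the query's own answer. The toy 2-dimensional vectors are made up, and this is an illustration of the idea rather than the library's code.

```python
import math

def cos_sim(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def mnr_loss(queries, answers, negatives, scale=20.0):
    """Mean cross-entropy where answers[i] is the correct candidate for queries[i].

    All answers and all negatives in the batch act as candidates, so the
    other samples' answers serve as additional (in-batch) negatives.
    """
    candidates = answers + negatives
    total = 0.0
    for i, query in enumerate(queries):
        logits = [scale * cos_sim(query, c) for c in candidates]
        # Numerically stable log-sum-exp
        top = max(logits)
        log_z = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_z - logits[i]
    return total / len(queries)

# Hypothetical batch of two triplets
queries = [[1.0, 0.0], [0.0, 1.0]]
answers = [[1.0, 0.1], [0.1, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

print(mnr_loss(queries, answers, negatives))        # answers match their queries
print(mnr_loss(queries, answers[::-1], negatives))  # answers swapped
```

The first call yields a much smaller loss than the second, because in the swapped case each query is closer to another sample's answer than to its own.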

Evaluation Dataset

json

Dataset: json
Size: 192,893 evaluation samples
Columns: query, answer, and negative

Approximate statistics based on the first 1000 samples:

	query	answer	negative
type	string	string	string
min tokens	18	107	62
mean tokens	27.09	231.66	238.36
max tokens	37	728	1032

Samples:

query	answer	negative
`task: sentence similarity`	query: Căn hộ gần Big C và trường học Quận 2	`task: sentence similarity`
`task: sentence similarity`	query: Căn hộ gần Big C và trường học Quận 2	`task: sentence similarity`
`task: sentence similarity`	query: Căn hộ gần Big C và trường học Quận 2	`task: sentence similarity`

Loss: MultipleNegativesRankingLoss with these parameters:

{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
learning_rate: 2e-05
num_train_epochs: 5
warmup_ratio: 0.1
fp16: True
dataloader_num_workers: 8
prompts: task: sentence similarity | query:
batch_sampler: no_duplicates
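
As an illustration of how `learning_rate: 2e-05` and `warmup_ratio: 0.1` interact under a linear scheduler (the `lr_scheduler_type` reported for this run), the sketch below computes the learning rate at a given step: a linear ramp from 0 to the peak during warmup, followed by a linear decay to 0. The total step count used here is a hypothetical round number, since the card does not state the run's total, and the function is a hand-written sketch rather than the trainer's own code.

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=2e-05, warmup_ratio=0.1):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup steps
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

TOTAL_STEPS = 1000  # hypothetical, chosen only to keep the numbers readable

for step in (0, 50, 100, 550, 1000):
    print(step, linear_schedule_lr(step, TOTAL_STEPS))
```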

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 8
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 1
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 5
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: True
dataloader_num_workers: 8
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
parallelism_config: None
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
hub_revision: None
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
liger_kernel_config: None
eval_use_gather_object: False
average_tokens_across_devices: True
prompts: task: sentence similarity | query:
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
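
The `batch_sampler: no_duplicates` setting matters for an in-batch-negatives loss: if the same text appeared twice in a batch, a correct answer could be treated as a negative for another sample. The following is a simplified greedy sketch of the idea, assuming each sample is a tuple of texts. The function name is hypothetical and the logic is not the library's sampler.

```python
def no_duplicate_batches(samples, batch_size):
    """Greedily group samples so that no text occurs twice within a batch.

    Samples that would introduce a repeated text are deferred to a later
    batch. A final incomplete batch is dropped, mirroring
    `dataloader_drop_last: True`.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    remaining = list(samples)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for sample in remaining:
            texts = set(sample)
            if len(batch) < batch_size and seen.isdisjoint(texts):
                batch.append(sample)
                seen |= texts
            else:
                deferred.append(sample)
        if len(batch) < batch_size:
            break  # incomplete batch: drop it and stop
        batches.append(batch)
        remaining = deferred
    return batches

# Hypothetical triplets; the first two share the same query text
samples = [
    ("q1", "a1", "n1"),
    ("q1", "a2", "n2"),
    ("q2", "a3", "n3"),
    ("q3", "a4", "n4"),
]
for batch in no_duplicate_batches(samples, batch_size=2):
    print(batch)
```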

Training Logs

Epoch	Step	Training Loss	Validation Loss
0.0093	100	0.9919	-
0.0185	200	0.2415	-
0.0278	300	0.1604	-
0.0371	400	0.1351	-
0.0464	500	0.1141	-
0.0556	600	0.0998	-
0.0649	700	0.0962	-
0.0742	800	0.0865	-
0.0835	900	0.0871	-
0.0927	1000	0.0798	-
0.1020	1100	0.0764	-
0.1113	1200	0.0754	-
0.1205	1300	0.08	-
0.1298	1400	0.0767	-
0.1391	1500	0.0685	-
0.1484	1600	0.0761	-
0.1576	1700	0.0727	-
0.1669	1800	0.0742	-
0.1762	1900	0.0666	-
0.1855	2000	0.0725	-
0.1947	2100	0.0703	-
0.2040	2200	0.0728	-
0.2133	2300	0.0693	-
0.2226	2400	0.0669	-
0.2318	2500	0.0707	-
0.2411	2600	0.0657	-
0.2504	2700	0.068	-
0.2596	2800	0.0681	-
0.2689	2900	0.0717	-
0.2782	3000	0.0671	-
0.2875	3100	0.0652	-
0.2967	3200	0.0664	-
0.3060	3300	0.0671	-
0.3153	3400	0.0675	-
0.3246	3500	0.0688	-
0.3338	3600	0.0718	-
0.3431	3700	0.0689	-
0.3524	3800	0.0672	-
0.3616	3900	0.0663	-
0.3709	4000	0.0744	-
0.3802	4100	0.0662	-
0.3895	4200	0.0703	-
0.3987	4300	0.0709	-
0.4080	4400	0.0733	-
0.4173	4500	0.067	-
0.4266	4600	0.071	-
0.4358	4700	0.0715	-
0.4451	4800	0.0813	-
0.4544	4900	0.0712	-
0.4636	5000	0.0685	0.0865
0.4729	5100	0.0619	-
0.4822	5200	0.0693	-
0.4915	5300	0.0667	-
0.5007	5400	0.0719	-
0.5100	5500	0.0683	-
0.5193	5600	0.0712	-
0.5286	5700	0.0615	-
0.5378	5800	0.0732	-
0.5471	5900	0.0666	-
0.5564	6000	0.0657	-
0.5657	6100	0.0686	-
0.5749	6200	0.0633	-
0.5842	6300	0.0716	-
0.5935	6400	0.0626	-
0.6027	6500	0.0653	-
0.6120	6600	0.0595	-
0.6213	6700	0.0682	-
0.6306	6800	0.0588	-
0.6398	6900	0.0603	-
0.6491	7000	0.0582	-
0.6584	7100	0.0574	-
0.6677	7200	0.0578	-
0.6769	7300	0.0593	-
0.6862	7400	0.0611	-
0.6955	7500	0.0577	-
0.7047	7600	0.058	-
0.7140	7700	0.0531	-
0.7233	7800	0.0556	-
0.7326	7900	0.0559	-
0.7418	8000	0.0481	-
0.7511	8100	0.0572	-
0.7604	8200	0.0553	-
0.7697	8300	0.0535	-
0.7789	8400	0.0534	-
0.7882	8500	0.0541	-
0.7975	8600	0.0504	-
0.8068	8700	0.0538	-
0.8160	8800	0.0485	-
0.8253	8900	0.0465	-
0.8346	9000	0.0527	-
0.8438	9100	0.045	-
0.8531	9200	0.047	-
0.8624	9300	0.0486	-
0.8717	9400	0.0463	-
0.8809	9500	0.0458	-
0.8902	9600	0.0471	-
0.8995	9700	0.0392	-
0.9088	9800	0.0411	-
0.9180	9900	0.0441	-
0.9273	10000	0.0479	0.0785
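
The Epoch column can be cross-checked against the dataset size with a little arithmetic. The card does not state how many devices were used, so the device count below is an assumption: with 8 devices, `per_device_train_batch_size: 8`, no gradient accumulation, and the last incomplete batch dropped, one epoch would be 690,222 // 64 = 10,784 steps, which reproduces the logged epoch fractions.

```python
TRAIN_SAMPLES = 690_222   # size reported for the training dataset
PER_DEVICE_BATCH = 8      # per_device_train_batch_size

def steps_per_epoch(num_devices):
    """Optimizer steps per epoch, dropping the last incomplete batch."""
    return TRAIN_SAMPLES // (PER_DEVICE_BATCH * num_devices)

def epoch_at(step, num_devices):
    """Fraction of an epoch completed at `step`, rounded as in the log."""
    return round(step / steps_per_epoch(num_devices), 4)

# num_devices=8 is an assumption; it is not stated anywhere in the card.
for step in (100, 5000, 10000):
    print(step, epoch_at(step, num_devices=8))
```

Under the same assumption, the final logged row (step 10000) corresponds to roughly 0.93 of one epoch.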

Framework Versions

Python: 3.12.9
Sentence Transformers: 4.1.0
Transformers: 4.56.2
PyTorch: 2.6.0+cu118
Accelerate: 1.6.0
Datasets: 3.5.1
Tokenizers: 0.22.1
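
To compare a local environment with the versions listed above, something like the following standard-library sketch could be used. The package names are the usual PyPI distribution names for these libraries, and the helper function is illustrative.

```python
from importlib import metadata

# Versions as listed in this card
CARD_VERSIONS = {
    "sentence-transformers": "4.1.0",
    "transformers": "4.56.2",
    "torch": "2.6.0+cu118",
    "accelerate": "1.6.0",
    "datasets": "3.5.1",
    "tokenizers": "0.22.1",
}

def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found

for name, have in installed_versions(CARD_VERSIONS).items():
    print(f"{name}: card={CARD_VERSIONS[name]} installed={have}")
```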

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}