Surpem/Supertron-embedding-300M

@@ -1,6 +1,20 @@
 ---
 model-index:
-- name: Gemma-Embedding-300m-Finetuned
   results:
   - task:
       type: STS
@@ -70,20 +84,22 @@ model-index:
       value: 50.01057211780597
 ---
-# Gemma-Embedding-300m-Finetuned
 ## Model Description
-This model is a fine-tuned version of the google/embeddinggemma-300m architecture. It has been optimized for semantic textual similarity (STS), retrieval, and classification tasks. The model represents a high-efficiency solution for embedding generation, providing a favorable balance between computational overhead and semantic accuracy.
-- **Base Model:** google/embeddinggemma-300m
-- **Maximum Sequence Length:** 256 tokens
-- **Output Dimensionality:** 1024
-- **Language:** English
-## Evaluation Results
-The model has been benchmarked using the Massive Text Embedding Benchmark (MTEB). The following table summarizes its performance across various task categories:
 | Task Category | Task Name | Metric | Score |
 | :--- | :--- | :--- | :--- |
@@ -94,27 +110,47 @@ The model has been benchmarked using the Massive Text Embedding Benchmark (MTEB)
 | Classification | AmazonCounterfactual | Accuracy | 83.34 |
 | Clustering | TwentyNewsgroups | V-Measure | 50.01 |
-## Usage
-### Sentence-Transformers
-The model can be implemented directly using the `sentence-transformers` library:
 ```python
 from sentence_transformers import SentenceTransformer
-# Load the model from the Hugging Face Hub
-model = SentenceTransformer("your-username/Gemma-Embedding-300m-Finetuned")
-# Define input text
 sentences = [
-    "The atmospheric conditions are favorable for flight.",
-    "The weather is good for flying today."
 ]
-# Generate embeddings
 embeddings = model.encode(sentences)
-# Calculate semantic similarity
 similarity = model.similarity(embeddings[0], embeddings[1])
-print(similarity)

 ---
+license: apache-2.0
+language:
+- en
+base_model:
+- google/embeddinggemma-300m
+pipeline_tag: sentence-similarity
+library_name: sentence-transformers
+tags:
+- mteb
+- sentence-transformers
+- feature-extraction
+- sentence-similarity
+- transformers
+- pytorch
 model-index:
+- name: Supertron-embedding-300M
   results:
   - task:
       type: STS
       value: 50.01057211780597
 ---
+# Supertron-embedding-300M: High-Efficiency Semantic Representation Model
 ## Model Description
+Supertron-embedding-300M is a compact embedding model fine-tuned from google/embeddinggemma-300m. It is designed to produce semantic representations for Retrieval-Augmented Generation (RAG), semantic search, and document clustering while keeping a low computational footprint suitable for production environments.
+* **Developed by:** Surpem
+* **Model Type:** Sentence Transformer
+* **Architecture:** Gemma-based Dense Transformer
+* **Base Model:** [google/embeddinggemma-300m](https://huggingface.co/google/embeddinggemma-300m)
+* **License:** Apache 2.0
+* **Language:** English (en)
+## Results
+Supertron-embedding-300M shows competitive performance on the Massive Text Embedding Benchmark (MTEB). It is particularly effective on Semantic Textual Similarity (STS) tasks for a model of its size.
 | Task Category | Task Name | Metric | Score |
 | :--- | :--- | :--- | :--- |
 | Classification | AmazonCounterfactual | Accuracy | 83.34 |
 | Clustering | TwentyNewsgroups | V-Measure | 50.01 |
+## Get Started
+The model can be loaded with the `sentence-transformers` library:
 ```python
 from sentence_transformers import SentenceTransformer
+model_id = "surpem/Supertron-embedding-300M"
+# Load the model
+model = SentenceTransformer(model_id)
+# Define target text
 sentences = [
+    "The financial results exceeded market expectations.",
+    "The company reported better than expected quarterly earnings."
 ]
+# Compute embeddings
 embeddings = model.encode(sentences)
+# Calculate cosine similarity
 similarity = model.similarity(embeddings[0], embeddings[1])
+print(f"Semantic Similarity: {similarity.item():.4f}")
+```
+## Training Procedure
+### Hyperparameters
+* **Precision:** bfloat16
+* **Max Sequence Length:** 256 tokens
+* **Optimizer:** AdamW
+* **Batch Size:** 256
+* **Learning Rate:** 2e-5
+## Citation
+```bibtex
+@misc{surpem2026supertron,
+      title={Supertron-embedding-300M: High-Efficiency Semantic Representation Model},
+      author={Surpem},
+      year={2026},
+      url={https://huggingface.co/surpem/Supertron-embedding-300M},
+}
+```
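
The Get Started snippet labels the value returned by `model.similarity` as cosine similarity. As a rough illustration of that quantity, and not the library's own implementation, the sketch below computes it in plain Python on toy vectors. The function name `cosine_similarity` and the vectors `u`, `v`, and `w` are assumptions made for this example and do not come from the model card.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy 3-dimensional vectors standing in for real embeddings (hypothetical values).
u = [1.0, 2.0, 3.0]
v = [2.0, 4.0, 6.0]   # same direction as u, so the similarity is close to 1
w = [-3.0, 0.0, 1.0]  # orthogonal to u, so the similarity is 0

print(f"u vs v: {cosine_similarity(u, v):.4f}")
print(f"u vs w: {cosine_similarity(u, w):.4f}")
```

Because the score depends only on direction and not on vector length, scaling an embedding does not change its similarity to another embedding.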