calneymgp committed on
Commit fb1dfff · verified · 1 Parent(s): bb33fec

docs: add base_model field + tamz.ai backlinks

Files changed (1)
  1. README.md +6 -2
README.md CHANGED
@@ -3,6 +3,7 @@ language:
3
  - pt
4
  license: apache-2.0
5
  library_name: sentence-transformers
 
6
  tags:
7
  - sentence-transformers
8
  - feature-extraction
@@ -53,6 +54,8 @@ model-index:
53
 
54
  Fine-tuned from [`ibm-granite/granite-embedding-97m-multilingual-r2`](https://huggingface.co/ibm-granite/granite-embedding-97m-multilingual-r2) using a **Karpathy-style autoresearch loop**: 36 autonomous training iterations on an RTX 5090, each proposing and self-validating its own strategy. **35 out of 36 iterations improved the model (97% acceptance rate).**
55
 
 
 
56
  ---
57
 
58
  ## MTEB Benchmark Results (PT-BR)
@@ -186,7 +189,7 @@ embeddings = model.encode(texts, truncate_dim=128)
186
  ## Best For
187
 
188
  ✅ Semantic search over Brazilian business data
189
- ✅ B2B lead discovery and company matching
190
  ✅ Company similarity, clustering, deduplication
191
  ✅ PT-BR RAG pipelines with business documents
192
  ✅ Memory systems for Portuguese AI agents
@@ -222,6 +225,7 @@ Apache 2.0, same as the base IBM Granite Embedding model.
222
  publisher = {HuggingFace},
223
  url = {https://huggingface.co/calneymgp/braza-embedding-ptbr-v1},
224
  note = {Fine-tuned from IBM Granite 97M on 474K Brazilian B2B companies
225
- using 36-iteration autonomous training loop (RTX 5090)}
 
226
  }
227
  ```
 
3
  - pt
4
  license: apache-2.0
5
  library_name: sentence-transformers
6
+ base_model: ibm-granite/granite-embedding-97m-multilingual-r2
7
  tags:
8
  - sentence-transformers
9
  - feature-extraction
 
54
 
55
  Fine-tuned from [`ibm-granite/granite-embedding-97m-multilingual-r2`](https://huggingface.co/ibm-granite/granite-embedding-97m-multilingual-r2) using a **Karpathy-style autoresearch loop**: 36 autonomous training iterations on an RTX 5090, each proposing and self-validating its own strategy. **35 out of 36 iterations improved the model (97% acceptance rate).**
56
 
57
+ Built at [**TAMZ**](https://tamz.ai), a Brazilian B2B sales intelligence platform that identifies, enriches, and delivers company leads ready for outreach. The training data comes directly from TAMZ's enrichment pipeline over 32M Brazilian companies from the Receita Federal.
58
+
59
  ---
60
 
61
  ## MTEB Benchmark Results (PT-BR)
 
189
  ## Best For
190
 
191
  ✅ Semantic search over Brazilian business data
192
+ ✅ B2B lead discovery and company matching (e.g. [TAMZ](https://tamz.ai))
193
  ✅ Company similarity, clustering, deduplication
194
  ✅ PT-BR RAG pipelines with business documents
195
  ✅ Memory systems for Portuguese AI agents
 
225
  publisher = {HuggingFace},
226
  url = {https://huggingface.co/calneymgp/braza-embedding-ptbr-v1},
227
  note = {Fine-tuned from IBM Granite 97M on 474K Brazilian B2B companies
228
+ using 36-iteration autonomous training loop (RTX 5090).
229
+ Built at TAMZ (https://tamz.ai)}
230
  }
231
  ```
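
The first hunk adds a `base_model` field to the README front matter. One way to sanity-check such a change locally is sketched below. This is a minimal, stdlib-only illustration, not how the Hub parses model cards: `read_front_matter_scalars` is a hypothetical helper that only understands flat `key: value` lines, and `README_HEAD` is an assumed reconstruction of the front matter from the lines visible in this diff (the real file has more fields, such as `model-index`).

```python
from typing import Dict

# Assumed sample: rebuilt from the diff's visible lines, not the full README.
README_HEAD = """\
---
language:
- pt
license: apache-2.0
library_name: sentence-transformers
base_model: ibm-granite/granite-embedding-97m-multilingual-r2
tags:
- sentence-transformers
- feature-extraction
---

# Model card body
"""


def read_front_matter_scalars(readme_text: str) -> Dict[str, str]:
    """Return top-level `key: value` pairs from a README's front matter.

    Hypothetical helper: list items, nested mappings, and keys without an
    inline value (e.g. `tags:`) are skipped rather than parsed.
    """
    lines = readme_text.splitlines()
    # Front matter must start on the very first line with a `---` marker.
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing marker ends the front matter
            break
        # Skip list items, indented (nested) lines, and lines without a key.
        if line.startswith((" ", "-")) or ":" not in line:
            continue
        key, _, value = line.partition(":")
        if value.strip():
            fields[key.strip()] = value.strip()
    return fields


fields = read_front_matter_scalars(README_HEAD)
print(fields.get("base_model"))
```

For anything beyond a quick check, a real YAML parser is the safer choice, since this sketch ignores quoting, multi-line values, and nested structures.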