- nomic-ai/nomic-embed-text-v1
  Sentence Similarity • 0.1B • Updated • 4.79M • 562
- nomic-ai/nomic-embed-text-v1.5
  Sentence Similarity • 0.1B • Updated • 11.9M • 799
- nomic-ai/nomic-embed-text-v1-unsupervised
  Sentence Similarity • Updated • 808 • 15
- nomic-ai/nomic-embed-text-v1-ablated
  Sentence Similarity • Updated • 155 • 4
Collections
Collections including paper arxiv:2402.01613
- Text and Code Embeddings by Contrastive Pre-Training
  Paper • 2201.10005 • Published
- Towards General Text Embeddings with Multi-stage Contrastive Learning
  Paper • 2308.03281 • Published • 3
- Nomic Embed: Training a Reproducible Long Context Text Embedder
  Paper • 2402.01613 • Published • 17
- Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training
  Paper • 2405.06932 • Published • 20

- TinyLlama: An Open-Source Small Language Model
  Paper • 2401.02385 • Published • 95
- MM-LLMs: Recent Advances in MultiModal Large Language Models
  Paper • 2401.13601 • Published • 47
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 73
- Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
  Paper • 2401.16380 • Published • 53

- Towards General Text Embeddings with Multi-stage Contrastive Learning
  Paper • 2308.03281 • Published • 3
- NEFTune: Noisy Embeddings Improve Instruction Finetuning
  Paper • 2310.05914 • Published • 14
- EELBERT: Tiny Models through Dynamic Embeddings
  Paper • 2310.20144 • Published • 3
- Dynamic Word Embeddings for Evolving Semantic Discovery
  Paper • 1703.00607 • Published • 1

- EmbeddingGemma: Powerful and Lightweight Text Representations
  Paper • 2509.20354 • Published • 48
- CoDiEmb: A Collaborative yet Distinct Framework for Unified Representation Learning in Information Retrieval and Semantic Textual Similarity
  Paper • 2508.11442 • Published • 4
- Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models
  Paper • 2506.05176 • Published • 81
- BGE M3-Embedding: Multi-Lingual, Multi-Functionality, Multi-Granularity Text Embeddings Through Self-Knowledge Distillation
  Paper • 2402.03216 • Published • 7

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 153
- ReFT: Reasoning with Reinforced Fine-Tuning
  Paper • 2401.08967 • Published • 32
- Tuning Language Models by Proxy
  Paper • 2401.08565 • Published • 22
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69

- Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
  Paper • 2310.05737 • Published • 6
- SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models
  Paper • 2308.16692 • Published • 1
- Towards General Text Embeddings with Multi-stage Contrastive Learning
  Paper • 2308.03281 • Published • 3
- ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
  Paper • 2305.11554 • Published • 2