Collections
Collections including paper arxiv:2308.03281
-
Text and Code Embeddings by Contrastive Pre-Training
Paper • 2201.10005 • Published
Towards General Text Embeddings with Multi-stage Contrastive Learning
Paper • 2308.03281 • Published • 3
Nomic Embed: Training a Reproducible Long Context Text Embedder
Paper • 2402.01613 • Published • 17
Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training
Paper • 2405.06932 • Published • 20
-
Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation
Paper • 2310.05737 • Published • 6
SpeechTokenizer: Unified Speech Tokenizer for Speech Large Language Models
Paper • 2308.16692 • Published • 1
Towards General Text Embeddings with Multi-stage Contrastive Learning
Paper • 2308.03281 • Published • 3
ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings
Paper • 2305.11554 • Published • 2
-
Alibaba-NLP/gte-Qwen2-7B-instruct
Sentence Similarity • 8B • Updated • 355k • 479
Alibaba-NLP/gte-Qwen2-1.5B-instruct
Sentence Similarity • 2B • Updated • 337k • 229
Alibaba-NLP/gte-multilingual-base
Sentence Similarity • 0.3B • Updated • 2.18M • 357
Alibaba-NLP/gte-multilingual-reranker-base
Text Ranking • 0.3B • Updated • 104k • 176
-
Towards General Text Embeddings with Multi-stage Contrastive Learning
Paper • 2308.03281 • Published • 3
NEFTune: Noisy Embeddings Improve Instruction Finetuning
Paper • 2310.05914 • Published • 14
EELBERT: Tiny Models through Dynamic Embeddings
Paper • 2310.20144 • Published • 3
Dynamic Word Embeddings for Evolving Semantic Discovery
Paper • 1703.00607 • Published • 1