Seongtae Hong

hongst

https://scholar.google.com/citations?user=6uU-QJAAAAAJ&hl=en

tate-hong-nlp

AI & ML interests

NLP

Recent Activity

upvoted a paper 1 day ago

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

authored a paper 2 days ago

Cross-Lingual Optimization for Language Transfer in Large Language Models

authored a paper 2 days ago

Semantic Aware Linear Transfer by Recycling Pre-trained Language Models for Cross-lingual Transfer

Organizations

upvoted a paper 1 day ago

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published 5 days ago • 67

upvoted a paper 5 days ago

DMax: Aggressive Parallel Decoding for dLLMs

Paper • 2604.08302 • Published 7 days ago • 49

upvoted 2 papers 7 days ago

Beyond Hard Negatives: The Importance of Score Distribution in Knowledge Distillation for Dense Retrieval

Paper • 2604.04734 • Published 10 days ago • 11

Improving Semantic Proximity in Information Retrieval through Cross-Lingual Alignment

Paper • 2604.05684 • Published 9 days ago • 9

upvoted a collection 2 months ago

ConTEB evaluation datasets

Collection

Evaluation datasets of the ConTEB benchmark. Use "test" split where available, otherwise "validation", otherwise "train". • 8 items • Updated Jun 2, 2025 • 3

upvoted a paper 2 months ago

Diffusion-Pretrained Dense and Contextual Embeddings

Paper • 2602.11151 • Published Feb 11 • 23

upvoted an article 4 months ago

Article

Nano-BEIR: A Multilingual Information Retrieval Benchmark with Quality-Enhanced Queries

Dec 22, 2025

upvoted a collection 4 months ago

🦢SWIM-IR Dataset [NAACL'24]

Collection

29 million Synthetic Wikipedia-based Multilingual Retrieval Training Pairs. • 4 items • Updated Mar 31, 2025 • 8

upvoted a collection about 1 year ago

Embedding Model Datasets

Collection

A curated subset of the datasets that work out of the box with Sentence Transformers: https://huggingface.co/datasets?other=sentence-transformers • 70 items • Updated Dec 10, 2025 • 165
