arxiv:2202.13469

UCTopic: Unsupervised Contrastive Learning for Phrase Representations and Topic Mining

Published on Feb 27, 2022

Abstract

UCTopic, an unsupervised contrastive learning framework, enhances context-aware phrase representations through cluster-assisted contrastive learning, improving topic mining efficiency and coherence.

AI-generated summary

High-quality phrase representations are essential to finding topics and related terms in documents (a.k.a. topic mining). Existing phrase representation learning methods either simply combine unigram representations in a context-free manner or rely on extensive annotations to learn context-aware knowledge. In this paper, we propose UCTopic, a novel unsupervised contrastive learning framework for context-aware phrase representations and topic mining. UCTopic is pretrained at large scale to distinguish whether the contexts of two phrase mentions have the same semantics. The key to pretraining is positive pair construction from our phrase-oriented assumptions. However, we find that traditional in-batch negatives cause performance decay when finetuning on a dataset with a small number of topics. Hence, we propose cluster-assisted contrastive learning (CCL), which largely reduces noisy negatives by selecting negatives from clusters and further improves phrase representations for topics accordingly. UCTopic outperforms the state-of-the-art phrase representation model by 38.2% NMI on average on four entity clustering tasks. Comprehensive evaluation on topic mining shows that UCTopic can extract coherent and diverse topical phrases.
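
The contrast between in-batch negatives and cluster-selected negatives can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `cluster_assisted_loss`, the InfoNCE-style loss form, the temperature value, and the assumption that cluster assignments are already available (for example from a prior clustering step) are all illustrative choices made here. The only idea taken from the abstract is that negatives are restricted using clusters, so that phrases likely to share a topic are not pushed apart.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def cluster_assisted_loss(anchors, positives, cluster_ids, temperature=0.05):
    """InfoNCE-style loss with cluster-restricted negatives (illustrative only).

    anchors, positives: (n, d) arrays; row i of `positives` is the positive
    for row i of `anchors`.
    cluster_ids: (n,) integer array of hypothetical cluster assignments.
    Anchor j is used as a negative for anchor i only when the two rows have
    different cluster ids.
    """
    a = l2_normalize(anchors)
    p = l2_normalize(positives)

    pos_logits = np.sum(a * p, axis=1) / temperature  # shape (n,)
    neg_logits = (a @ a.T) / temperature              # shape (n, n)

    # neg_mask[i, j] is True when j is an allowed negative for i.
    neg_mask = cluster_ids[:, None] != cluster_ids[None, :]
    neg_logits = np.where(neg_mask, neg_logits, -np.inf)

    # Column 0 holds the positive; masked entries contribute exp(-inf) = 0.
    logits = np.concatenate([pos_logits[:, None], neg_logits], axis=1)
    row_max = logits.max(axis=1, keepdims=True)
    log_denom = row_max[:, 0] + np.log(np.exp(logits - row_max).sum(axis=1))
    return float(np.mean(log_denom - pos_logits))


# Toy data: rows 0-1 and rows 2-3 are near-duplicates of each other.
phrases = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
views = phrases + 0.01  # stand-in for a second view of each phrase

# Giving every row its own cluster id reproduces plain in-batch negatives.
in_batch = cluster_assisted_loss(phrases, views, np.arange(4))
# Grouping near-duplicates removes them from each other's negative set.
clustered = cluster_assisted_loss(phrases, views, np.array([0, 0, 1, 1]))
print(in_batch, clustered)
```

In this sketch the in-batch setting is the special case where every example sits in its own cluster. Because the cluster-restricted negative set is a subset of the in-batch one, the near-duplicate rows no longer penalize each other, which is the "noisy negative" effect the abstract attributes to datasets with few topics.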


