Few-shot Pseudo-Labeling for Intent Detection

Abstract
In this paper, we introduce a state-of-the-art pseudo-labeling technique for few-shot intent detection. We devise a folding/unfolding hierarchical clustering algorithm that assigns weighted pseudo-labels to unlabeled user utterances. We show that our two-step method yields significant improvements over existing solutions. This performance is achieved on multiple intent detection datasets, even in more challenging situations where the number of classes is large or the dataset is highly imbalanced. Moreover, we confirm these results on the more general text classification task. We also demonstrate that our approach complements existing solutions, thereby providing an even stronger state-of-the-art ensemble method.