Open Access
Overview of Long-form Document Matching: Survey of Existing Models and Their Challenges
Author(s) - Yaokai Cheng, Ruoyu Chen, Xiaoguang Yuan, Yuting Yang, Shan Jiang, Bo Yang
Publication year - 2022
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/2171/1/012059
Subject(s) - matching (statistics), computer science, information retrieval, document clustering, cluster analysis, field (mathematics), natural language processing, artificial intelligence, mathematics, statistics, pure mathematics
Long-form document matching is an important research direction in natural language processing, with applications in tasks such as news recommendation and text clustering. However, it is hampered by the noise and sparse semantic information typical of long texts, so applying short-form document matching methods to long-form problems gives unsatisfactory results. The problem has attracted the attention of researchers, who have proposed many effective methods. These methods fall into three categories: traditional bag-of-words-based models, traditional deep learning-based models, and pre-training-based models. This study reviews typical methods of long-form document matching, analyzes their advantages and disadvantages, and discusses possible future developments.
