
Slang feature extraction by analysing topic change on social media
Author(s) - Matsumoto Kazuyuki, Ren Fuji, Matsuoka Masaya, Yoshida Minoru, Kita Kenji
Publication year - 2019
Publication title - CAAI Transactions on Intelligence Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.613
H-Index - 15
ISSN - 2468-2322
DOI - 10.1049/trit.2018.1060
Subject(s) - slang , neologism , computer science , latent dirichlet allocation , construct (python library) , topic model , natural language processing , feature (linguistics) , artificial intelligence , social media , information retrieval , linguistics , world wide web , philosophy , programming language
Words such as youth slang, neologisms and Internet slang that are not registered in dictionaries now appear frequently on social networking sites (SNSs). Because documents posted to SNSs contain a lot of fresh information, they are considered useful for information collection. Analysing these words (hereinafter referred to as ‘slang’) and capturing their features is important for improving the accuracy of automatic information collection. This study analyses which features can be observed in slang by focusing on topics. The authors construct topic models by latent Dirichlet allocation from groups of Twitter documents that include the target slang. With these models, they chronologically analyse the change of topics over a certain period of time to find the differences in features between slang and general words. They then propose a slang classification method based on the change of features.
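
The abstract does not state how the change of topics is measured, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that a topic model (for example latent Dirichlet allocation) has already produced one topic distribution per time period for the documents containing a target word, and it uses the Jensen-Shannon divergence between consecutive periods as a hypothetical topic-change feature. The function names and the example distributions are illustrative.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def topic_change_series(period_topics):
    """Divergence between each pair of consecutive time periods."""
    return [js_divergence(a, b) for a, b in zip(period_topics, period_topics[1:])]


def mean_topic_change(period_topics):
    """Average consecutive-period divergence (assumed feature, not from the paper)."""
    series = topic_change_series(period_topics)
    return sum(series) / len(series) if series else 0.0


# Hypothetical per-period topic mixtures (each row sums to 1); not real data.
stable_word = [[0.7, 0.2, 0.1], [0.7, 0.2, 0.1], [0.7, 0.2, 0.1]]
shifting_word = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(mean_topic_change(stable_word))
print(mean_topic_change(shifting_word))
```

Under this assumed measure, a word whose topic mixture stays the same across periods scores 0, while a word whose topics shift completely from one period to the next scores close to 1. Such a score could serve as one input to a classifier that separates slang from general words, but whether slang shows higher or lower topic change is a question for the study itself.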