Building a parsimonious model for identifying best answers using interaction history in community Q&A
Author(s) - Shah Chirag
Publication year - 2015
Publication title - Proceedings of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.193
H-Index - 14
ISSN - 2373-9231
DOI - 10.1002/pra2.2015.145052010051
Subject(s) - computer science, classifier (uml), machine learning, artificial intelligence, feature (linguistics), set (abstract data type), data mining, philosophy, linguistics, programming language
Evaluating answer quality, or predicting which answer will be selected as the best for a given question, is an important problem in community‐based Q&A services. In this article we introduce new interaction‐based features that capture the number of distinct interactions between an asker and an answerer over time, in order to predict whether an answer will be selected as Best Answer within Yahoo! Answers. Through a series of experiments run on a data set of 23,218 question‐answer pairs, we determined that when the data were first classified by a model trained on textual features, and the failed cases were then re‐run through a model trained on interaction features, the performance of the original model in identifying these difficult cases improved significantly. In addition, compared to models that often use five to seven times as many features and require a large amount of computational effort, our model performed at or above the same level on the same evaluative measures. This suggests that future classification models can be made more parsimonious, and can handle larger datasets with less computational effort, by adopting a two‐step classifier that includes interaction history as a feature.
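The two‐step idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how "failed cases" are detected or which learners were used, so everything below is an assumption made for illustration: the logistic scorers, their weights, the feature meanings, the function name `two_step_predict`, and the rule that a case is handed to the second model when the first model's probability falls within `margin` of 0.5.

```python
import math
from typing import Callable, Sequence, Tuple

# A scorer maps a feature vector to an estimated P(answer is Best Answer).
Scorer = Callable[[Sequence[float]], float]


def logistic(weights: Sequence[float], bias: float) -> Scorer:
    """Build a toy logistic scorer (hypothetical stand-in for a trained model)."""
    def score(x: Sequence[float]) -> float:
        z = bias + sum(w * v for w, v in zip(weights, x))
        return 1.0 / (1.0 + math.exp(-z))
    return score


def two_step_predict(
    text_feats: Sequence[float],
    inter_feats: Sequence[float],
    text_model: Scorer,
    inter_model: Scorer,
    margin: float = 0.2,
) -> Tuple[bool, str]:
    """Return (predicted_best_answer, stage_that_decided).

    Assumption: a case counts as "difficult" when the text model's
    probability lies within `margin` of 0.5; only those cases are
    passed to the interaction-history model.
    """
    p = text_model(text_feats)
    if abs(p - 0.5) >= margin:
        return p >= 0.5, "text"
    q = inter_model(inter_feats)
    return q >= 0.5, "interaction"


# Hypothetical models: two textual features, one interaction-count feature.
text_model = logistic([0.8, -0.5], 0.0)
inter_model = logistic([1.2], -1.0)

confident = two_step_predict([3.0, 0.0], [0.0], text_model, inter_model)
rescued = two_step_predict([0.1, 0.0], [3.0], text_model, inter_model)
rejected = two_step_predict([0.0, 0.2], [0.0], text_model, inter_model)
```

In this sketch the second model is consulted only for the uncertain cases, so the small interaction feature set adds little cost for the cases the text model already handles with confidence.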