Mining Health-Related Issues in Consumer Product Reviews by Using Scalable Text Analytics
Author(s) - Manabu Torii, Sameer Tilak, Son Doan, Daniel S. Zisook, Jungwei Fan
Publication year - 2016
Publication title - Biomedical Informatics Insights
Language(s) - English
Resource type - Journals
ISSN - 1178-2226
DOI - 10.4137/bii.s37791
Subject(s) - data science, product (mathematics), analytics, computer science, sentiment analysis, scalability, social media analytics, classifier (uml), population health, population, world wide web, artificial intelligence, social media, medicine, mathematics, environmental health, database, geometry
In an era when most of our life activities are digitized and recorded, opportunities abound to gain insights about population health. Online product reviews are a unique data source that is currently underexplored. Health-related information, although scarce, can be systematically mined from online product reviews. Using natural language processing and machine learning tools, we mined 1.3 million grocery product reviews for health-related information. The objectives of the study were to (1) conduct quantitative and qualitative analyses of the types of health issues found in consumer product reviews; (2) develop a machine learning classifier to detect reviews that contain health-related issues; and (3) gain insights into the task characteristics and challenges of text analytics to guide future research.