Open Access

Methods for Coding Tobacco-Related Twitter Data: A Systematic Review

Author(s) - Brianna A. Lienemann, Jennifer B. Unger, Tess Boley Cruz, KarHai Chu

Publication year - 2017

Publication title - Journal of Medical Internet Research

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.446

H-Index - 142

eISSN - 1438-8871

pISSN - 1439-4456

DOI - 10.2196/jmir.7022

Subject(s) - coding (social sciences), computer science, social media, data science, world wide web, internet privacy, statistics, mathematics

Background: As Twitter has grown in popularity to 313 million monthly active users, researchers have increasingly been using it as a data source for tobacco-related research.

Objective: The objective of this systematic review was to assess the methodological approaches of categorically coded tobacco Twitter data and make recommendations for future studies.

Methods: Data sources included PsycINFO, Web of Science, PubMed, ABI/INFORM, Communication Source, and Tobacco Regulatory Science. Searches were limited to peer-reviewed journals and conference proceedings in English from January 2006 to July 2016. The initial search identified 274 articles using a Twitter keyword and a tobacco keyword. One coder reviewed all abstracts and identified 27 articles that met the following inclusion criteria: (1) original research, (2) focused on tobacco or a tobacco product, (3) analyzed Twitter data, and (4) coded Twitter data categorically. One coder extracted data collection and coding methods.

Results: E-cigarettes were the most common type of Twitter data analyzed, followed by specific tobacco campaigns. The most prevalent data sources were Gnip and Twitter’s Streaming application programming interface (API). The primary methods of coding were hand-coding and machine learning. The studies predominantly coded for relevance, sentiment, theme, user or account, and location of user.

Conclusions: Standards for data collection and coding should be developed to be able to more easily compare and replicate tobacco-related Twitter results. Additional recommendations include the following: sample Twitter’s databases multiple times, make a distinction between message attitude and emotional tone for sentiment, code images and URLs, and analyze user profiles. Being relatively novel and widely used among adolescents and black and Hispanic individuals, Twitter could provide a rich source of tobacco surveillance data among vulnerable populations.
