An Introduction to Data Cleaning Using Internet Search Data
Author(s) -
Greenwood-Nimmo Matthew,
Shields Kalvinder
Publication year - 2017
Publication title -
Australian Economic Review
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.308
H-Index - 29
eISSN - 1467-8462
pISSN - 0004-9018
DOI - 10.1111/1467-8462.12235
Subject(s) - documentation, outlier, the internet, computer science, transparency (behavior), data mining, data science, judgement, database, world wide web, computer security, artificial intelligence, programming language, political science, law
This article considers the issue of data cleaning. We use state‐level data on internet search activity in the United States to illustrate several common data cleaning tasks, including frequency conversion and data scaling as well as methods for handling sampling uncertainty and accommodating structural breaks and outliers. We emphasise that data cleaning relies on informed judgement and so it is important to maintain transparency through careful documentation of data cleaning procedures.
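As a rough illustration of three of the tasks named in the abstract (frequency conversion, data scaling and outlier handling), the following is a minimal Python sketch. It is not the authors' procedure: the function names, the choice of a monthly average, the rescaling to a peak of 100, and the median-absolute-deviation rule with a cut-off of 3.5 are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean, median


def weekly_to_monthly(weekly):
    """Frequency conversion: average weekly (date, value) pairs by calendar month.

    Assumption: a simple within-month mean is an acceptable aggregation rule.
    """
    buckets = defaultdict(list)
    for day, value in weekly:
        buckets[(day.year, day.month)].append(value)
    return {month: mean(values) for month, values in sorted(buckets.items())}


def rescale_to_peak(series, peak=100.0):
    """Data scaling: rescale so that the largest observation equals `peak`."""
    top = max(series.values())
    return {key: peak * value / top for key, value in series.items()}


def flag_outliers(series, threshold=3.5):
    """Outlier screening: return keys whose modified z-score exceeds `threshold`.

    The modified z-score uses the median and the median absolute deviation
    (MAD), which are less sensitive to extreme values than the mean and
    standard deviation. The threshold is a judgement call, not a fixed rule.
    """
    values = list(series.values())
    centre = median(values)
    mad = median([abs(v - centre) for v in values])
    if mad == 0:
        return []  # no spread to measure against; flag nothing
    return [
        key
        for key, value in series.items()
        if 0.6745 * abs(value - centre) / mad > threshold
    ]
```

In keeping with the abstract's point about transparency, observations returned by `flag_outliers` are only flagged, not removed. Whether to drop, adjust or keep them is a judgement that should be recorded alongside the cleaned data.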