Dynamic data science and official statistics
Author(s) - Thompson, Mary E.
Publication year - 2018
Publication title - Canadian Journal of Statistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.804
H-Index - 51
eISSN - 1708-945X
pISSN - 0319-5724
DOI - 10.1002/cjs.11322
Subject(s) - data science , statistical inference , inference , data quality , computer science , variety (cybernetics) , sampling frame , official statistics , visualization , population , data mining , econometrics , statistics , artificial intelligence , mathematics , engineering , sociology , metric (unit) , operations management , demography
Many of the challenges and opportunities of data science have to do with dynamic factors: a growing volume of administrative and commercial data on individuals and establishments, continuous flows of data and the capacity to analyze and summarize them in real time, and the necessity for resources to maintain them. With its emphasis on data quality and supportable results, the practice of Official Statistics faces a variety of statistical and data science issues. This article discusses the importance of population frames and their maintenance; the potential for use of multi‐frame methods and linkages; how the use of large scale non‐survey data may shape the objects of inference; the complexity of models for large data sets; the importance of recursive methods and regularization; and the benefits of sophisticated spatial visualization tools in capturing spatial variation and temporal change.
The Canadian Journal of Statistics 46: 10–23; 2018 © 2017 Statistical Society of Canada