Veracity Roadmap: Is Big Data Objective, Truthful and Credible?
Author(s) - Tatiana Lukoianova, Victoria L. Rubin
Publication year - 2014
Publication title - Advances in Classification Research Online
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.155
H-Index - 7
ISSN - 2324-9773
DOI - 10.7152/acro.v24i1.14671
Subject(s) - big data, computer science, data science, operationalization, objectivity (philosophy), credibility, data quality, variety (cybernetics), quality (philosophy), data mining, artificial intelligence, metric (unit), philosophy, operations management, epistemology, political science, law, economics
This paper argues that big data can possess different characteristics that affect its quality. Depending on its origin, the data processing technologies applied, and the methodologies used for data collection and scientific discovery, big data can contain biases, ambiguities, and inaccuracies. These need to be identified and accounted for in order to reduce inference errors and improve the accuracy of generated insights. Big data veracity is now being recognized as a necessary property for its utilization, complementing the three previously established quality dimensions (volume, variety, and velocity), but there has been little discussion of the concept of veracity thus far. This paper provides a roadmap for theoretical and empirical definitions of veracity, along with its practical implications. We explore veracity across three main dimensions: 1) objectivity/subjectivity, 2) truthfulness/deception, and 3) credibility/implausibility. We propose to operationalize each of these dimensions with either existing or potential computational tools, particularly those relevant to textual data analytics. We combine the measures of the veracity dimensions into one composite index, the big data veracity index. This newly developed index provides a useful way of assessing systematic variations in big data quality across datasets with textual information. The paper contributes to big data research by categorizing the range of existing tools for measuring the suggested dimensions, and to Library and Information Science (LIS) by proposing to account for the heterogeneity of diverse big data and to identify the information quality dimensions important for each big data type.
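To make the idea of a composite index concrete, the sketch below combines three per-record dimension scores into a single value and then averages over a dataset. The abstract does not state how the index is computed, so everything here is an assumption for illustration: the names `VeracityScores`, `veracity_index`, and `dataset_veracity_index` are hypothetical, each dimension is assumed to be already scored in [0, 1] by some upstream tool, and the unweighted mean stands in for whatever aggregation the paper actually uses.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Iterable


@dataclass(frozen=True)
class VeracityScores:
    """Hypothetical per-record scores, each assumed to lie in [0, 1].

    1.0 means fully objective / truthful / credible; 0.0 means fully
    subjective / deceptive / implausible.
    """
    objectivity: float
    truthfulness: float
    credibility: float

    def __post_init__(self) -> None:
        # Reject scores outside the assumed [0, 1] range.
        for name, value in vars(self).items():
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")


def veracity_index(scores: VeracityScores) -> float:
    """Combine the three dimensions with an unweighted mean (an assumption)."""
    return mean([scores.objectivity, scores.truthfulness, scores.credibility])


def dataset_veracity_index(records: Iterable[VeracityScores]) -> float:
    """Average the per-record index over a dataset of textual records."""
    return mean(veracity_index(r) for r in records)


if __name__ == "__main__":
    sample = [
        VeracityScores(objectivity=0.9, truthfulness=0.6, credibility=0.3),
        VeracityScores(objectivity=0.5, truthfulness=0.5, credibility=0.5),
    ]
    print(f"dataset veracity index: {dataset_veracity_index(sample):.2f}")
```

A weighted mean, or a scheme that penalizes a very low score on any single dimension, would be an equally plausible reading; the point is only that comparable per-dimension scores allow datasets to be ranked on one scale.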
