Open Access
Perspectives on Peer Review of Data: Framing Standards and Questions
Author(s) - Morten Wendelbo
Publication year - 2017
Publication title - College & Research Libraries
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.886
H-Index - 52
eISSN - 2150-6701
pISSN - 0010-0870
DOI - 10.5860/crl.78.3.262
Subject(s) - framing (construction) , computer science , data science , information retrieval , history , archaeology
Peer review serves two major purposes: verification and collaboration. A peer reviewer’s role is to ensure, to a reasonable degree, that what journals publish is objective, correct, and relevant. Further, a peer reviewer gives authors feedback on their writing, methods, and presentation. Both aspects play their part in ensuring the quality of the research published, and so, by extension, peer review helps move the research in a field forward. However, when it comes to data, peer review is often inconsistent or absent altogether, and that hurts every field where data is not subject to the same rigorous standards as the manuscript itself.

Data is a powerful tool in research and academic publications. Its use is widespread in most disciplines, not least in Library and Information Sciences; the tangible nature of data adds to its strength, allowing it to be applied to test a hypothesis. Not only can data help show a change over time, but it can quantify the extent of that change. By extending data-measured trends, it can also be used to predict future events. Data is also powerful because it is difficult to dispute. Readers and peer reviewers rarely have the access and resources necessary to verify a manuscript’s results, much less to review the integrity of the data itself.

Thus far, peer review of a manuscript has often functioned as a proxy for the quality of the underlying data: a well-written, well-reasoned, and well-supported manuscript is likely to be based on qualified data and established data practices. In other words, a good manuscript acts as a “signal” that the underlying data and practices are reliable. However, with the increasing influence of data, it is pertinent to ask whether that assumption is adequate. A few examples will illustrate why it is not. It is a trite observation that statistics can be whatever you desire them to be.
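To make the trend-extrapolation point above concrete, here is a minimal sketch with invented numbers: a simple least-squares line fitted to four yearly counts, then extended one year forward. The data, variable names, and the choice of ordinary least squares are illustrative assumptions, not anything from the article.

```python
# Hypothetical yearly counts, invented purely for illustration.
years = [2013, 2014, 2015, 2016]
counts = [100, 110, 121, 130]

# Ordinary least-squares fit of a straight line counts ≈ intercept + slope * year.
n = len(years)
mean_x = sum(years) / n
mean_y = sum(counts) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    / sum((x - mean_x) ** 2 for x in years)
)
intercept = mean_y - slope * mean_x

# Extending the measured trend beyond the data yields a prediction.
prediction_2017 = intercept + slope * 2017
print(round(prediction_2017, 1))  # the fitted line extended one year ahead
```

This is exactly the kind of step a data-literate reviewer would scrutinize: the prediction is only as good as the assumption that the linear trend continues.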
By displaying only certain isolated summary statistics of a larger set of data, nearly any dataset can be made to tell nearly any story. In a similar vein, by asking questions in certain ways, or by measuring variables in a dataset in a particular fashion, data can be made to fit a hypothesis rather than to objectively measure a phenomenon. Without access to the underlying dataset and data work, there are very few ways to tell whether a dataset is sufficiently objective to warrant the findings a manuscript claims.

Thankfully, academic dishonesty of that kind is, to my knowledge, quite rare. Instead, data shortcomings often happen because data users lack the skills to understand the limits of data, and the limits of the models the data at hand can support. Data literacy, a subject I teach, has not kept up with the expansion of data use in academia, or in business for that matter. Peer review of data serves a dual purpose here: it prevents publication of results that the data do not in reality support, and it acts as a screen for those readers who do not have the requisite skills to evaluate data and methods themselves. Bad data practices can have severe, even fatal, consequences depending on the discipline. One of the worst examples is the work behind the now infamous and
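The claim that isolated summary statistics can tell nearly any story can be sketched in a few lines. The two datasets below are made up for this illustration (in the spirit of Anscombe's quartet): they share the same mean, so reporting the mean alone makes a stable process and a wildly swinging one look identical.

```python
import statistics

# Invented data: both lists have a mean of exactly 50.
steady = [48, 49, 50, 50, 51, 52]    # a stable, tightly clustered process
volatile = [10, 90, 10, 90, 10, 90]  # a process oscillating between extremes

# The isolated summary statistic hides the difference entirely...
print(statistics.mean(steady), statistics.mean(volatile))  # 50 50

# ...while even one more statistic, the spread, reveals it.
print(statistics.pstdev(steady))    # small: the values barely move
print(statistics.pstdev(volatile))  # large: the values swing by 80
```

A reviewer with access to the underlying data can check the full distribution; a reviewer handed only the mean cannot.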
