Sanitizing Signals in Scholarship and Mass Media: Integrity Informatics I
Author(s) -
Mostafa Javed
Publication year - 2017
Publication title -
Journal of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.903
H-Index - 145
eISSN - 2330-1643
pISSN - 2330-1635
DOI - 10.1002/asi.23839
Subject(s) - scholarship, citation, computer science, informatics, information retrieval, library science, political science, law
Imagine an event lasting less than a fraction of a second that occurred 1.3 billion years ago. The event produced a gravitational wave signal so faint that detecting it required several decades of research, about 1,000 scientists, and hundreds of millions of dollars to build an instrument called the Laser Interferometer Gravitational Wave Observatory, or LIGO (Castelvecchi, ). Now contrast these extremely subtle gravitational wave signals with the false statements made by political candidates and supporting organizations in the recent political elections held in the United States. One would assume that checking or refuting the latter statements is rather trivial and is carried out routinely. Unfortunately, such statements are not only unchecked and unchallenged, but they gain validation through repeated references in social media and in news content produced by mass media outlets.

The contrast between the data analytics capabilities that made the recent LIGO success possible (at least partially) and the use and exploitation of computing platforms for disseminating falsehoods with relative ease is ironic, to say the least. But it also points to a critical gap. Generally, data analytics has been applied in domains that are highly technical and extremely narrow in scope. The primary use cases for promoting data analytics are typically drawn from scholarly and business fields, while current and potential applications of data analytics in the media arts, communication, journalism, and publishing draw far less attention.

The application of data analytics even in scholarly contexts deserves a closer look and probably should be broadened. A particular area of concern is the peer review enterprise, as significant cracks are appearing in the established systems for ensuring and preserving integrity. Consider, for example, the rise in scientific fraud and the number of retractions being issued by major scientific forums.
The same computing power that in many cases makes novel scientific investigations possible is also being used to produce charts, graphs, and tables containing fraudulent information so intricately subtle that it is nearly impossible to detect. The increasing sophistication of data deceit, combined with the tremendous pressure that readers, reviewers, and editors face from the sheer volume of scholarly production, makes the problem a major challenge. As the modes and modalities of scholarly communication become even more complex, the task of preserving scholarly integrity will only grow more difficult.

The area of scholarly communication, which sits at the intersection of the publishing and communication fields, deserves much closer attention from information scholars in terms of developing new techniques, tools, and methods for detecting fraudulent and unacceptable scholarly practices. Techniques from literature mining, text analytics, stylometry, and cryptography need closer scrutiny as to their potential application or adaptation to the problems of establishing scholarly authenticity, originality, provenance, and accuracy. Consider the problem of validating that an author is the actual source of a manuscript. A famous example is the thread of research examining the authenticity of Shakespearean documents. Recently, based on past research and new evidence accumulated from text analysis, the Oxford edition of the Shakespeare collection recognized Christopher Marlowe as a co-author of the three Henry VI plays (Shea, ; Shakespeare, ). Focusing only on