Getting Tired of Massive Journal Usage Statistics: A Case Study on Engineering Journal Usage Analysis Using K-Means Clustering
Author(s) - Qianjin Zhang
Publication year - 2020
Publication title - 2020 ASEE Virtual Annual Conference Content Access Proceedings
Language(s) - English
Resource type - Conference proceedings
DOI - 10.18260/1-2--34706
Subject(s) - computer science, usage data, cluster analysis, analytics, data science, reuse, service (business), world wide web, information retrieval, engineering, artificial intelligence, economy, economics, waste management
In 2018-2019, rising information resource costs and flat collection budgets led the University of Iowa Libraries to undertake a large-scale journal cancellation. As part of the University Libraries system, the Engineering Library went through a difficult process of identifying journals with low usage and high cost, gathering feedback from our users, and finalizing a list for cancellation. Because such a situation may occur again, we see the importance of monitoring and evaluating collections continuously and proactively. However, it is challenging for engineering librarians who are responsible for both collection management and public service to review massive usage statistics on a regular basis. To tackle this challenge, we initiated a case study that measures engineering journal usage with an alternative approach. The dataset was extracted from a data analytics company’s journal usage statistics report prepared for the University Libraries. We decided to reuse data from their report because it saved us time in data consolidation. The dataset contained journal titles, subfields, and three key indicators: the number of publications per journal by authors at our institution, the number of citations to each journal made by our authors, and the number of downloads. Since downloads were only available for the most recent four years (2015 to 2018), we used the same period for the number of publications and the number of citations.
We segmented a total of 821 journal titles into four clusters using the K-Means clustering technique. The first cluster contained 38 titles with a high number of publications, citations and downloads; the second contained 142 titles with a low number of publications but a moderate number of citations and a high number of downloads; the third contained titles with a low number of publications and citations but a moderate number of downloads; and the fourth contained titles with a low number of publications, citations and downloads. In conclusion, our case study condensed massive journal usage statistics into four clusters of journal titles in a straightforward format. The clusters also gave us a comprehensive view of how engineering journals had been used by both authors and users at our institution over the most recent four years. Last but not least, this case study demonstrated the feasibility of applying data analytics in academic libraries.
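To make the clustering step concrete, the following is a minimal sketch of K-Means (Lloyd's algorithm) applied to the three indicators described above: publications, citations and downloads. It is an illustration only, not the study's actual implementation. The journal names and counts are invented, the z-score scaling is an assumption (the abstract does not say whether or how the indicators were scaled), and the function names `zscore_columns` and `kmeans` are hypothetical.

```python
import math
import random


def zscore_columns(rows):
    """Standardize each column to mean 0 and unit variance.

    Assumption: scaling is applied so that downloads, which are numerically
    much larger than publications, do not dominate the distance measure.
    """
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def kmeans(points, k, seed=0, max_iter=100):
    """Plain Lloyd's algorithm: assign points to the nearest centroid,
    recompute centroids as cluster means, and repeat until labels stop changing."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break  # converged: no point changed cluster
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Invented example data: (title, publications, citations, downloads), 2015-2018.
journals = [
    ("Journal A", 40, 900, 12000),
    ("Journal B", 35, 800, 11000),
    ("Journal C", 3, 300, 9000),
    ("Journal D", 2, 250, 9500),
    ("Journal E", 1, 20, 3000),
    ("Journal F", 2, 30, 3500),
    ("Journal G", 0, 5, 100),
    ("Journal H", 1, 2, 150),
]

k = 4  # four clusters, matching the number used in the case study
scaled = zscore_columns([row[1:] for row in journals])
labels, centroids = kmeans(scaled, k)

for (title, *_), label in zip(journals, labels):
    print(f"{title}: cluster {label}")
```

Cluster numbers produced this way are arbitrary identifiers. Describing a cluster as "high" or "low" on an indicator requires inspecting its centroid, and K-Means results can vary with the initial centroids, so in practice one would typically try several seeds or use an established library implementation.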