Learning to cite framework: How to automatically construct citations for hierarchical data
Author(s) - Gianmaria Silvello
Publication year - 2017
Publication title - Journal of the Association for Information Science and Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.903
H-Index - 145
eISSN - 2330-1643
pISSN - 2330-1635
DOI - 10.1002/asi.23774
Subject(s) - computer science, citation, xml, correctness, construct (python library), information retrieval, data science, context (archaeology), set (abstract data type), data mining, world wide web, algorithm, programming language, paleontology, biology
The practice of citation is foundational for the propagation of knowledge and for scientific development, and it is one of the core aspects on which scholarship and scientific publishing rely. Within the broad context of data citation, we focus on the problem of automatically constructing citations for hierarchically structured data. We present the “learning to cite” framework, which enables the automatic construction of human‐ and machine‐readable citations with different levels of coarseness. The main goal is to reduce human intervention on data to a minimum and to provide a citation system general enough to work on heterogeneous and complex XML data sets. We describe how this framework can be realized by a system for creating citations to single nodes within an XML data set and, as a use case, show how it can be applied in the context of digital archives. We conduct an extensive evaluation of the proposed citation system by analyzing its effectiveness in terms of correctness and completeness, showing that it is a suitable solution that can be easily employed in real‐world environments and that keeps human intervention on data to a minimum.