
Encoded Archival Description: Data Quality and Analysis
Author(s) -
Francisco-Revilla Luis,
Trace Ciaran B.,
Li Haoyang,
Buchanan Sarah A.
Publication year - 2014
Publication title -
Proceedings of the American Society for Information Science and Technology
Language(s) - English
Resource type - Journals
eISSN - 1550-8390
pISSN - 0044-7870
DOI - 10.1002/meet.2014.14505101043
Subject(s) - metadata, computer science, visualization, data quality, reuse, quality (philosophy), set (abstract data type), world wide web, information retrieval, data visualization, metadata repository, data science, data mining, engineering, metric (unit), philosophy, operations management, epistemology, programming language, waste management
To authenticate the meaning of collections and preserve their evidentiary value, archivists create documents (finding aids) that describe the provenance and original order of the records (MacNeil, 1995). Metadata standards such as Encoded Archival Description (EAD) enable finding aids to be encoded, searched, and displayed online. However, recent research has begun to draw attention to problems with the quality of EAD finding aid data and metadata, and with the encoding practices by which finding aids are created. Because the next frontier in archival description involves reusing finding aid data for advanced information visualization techniques that support additional ways of engaging with collections, there is a pressing need for further study of data quality and how it might affect information visualization. This work analyzes a set of 8729 finding aids aggregated by the Texas Archival Repository Online (TARO) using VADA, a visual analytic tool for finding aids. The results reveal previously unidentified problems that have a significant impact on the ability to visualize this data. The paper explains how these problems relate both to EAD's design and to actual EAD encoding practices, and provides recommendations for improving the quality of finding aid data.