Interoperability Between Institutional and Data Repositories: A Pilot Project at MIT
Author(s) - Katherine McNeill
Publication year - 2009
Publication title - IASSIST Quarterly
Language(s) - English
Resource type - Journals
eISSN - 2331-4141
pISSN - 0739-1137
DOI - 10.29173/iq448
Subject(s) - interoperability , computer science , software engineering , database , engineering management , world wide web , engineering
Academic libraries are working in new areas to support the publishing activities of their institution’s faculty members, including helping them to manage and archive the research data that they produce. Many institutions, such as the Massachusetts Institute of Technology, have multiple locations in which faculty can deposit their data. Yet this distributed arrangement presents challenges for searching, unifying collections, and archiving. To foster interoperability between these data repositories, the MIT Libraries developed a prototype system to share studies between two such systems, DSpace and the Institute for Quantitative Social Science Dataverse Network, by enabling the harvesting and replication of metadata and content across the two systems. This paper discusses the motivation for this project, details and challenges of the system, and future goals for enhancing interoperability between the two systems.

Literature Review

Many academic library systems, such as the one at the Massachusetts Institute of Technology (MIT), have been developing more services in recent years to support the publishing activities of their faculty. A main area of work is developing institutional repositories (IRs) for housing and disseminating the digital research materials produced by an institution. Academic librarians play a key role in promoting and facilitating the use of IRs (Bailey 2005). These activities provide new opportunities for librarians to become partners in publishing with their faculty, which can enrich their relationships and increase the library’s relevance (Buehler and Boateng 2005; Bell, Foster, and Gibbons 2005). However, many IRs are experiencing low rates of faculty contribution (McDowell 2007). To enhance participation, many librarians are working to evaluate the utility of their IR from their faculty’s perspective.
Some institutions have undertaken projects to study faculty work practices in order to design the repository system that best meets faculty needs. One such project discovered that faculty members must be able to personalize their presence in the IR in order for it to provide them with significant value (Foster and Gibbons 2005).

A recent study indicates that datasets comprise only a very small percentage of items in IRs (McDowell 2007). In this context, many librarians assist faculty members in publishing their datasets, whether in their IR, a domain-specific data repository, or another location. For example, the Purdue University library has established the Distributed Data Curation Center (D2C2) to support the curation and archiving of faculty-produced data. Success in this work requires an understanding of the needs of individual faculty members in order to recommend to them an appropriate system for managing and archiving their data (Witt and Carlson 2007). Moreover, a viable data archiving system must be of tangible benefit to the depositor, not just the secondary data user. One study argues that a requirement for citation of datasets by secondary users would be the best incentive for faculty to prepare their data appropriately for deposit in a data archive (Niu 2006). For several years, members of the social science data community have been promoting the need for standards for citing data. Some have developed specific standards recommendations designed to interoperate with data repository systems (Altman and King 2007). All these studies shed light on how to design data repositories in alignment with the needs of faculty and researchers.

A range of different kinds of digital repositories exists: “individual, discipline-based, institutional, consortial, and national” (Peters 2002).
Given this landscape, there often are multiple locations where an individual faculty member can publish and archive data, each of which may have its own approach to and policies regarding archiving and management. These variations in service may make one repository more appealing to a faculty member than another, and thus affect the choices she must make (and, in turn, the availability of her research data). How might two kinds of repositories, IRs and domain-specific data repositories, come together? Green and Gutmann envision a collaborative system whereby the IR facilitates communication and exchange of data between the researcher and the domain repository (Green and Gutmann 2007). The ability of different repositories to exchange metadata and content would provide an important service, enabling faculty data to be housed and discovered in more than one system.