2013 ALA Annual Intellectual Access to Preservation Metadata Interest Group (IAPM-IG) Meeting Minutes
The Intellectual Access to Preservation Metadata Interest Group met June 29 from 3 to 4 pm at the McCormick Place Convention Center, room N135, Chicago, IL. Seventy-six people attended.
Business meeting announcements:
--Outgoing co-chair Shawn Averkamp welcomed incoming co-chair Chelcie Rowell. In August 2013, Chelcie Rowell will graduate from the School of Information and Library Science at UNC-Chapel Hill and will begin as the Digital Initiatives Librarian at the Z. Smith Reynolds Library of Wake Forest University. Chelcie will begin her two-year term with returning co-chair Sarah Potvin.
--Shawn made a call for future meeting topics. Suggestions can be submitted to the co-chairs or through ALAConnect.
The business meeting was followed by two presentations on preservation metadata in repositories:
PREMIS: to Be or Not To Be in My METS
Jennifer Eustis (Catalog/Metadata Librarian, University of Connecticut Libraries) and David Lowe (Preservation and Data Management Services Librarian, University of Connecticut Libraries) discussed the University of Connecticut Libraries’ process of selecting and implementing a Fedora repository and the issues they faced in integrating preservation metadata as they worked toward TRAC compliance. In 2011, UConn created a working group to investigate alternatives to current repositories that would incorporate a more consistent preservation mission. After selecting Fedora, the group set out to develop a content model and design their METS profile. They determined the minimum metadata requirements (an “Uberset”) and assigned elements from this set to the appropriate content level: grouping, container, or media object. This “atomistic” content model enabled metadata to be split across the three levels. The repository currently supports the recording of ingestion events in PREMIS.
During the process of integrating METS and PREMIS for their Fedora repository, the group encountered a number of issues, including incompatibility with Islandora (the chosen administrative module and presentation layer), difficulty retrieving consistent technical metadata from Archivematica, and problems getting PREMIS into the METS data stream. To remedy these issues, they decided to build their own administrative module (reserving Islandora for the presentation layer only) and to move away from the METS Uberset toward a more modular solution for packaging and reusing preservation metadata. Next steps will include refining specifications for their metadata modules, determining how to handle PREMIS beyond ingest events, and exploring the incorporation of linked open data.
The Purdue University Research Repository: HUBzero Customization For Dataset Publication And Digital Preservation
Amy Barton (Metadata Specialist and Assistant Professor of Library Science, Purdue University Library) and Carly Dearborn (Digital Preservation and Electronic Records Archivist, Purdue University Library) presented on work done with Neal Harmeyer (Digital Archivist, Purdue University Library). Barton and Dearborn provided an overview of Purdue University’s research repository, PURR, and the metadata they collect and generate to support the long-term preservation of research datasets.
PURR is an instance of HUBzero, an open source LAMP-based platform built on the Joomla! content management system. Developed at Purdue, PURR was customized for data stewardship, including workflows for curation, publication, dissemination, and preservation of datasets. The project is staffed by a team from the Libraries and is a collaboration among the Libraries, Information Technology at Purdue, and the Office of the Vice President for Research.
Using TRAC as their guide, the group collaboratively developed mission statements, policies, job descriptions, and business plans. They decided to commit to preservation of all deposits for ten years, after which time content is subject to the Libraries’ selection criteria for further retention. PURR accepts all file formats but recommends sustainable formats. Following the OAIS model, content producers submit content as a Submission Information Package (SIP), and the content information is bundled with BagIt into an Archival Information Package (AIP). Because PURR uncompresses all files for the AIP, the Dissemination Information Package (DIP) is derived from the original SIP.
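For readers unfamiliar with BagIt, the bundling step can be pictured with the short Python sketch below. It is not PURR's implementation: the function names make_bag and verify_bag, the choice of SHA-256, and the sample file are all assumptions made for illustration. It only shows the general idea of a bag: payload files under data/, a bagit.txt declaration, and a manifest of checksums that can be re-verified later.

```python
import hashlib
import tempfile
from pathlib import Path


def make_bag(bag_dir: Path, payload: dict) -> Path:
    """Write payload files under data/ and record their checksums.

    `payload` maps a file name to its bytes (hypothetical input shape).
    """
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)

    manifest_lines = []
    for name, content in sorted(payload.items()):
        (data_dir / name).write_bytes(content)
        digest = hashlib.sha256(content).hexdigest()
        # Manifest line format: checksum, whitespace, path relative to the bag.
        manifest_lines.append(f"{digest}  data/{name}")

    # Bag declaration; the version string is an assumption for this sketch.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )
    return bag_dir


def verify_bag(bag_dir: Path) -> bool:
    """Recompute each payload checksum and compare it with the manifest."""
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        digest, rel_path = line.split(maxsplit=1)
        actual = hashlib.sha256((bag_dir / rel_path).read_bytes()).hexdigest()
        if actual != digest:
            return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        bag = make_bag(Path(tmp) / "example_bag", {"readings.csv": b"t,value\n0,1.5\n"})
        print("bag verifies:", verify_bag(bag))
```

The checksum manifest is what makes the package useful for preservation: a later fixity check can rerun verify_bag and detect any file that has changed since it was bagged.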
PURR metadata weaves together several standards for preservation. METS is used as the wrapper to package metadata; dcterms is used for descriptive metadata; MODS, to designate dataset ownership and access condition for digital provenance; and PREMIS, for preservation metadata (including technical, rights, and digital provenance metadata). PURR currently records validation, ingestion, and capture events in PREMIS. In addition to capturing preservation events, PURR records the significant properties of datasets, determined through consultation with the content producers, which will aid in future file format migration. Subject specialists check submissions and add keywords to datasets. The presenters concluded with a walkthrough of a dataset submission.
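As a rough illustration of METS acting as a wrapper, the Python sketch below builds a small METS document that carries a dcterms title in a descriptive section and one PREMIS event in a digital provenance section. It is not PURR's actual profile: the function build_mets, the section IDs, the event identifier, and the use of the PREMIS version 2 namespace are assumptions for this example, and the output omits the structural map that a schema-valid METS file requires.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlns/premis-v2"  # assumed PREMIS version for this sketch
DCTERMS_NS = "http://purl.org/dc/terms/"

ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)
ET.register_namespace("dcterms", DCTERMS_NS)


def q(ns: str, tag: str) -> str:
    """Return a namespace-qualified tag name in ElementTree's {ns}tag form."""
    return f"{{{ns}}}{tag}"


def build_mets(title: str, event_type: str, event_time: str) -> ET.Element:
    """Wrap one descriptive field and one PREMIS event in a METS element."""
    mets = ET.Element(q(METS_NS, "mets"))

    # Descriptive metadata section holding a dcterms title.
    dmd = ET.SubElement(mets, q(METS_NS, "dmdSec"), ID="dmd1")
    dmd_wrap = ET.SubElement(dmd, q(METS_NS, "mdWrap"), MDTYPE="DC")
    dmd_xml = ET.SubElement(dmd_wrap, q(METS_NS, "xmlData"))
    ET.SubElement(dmd_xml, q(DCTERMS_NS, "title")).text = title

    # Administrative metadata section holding a PREMIS event as provenance.
    amd = ET.SubElement(mets, q(METS_NS, "amdSec"), ID="amd1")
    prov = ET.SubElement(amd, q(METS_NS, "digiprovMD"), ID="digiprov1")
    prov_wrap = ET.SubElement(prov, q(METS_NS, "mdWrap"), MDTYPE="PREMIS:EVENT")
    prov_xml = ET.SubElement(prov_wrap, q(METS_NS, "xmlData"))

    event = ET.SubElement(prov_xml, q(PREMIS_NS, "event"))
    ident = ET.SubElement(event, q(PREMIS_NS, "eventIdentifier"))
    ET.SubElement(ident, q(PREMIS_NS, "eventIdentifierType")).text = "local"
    ET.SubElement(ident, q(PREMIS_NS, "eventIdentifierValue")).text = "event-1"
    ET.SubElement(event, q(PREMIS_NS, "eventType")).text = event_type
    ET.SubElement(event, q(PREMIS_NS, "eventDateTime")).text = event_time
    return mets


if __name__ == "__main__":
    doc = build_mets("Example dataset", "ingestion", "2013-06-29T15:00:00Z")
    print(ET.tostring(doc, encoding="unicode"))
```

Additional events, such as validation or capture, would be further digiprovMD sections built the same way, and a MODS record could sit in its own wrapped section alongside them.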
After the presentations concluded, the speakers took questions from the audience. Attendees asked about Purdue's practice of deriving DIPs from SIPs rather than AIPs, questioned whether Dublin Core was sufficient for describing data, and requested further information about file formats received and migrated. They also asked about the appraisal process that could result in data being deaccessioned after ten years. Both sets of speakers responded to a question about data ownership and to questions of terminology and policy.
Presentation slides are available below.