Metadata Interest Group
8:30 am to 10:00 am, US/Eastern
We have two exciting programs on large-scale metadata aggregation, covering strategies, workflows, and the challenges involved.
“The Other Side of Linked Data: Managing Metadata Aggregation,” presented by Diane Hillmann.
Most of the current activity in the library linked open data (LOD) world has focused on publishing library data out of its current silos. But part of the point of linked data for libraries is that it opens up data built by others for use within libraries, and has the potential to integrate library data more fully into the larger data world. The sticking point for most librarians is that building and distributing data outside the familiar world of MARC seems like a black box, with the key held by others. Traditionally, libraries have relied on specialized system vendors to build the functionality they needed to manage their data. But the discussions I’ve heard too often end with librarians wanting vendors to tell them what they’re planning, and vendors asking librarians what they need and want. In the context of this stalemate, it behooves both library system vendors and librarians to explore the issues around managing more fine-grained metadata so that an informed dialogue about requirements can begin. As part of this dialogue, there are a number of questions about goals that could be addressed:
* Will expression in MARC (and/or RDA and/or BIBFRAME) be part of the requirements?
* How does non-library data fit in (DBpedia, the New York Times, Amazon, ONIX)?
* How do schema.org and RDFa fit into the picture?
* Will some data be indexed and not displayed, and vice-versa?
* Who will decide what pieces of available data will be valued and what pieces required?
* Will there need to be an aggregation workflow in addition to a cataloging workflow, or are they best integrated?
“Harvesting and Normalization at the Digital Public Library of America: Lessons from a Diverse Aggregation,” presented by Kristy Berry Dixon (Digital Library of Georgia), Sandra McIntyre (Mountain West Digital Library) and Amy Rudersdorf (Digital Public Library of America).
The Digital Public Library of America currently works with more than 21 digital collections hubs to crosswalk, enrich, and normalize their metadata to align with the DPLA Metadata Application Profile (dp.la/info/map). Metadata arrives in a variety of formats and standards and at varying levels of readiness; it is then ingested and made available through the DPLA JSON-LD API (dp.la/info/developers/codex/). In developing the DPLA data model, DPLA staff worked closely with metadata designers from the Europeana Digital Library and from leading U.S. institutions, and have refined the model since launch in April 2013 in response to the experience of working with diverse hubs.
This talk will introduce and outline the challenges of aggregating disparate metadata flavors from the perspective of both DPLA staff and representative hubs. We will review next steps and emerging frontiers as well, including improvements to normalization at the hub level and wider adoption of controlled vocabularies and formats for geospatial metadata and usage rights statements. Finally, we will share plans for implementing Linked Data throughout the aggregated national network and discuss how that will expand opportunities for DPLA and its partners.
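For readers unfamiliar with the terms, the crosswalk and normalization steps mentioned in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not DPLA's actual pipeline: the source field names, the target field names, the `HUB_A_MAP` mapping, and both helper functions are hypothetical and are not taken from the DPLA Metadata Application Profile.

```python
# Hypothetical sketch: field names and helpers are illustrative only,
# not drawn from the DPLA Metadata Application Profile or DPLA code.

def crosswalk(record, field_map):
    """Rename hub-specific fields to shared target field names.

    Fields that have no entry in field_map are dropped.
    """
    out = {}
    for source_field, target_field in field_map.items():
        if source_field in record:
            out[target_field] = record[source_field]
    return out


def normalize(record):
    """Coerce every value to a list of trimmed, non-empty strings."""
    normalized = {}
    for field, value in record.items():
        values = value if isinstance(value, list) else [value]
        cleaned = [str(v).strip() for v in values if str(v).strip()]
        if cleaned:  # omit fields left with no usable values
            normalized[field] = cleaned
    return normalized


# Assumed mapping for one imaginary hub that supplies Dublin Core style keys.
HUB_A_MAP = {
    "dc:title": "title",
    "dc:creator": "creator",
    "dc:rights": "rights",
}

hub_record = {
    "dc:title": "  Map of Savannah ",
    "dc:creator": ["Unknown", " "],
    "dc:rights": "",
    "localNote": "internal use only",
}

result = normalize(crosswalk(hub_record, HUB_A_MAP))
print(result)
```

In this sketch each hub would supply its own mapping, while the normalization step stays the same for every hub, which is one simple way to keep hub-specific differences separate from shared cleanup rules.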