
Toward Semantic Metadata Aggregation for DPLA and Beyond

Monday, June 27, 2016
8:30 am to 10:00 am

What happens when metadata that were created for a specific library catalog are aggregated and repurposed for a network-scale discovery environment like DPLA? What kinds of data modeling, mapping, remediation, and reconciliation are needed in advance of such aggregation? What happens when metadata from different domains (e.g., galleries, libraries, archives, museums), created with different standards and schemas, are forced to interoperate semantically? These are some of the questions we will be investigating at the Heads of Cataloging/Metadata Services Interest Group meeting at ALA Annual. Our panelists will be Josh Hadro, Deputy Director of NYPL Labs, and Jason Roy, Director of Digital Library Services at the Minnesota Digital Library. Please mark your calendars and join us on Monday, June 27, 8:30-10 am.



Josh Hadro:
As Deputy Director of NYPL Labs, Josh oversees the Digital Imaging Unit, the Metadata Services Unit, the Permissions Unit, and the new Semantic Applications and Data Research program, while also coordinating NYPL's partnerships with the Digital Public Library of America (DPLA), HathiTrust, Google Books, and others. Josh holds a BA from Columbia University and an MSLIS from Pratt Institute. He also teaches at the Pratt Institute and serves as a board member of the open-access Weave Journal of Library User Experience and the Empire State Digital Network (ESDN).

Jason Roy:

Jason is currently Director of Digital Library Services at the University of Minnesota Libraries, where he supports the creation of and access to research and scholarly material in digital form from across the campus community. In this capacity he oversees the Libraries’ digitization unit as well as several digital library development projects, including the Ojibwe People’s Dictionary, UMedia Archive, and the new Umbra: Search African American History. Most recently, Jason was the project manager for the Minnesota Digital Library – Digital Public Library of America collaboration. Jason has conducted numerous workshops and speaks frequently on the subject of digital libraries and initiatives. Jason holds a B.A. from the University of Oregon and an M.B.A. from the Carlson School of Management, University of Minnesota.



Josh Hadro:
What it means to be a Content Hub
    - ~680K items, covering ~1.2M images
    - NYPL's Metadata API
    - Recent achievements:
        - Public Domain Release (January 2016)
            - Public domain data release on GitHub
        - Implementation of DPLA Rights Statements in the works
Current metadata work and remediation
    - Dealing with uneven metadata, the result of 10+ years of varied practice
    - NYPL Metadata audit currently underway
    - Current metadata practice: importing records from source systems
    - Liaison work in Metadata Services Unit
    - Desired future state: RDF, more connections, and less duplication
    - Metadata skillset trajectories: single records --> Excel --> OpenRefine --> Python, etc.

Slides: http://bit.ly/ALA2016AN-HoC_JHadro


Jason Roy:

  • What it means to be a Service Hub
    • About the Minnesota Digital Library
    • Central vs. decentralized aggregation model
    • ~498,000 records harvested
  • MDL’s Extraction – Transformation – Load Tool
  • Working with our partners
    • Data exchange agreements
    • Data review 
    • Outcomes for smaller organizations working with MDL
  • Upcoming/future additions
  • Metadata Constraints | Two case studies
    • Umbra: Search African American History (umbrasearch.org)
    • Metadata enhancements or ‘value adds’ and the struggle with pushing changes upstream in the data flow

Slides: http://bit.ly/ALA2016AN-HoC-JRoy