November 17, 2017

With New Funding, DPLA Sets Sights on Search

The National Endowment for the Humanities (NEH) awarded $1 million to fund the creation of the infrastructure for the Digital Public Library of America (DPLA) last week, and the organization will now turn its focus toward developing a way to search across the many disparate collections involved with the project.

“The first phase of the project was about planning. We brought a lot of people together in meetings and created workstreams that tackled individual areas,” explained Maura Marx, executive director of the Open Knowledge Commons, an affiliate of the Berkman Center for Internet and Society at Harvard Law School that is coordinating the DPLA. “So, we had people looking at user [experience], people looking at technology, people looking at content, etc.”

Now, DPLA is moving from the planning stage toward more concrete initiatives, Marx said. The requirements for an interface request for proposal (RFP) are being defined, and a preliminary portal is slated to debut in April 2013.

Meanwhile, the DPLA plans to partner with five to seven existing state digital library projects and establish a pilot group of service hubs that will “create a menu of services that would be a core DPLA menu of services. Things like digitization, metadata help, storage, data aggregation, and so on,” she said.

Ideally, as the project grows, every state would eventually have one of these service hubs to coordinate the creation and dissemination of content in its geographic area. The pilot program will also designate large, existing digital collections, such as the Internet Archive or the HathiTrust Digital Library, as “content hubs” and make their data available through DPLA.

“We’re looking at how to create a [digital content] on-ramp for every institution in the country,” Marx said. “The first thing that DPLA has decided to do is to harvest metadata…part of what this grant will do is help us work with a handful of those hubs to define the agreements by which we would harvest data.”

Wrangling metadata

As these service and content hubs share data, another goal will be the development of a technological platform that will integrate collections from disparate sources. This is no simple task.

“The big challenge in these kinds of initiatives has always been trying to figure out some way of creating a common metadata description scheme, or of mapping a metadata description scheme in a way that it’s possible to have consistent user experiences,” Peter Brantley, the director of the Internet Archive’s Bookserver Project and DPLA committee member, told LJ.

As catalogers and metadata librarians are aware, legacy data can be spotty, and every large collection will have inconsistencies.

“Even when a community settles on a small number of standards, there’s often inconsistency in the application of that standard,” Brantley explained. “Enforcing consistency, even where there are guidelines, is an extremely difficult thing to do.”

Compounding the challenge, metadata from different types of collections can vary significantly, and the collections that DPLA will be working with include digitized books, artwork, music, unique archival objects, and special collections.
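To make the mapping problem concrete, here is a minimal sketch of a metadata "crosswalk" in Python. It is only an illustration: the source names, field names, and the `to_common` function are hypothetical and are not drawn from DPLA's actual schema or tooling. Each source's local field names are translated into one shared set, and anything without a mapping is set aside rather than discarded.

```python
# Hypothetical crosswalk tables: these source and field names are
# illustrative assumptions, not DPLA's real metadata schema.
CROSSWALKS = {
    "library_catalog": {"title": "title", "author": "creator", "pub_date": "date"},
    "museum": {"object_name": "title", "artist": "creator", "year_made": "date"},
}


def to_common(record, source):
    """Map one source record onto the shared field names.

    Fields with no mapping are kept under "unmapped" so nothing is lost.
    """
    mapping = CROSSWALKS[source]
    common = {"source": source}
    unmapped = {}
    for field, value in record.items():
        target = mapping.get(field)
        if target is None:
            unmapped[field] = value
        else:
            # Trim stray whitespace, a common flaw in legacy records.
            common[target] = value.strip() if isinstance(value, str) else value
    if unmapped:
        common["unmapped"] = unmapped
    return common


book = {"title": " Moby-Dick ", "author": "Herman Melville", "pub_date": "1851"}
painting = {
    "object_name": "Nighthawks",
    "artist": "Edward Hopper",
    "year_made": 1942,
    "medium": "oil on canvas",
}

book_common = to_common(book, "library_catalog")
painting_common = to_common(painting, "museum")
```

Even in this toy case, the inconsistency Brantley describes shows up: one source stores the date as text and the other as a number, and the museum record carries a "medium" field that has no counterpart in the book record.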

The goal of the project is to demonstrate how local and national digital collections can be linked, and how the DPLA will ultimately be interoperable with other major digitization projects around the world, such as the Europeana collection. (Europeana is a similar digital library effort in Europe that currently offers access to more than 23 million digitized books, paintings, films, recordings, photographs, and archival records from 2,200 partner organizations.)

The DPLA started out at the Berkman Center in December 2010, with funding provided by the Alfred P. Sloan Foundation. At the time, many viewed the United States as lagging behind national digitization efforts in other countries. The goal of the DPLA was to create an open governing structure that would ultimately link together existing digitization projects within the U.S.

“The idea is to create a big tent where lots of people can work hard toward a public-spirited solution,” John Palfrey, the Faculty Co-Director at the Berkman Center and Vice Dean of Library and Information Resources at Harvard Law School, told LJ at the time. “It’s not a competitive effort. It’s meant to be complementary to its core.”

As DPLA moves forward with the pilot, it is also reshaping the organization. DPLA has assembled a nominating committee consisting of Marx; David Ferriero, Archivist of the United States; Carla Hayden, Chief Executive Officer of the Enoch Pratt Free Library; MacKenzie Smith, Dean of Libraries for the University of California, Davis; and Deanna Marcum, the former Associate Librarian of the Library of Congress. This nominating committee will submit a list of candidates for a new board, which will ultimately consist of five to seven people. That board will then choose a new executive director, hopefully by the end of the year.

“We will transition from the DPLA project at Harvard to a new, standalone [501(c)(3)] organization,” Marx said. “It’s not a ‘Harvard project.’ There have been a million people involved in this, and I think people will really understand that when the new organization is up and running.”

About Matt Enis

Matt Enis (menis@mediasourceinc.com; @matthewenis on Twitter) is Senior Editor, Technology for Library Journal.
