September 30, 2014

Cooking Up a Crowdsourced Digitization Project that Scales

If the NYPL Labs’ crowdsourced menu transcription project only whetted your appetite, now the University of Iowa Libraries is taking it to the next level with a similar project for transcribing, among other things, recipes.

The libraries are launching DIY History, a new initiative that crowdsources the transcription and tagging of primary sources. The project follows on from the libraries’ first crowdsourcing experiment, the Civil War Diaries and Letters Transcription Project, which debuted in spring 2011 and transcribed over 15,000 pages of diaries and letters.

DIY History offers a broader scope of materials than just the Civil War documents, including the Szathmary Culinary Manuscripts and Cookbooks digital collection, the Iowa Byington Reed Diaries from the Iowa Women’s Archives, and the Nile Kinnick Collection (correspondence and diaries belonging to the Iowan football star).

With masterful understatement, the site says the “time-consuming manual labor” necessary to transcribe handwritten sources “doesn’t scale with traditional library workflows.” Instead, “we’re opening up these collections to anyone who is interested in them,” says Greg Prickman, head of Special Collections. “We are asking people to take an active part in improving the usefulness of the material we offer, and to participate in the process of describing what we hold.”

Users can start transcribing without any training or registration, though they can also create an account to track their contributions, create a watchlist of favorite pages, or participate in discussion forums. To tag and comment on historic photographs or yearbook pages, users must register for Flickr.

On the Civil War project, which also did not require registration, 903 users chose to provide an optional email address, and many more did not, according to Jen Wolfe, Digital Scholarship Librarian (Access & Public Engagement) at the University of Iowa Libraries. Wolfe told LJ, “It is a small percentage doing most of the work, which from what I’ve read is typical of crowdsourcing projects… Our top contributor has done more than 1300 pages, many people only do one, and there’s a lot in between. Sometimes we get a group of 20 new users all at once who each do exactly two pages, and that’s when we figure it’s been assigned for a history class somewhere.” Contributions have come from all over the English-speaking globe.

The project experienced “almost no vandalism – just a few prank entries after we were featured on Reddit.com, but everyone else has been very well-behaved,” said Wolfe. Originally the library had assistants checking the entries but found that approach didn’t scale, so one of the main goals of the expansion was to allow that checking to be crowdsourced as well.

As of now, the project is not connected to any similar efforts elsewhere, but “we’re very interested in contributing to the Civil War Data 150 project…we hope to get working on that soon,” said Wolfe.

This article was featured in Library Journal's Academic Newswire enewsletter.

About Meredith Schwartz

Meredith Schwartz (mschwartz@mediasourceinc.com) is Senior Editor, News and Features of Library Journal.
