November 22, 2017

Reading Chicago Reading Wins Start-Up Grant

A new pilot study by DePaul University scholars, in collaboration with the Chicago Public Library (CPL), has received one of 18 National Endowment for the Humanities (NEH) Digital Humanities Start-Up Grant awards. Reading Chicago Reading: Modeling Texts and Readers in a Public Library System plans to use data from One Book, One Chicago (OBOC), CPL’s robust 15-year-old community reading program (including circulation statistics, social media data, neighborhood demographics, and textual data), to analyze reading patterns for OBOC books and develop a predictive modeling tool to help drive CPL’s collection development and future OBOC choices.

The project’s codirectors are John Shanahan, associate dean and associate professor of English, and director of the graduate certificate program in digital humanities; Megan Waters Bernal, associate university librarian for information technology and discovery services; Robin Burke, professor of computer science in DePaul’s School of Computing; and Antonio Ceraso, associate professor and director of the master’s program in new media studies. They will be joined by additional faculty members and graduate students from disciplines across campus to help gather and model the data and develop the tools.

The $74,271 NEH grant, announced on March 23, will supplement the project’s existing funding, which includes support from Microsoft’s Azure for Research program, which provides free access to Azure cloud computing services for research projects, and a DePaul University Research Council grant that supports an undergraduate research assistant for the project.

Brett Bobley, NEH chief information officer and director of the Office of Digital Humanities, likened the project’s modeling tool to those used by record companies to gauge which audiences may find a particular song appealing. “They’re not suggesting that books should be judged by a computer program,” he explained, but they do think that it will be useful for librarians to get a better sense of how to proceed once the book is chosen: “If we promote it via Twitter that’s really going to help this type of book get more readers, but maybe this other kind of book will be better promoted by having seminars in various libraries in person.”

CITY OF BIG DATA

Since 2001, CPL has chosen one or two books each year around which it organizes citywide public events, book discussions, and other creative programming. OBOC debuted with Harper Lee’s classic To Kill a Mockingbird; selections have included Saul Bellow’s The Adventures of Augie March, Toni Morrison’s A Mercy, Jhumpa Lahiri’s Interpreter of Maladies, and Tom Wolfe’s The Right Stuff, as well as nonfiction works such as Carl Smith’s The Plan of Chicago and Isabel Wilkerson’s The Warmth of Other Suns.

DePaul University has a longstanding relationship with the library, having partnered with CPL on the OBOC program since its inception through on-campus educational workshops, seminars, and English Department courses around the featured book. Jennifer Lizak, coordinator of special projects in CPL’s Department of Cultural and Civic Engagement, told LJ, “When John and his team…came to us with this idea of using One Book as a case study for this theory that they had about literacy, we were definitely more than happy to say yes and work with them because they have been such great partners.”

The genesis of Reading Chicago Reading, explained Shanahan, was the growing practice of digital humanities studies coupled with the current trend of using big data for cultural analysis. In particular, Mayor Rahm Emanuel’s 2014 campaign to make citywide datasets openly available to help Chicago address various urban issues saw “people really getting excited about having a new kind of data-driven policy space, and new kinds of analysis made possible.” OBOC, Shanahan told LJ, comprises a dataset all its own, “a kind of citywide way of getting the pulse of reading.”

CPL has years’ worth of circulation data, but it has not been analyzed. “Somebody like Rahm Emanuel comes along and says ‘we’re going to be data driven,’ and libraries are at the center of that,” Shanahan said. “They have all this data…. This One Book thing is our way of being a kind of probe into that. We have this repeating event. It’s at city scale. It’s like—why not try to recognize some research patterns out of it?”

While the project is specific to Chicago, Shanahan said, the resulting tools could also be used by library systems in cities of similar size.

ONE CHICAGO, MANY BOOKS

Reading Chicago Reading launched in 2014. In spring 2015, it received circulation data from CPL for OBOC’s fall 2011 selection, The Adventures of Augie March. More recently, similar data was obtained for the spring 2012 book, Yiyun Li’s Gold Boy, Emerald Girl.

The project will continue gathering checkout data for each active OBOC season as well as the months before and after. While this doesn’t reveal the total number of the program’s participants—some readers purchase the book, rather than borrow it—the numbers reveal print and ebook circulation patterns, including holds. The DePaul team had hoped to get information dating back to the program’s inception in 2001, but records are only available as far back as 2008; currently they’re modeling data from 2011 forward, and hope to work back as the grant period progresses. In addition, CPL provided breakdowns of OBOC program attendance numbers by branch.

Circulation data for each branch will be coordinated with neighborhood census data. “Since…books have specific demographic profiles,” said Shanahan, “…if we could get these mapped out month by month for the different [neighborhoods] we could actually watch these books grow over the city, over time, and hopefully see some patterns.”
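As a rough illustration of what coordinating branch circulation with neighborhood data month by month could look like, here is a minimal Python sketch. It is not the project's code: the branch names, checkout counts, and population figures are invented, and the per-10,000-residents rate is just one plausible way to make branches of different sizes comparable.

```python
from collections import defaultdict

# Hypothetical checkout records: (branch, "YYYY-MM", checkouts).
# All names and numbers are invented for illustration.
checkouts = [
    ("Branch A", "2011-09", 40),
    ("Branch A", "2011-10", 65),
    ("Branch B", "2011-09", 10),
    ("Branch B", "2011-10", 30),
]

# Hypothetical census population of the neighborhood each branch serves.
population = {"Branch A": 50000, "Branch B": 20000}


def monthly_rate_per_10k(checkouts, population):
    """Return {month: {branch: checkouts per 10,000 residents}}."""
    # Sum checkouts per month and branch in case a branch has several rows.
    totals = defaultdict(lambda: defaultdict(int))
    for branch, month, count in checkouts:
        totals[month][branch] += count

    rates = {}
    for month, by_branch in sorted(totals.items()):
        rates[month] = {
            branch: round(count * 10000 / population[branch], 1)
            for branch, count in by_branch.items()
            if branch in population  # skip branches with no census match
        }
    return rates


rates = monthly_rate_per_10k(checkouts, population)
for month, by_branch in rates.items():
    print(month, by_branch)
```

Reading the output month by month shows how a title's circulation rises or falls in each neighborhood relative to its size, which is the kind of pattern Shanahan describes wanting to watch.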

Social media data will also become part of the dataset. The team is using Twitter’s application programming interface (API) to follow not only the #OBOC hashtag, but also the book title, author’s last name, and other keywords. At last count, Shanahan told LJ, they had between 8,000 and 9,000 tweets for Thomas Dyja’s The Third Coast, the most recent OBOC selection.

The good thing about social media data, said Shanahan, is that it’s “sticky”—because it comes attached to real names, users’ other likes and recommendations can be pulled out as well. Codirector Burke is an expert in recommender systems, such as the algorithms Amazon or Netflix uses to suggest books to users, and one of the components of Reading Chicago Reading that CPL librarians are enthusiastic about, Shanahan explained, is the opportunity to substantiate word-of-mouth recommendations around OBOC. “We’re hoping that the tweets are full of things like that, where someone says, ‘Hey, that’s great and reminded me of this book,’ and it will become a quantifiably realistic model…. People recommending things to one another—this time, though, for a civic initiative tied to the city, instead of trying to sell products.”

Because the books being examined are in copyright, in order to help with the text analysis Reading Chicago Reading will also make use of the HathiTrust Data Capsule. “One Book by itself, you have some numbers. It doesn’t really tell you a lot,” explained Shanahan. “But how does that book stack up against, say, all the other books by that author? Or, American nonfiction set in a certain area?” The Data Capsule gives researchers a secure, virtual computer—the “capsule”—for access to the HathiTrust Digital Library for a limited time, and will enable the project team to compare OBOC texts to other, bigger textual datasets.

CRUNCHING THE NUMBERS

The next step, analyzing the data, will happen in summer 2016 after classes have ended, said Shanahan. The team has engaged a computer science graduate student who will begin topic modeling the content and separating good data from bad. “Unfortunately there’s a Third Coast beer,” noted Shanahan, “so we have to take all those results out.” The computer science and social science group members will enter demographic parameters based on census information and voting patterns.
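The kind of cleanup Shanahan describes, keeping tweets that match the tracked keywords while discarding the beer-related ones, can be sketched in Python as a simple include/exclude filter. This is an assumption about how such a step might work, not the team's actual method; the keyword lists and sample tweets are hypothetical.

```python
import re

# Hypothetical keyword lists; a real project would tune these per book.
INCLUDE_TERMS = ["#oboc", "third coast", "dyja"]
EXCLUDE_TERMS = ["beer", "brewery", "ipa"]


def build_pattern(terms):
    """Compile a case-insensitive pattern matching any term as a whole token."""
    # Lookarounds are used instead of \b so that terms starting with "#" match.
    alternatives = (r"(?<!\w)" + re.escape(term) + r"(?!\w)" for term in terms)
    return re.compile("|".join(alternatives), re.IGNORECASE)


INCLUDE_RE = build_pattern(INCLUDE_TERMS)
EXCLUDE_RE = build_pattern(EXCLUDE_TERMS)


def is_relevant(text):
    """Keep a tweet if it mentions a tracked term and no excluded term."""
    return bool(INCLUDE_RE.search(text)) and not EXCLUDE_RE.search(text)


# Invented sample tweets.
tweets = [
    "Halfway through The Third Coast by Thomas Dyja #OBOC",
    "Third Coast beer on tap tonight",
    "Great weather in Chicago today",
]

kept = [tweet for tweet in tweets if is_relevant(tweet)]
print(kept)
```

A filter this blunt would also drop a tweet that mentions both the book and a beer, so in practice it would likely be a first pass before manual review or a trained classifier.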

The ultimate goal of the project will be a predictive modeling tool that CPL can use to help choose future OBOC selections, as well as to plan how many copies to buy and how to allocate them by branch. “Maybe people in computer science are used to doing this,” Shanahan told LJ, “but in an English department, as a digital humanities thing, the idea of being able to model something ahead of time, predict it, and then watch it as a kind of real time experiment, to me is really exciting.”
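To make the allocation idea concrete, the sketch below shows one way predicted demand per branch could be turned into a purchase plan. The article does not describe how the project's tool would do this; the largest-remainder method used here, the function name allocate_copies, and all of the demand figures are assumptions chosen for illustration.

```python
def allocate_copies(total_copies, predicted_demand):
    """Split total_copies across branches in proportion to predicted demand.

    Uses the largest-remainder method so the whole-number allocations
    always add up to exactly total_copies.
    """
    total_demand = sum(predicted_demand.values())
    if total_demand <= 0:
        raise ValueError("predicted demand must sum to a positive number")

    # Exact (fractional) share of copies for each branch.
    quotas = {
        branch: total_copies * demand / total_demand
        for branch, demand in predicted_demand.items()
    }
    # Start with each share rounded down.
    allocation = {branch: int(quota) for branch, quota in quotas.items()}

    # Hand out the remaining copies to the largest fractional remainders.
    leftover = total_copies - sum(allocation.values())
    by_remainder = sorted(
        quotas, key=lambda branch: quotas[branch] - allocation[branch], reverse=True
    )
    for branch in by_remainder[:leftover]:
        allocation[branch] += 1
    return allocation


# Hypothetical model output: predicted checkouts per branch for a new title.
demand = {"Branch A": 120, "Branch B": 45, "Branch C": 35}
plan = allocate_copies(50, demand)
print(plan)
```

The hard part, producing the demand predictions themselves, is what the project's modeling work is meant to supply; this step only shows how such predictions could feed a purchasing decision.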

The choice process is complex, explained Lizak, and could benefit from the augmentation of some hard data. “We start looking at it sometimes two years in advance, and we look at a variety of things,” she told LJ. “We look at what themes we might draw from [suggested] titles, and we also look at things a little more practically, such as…. Is the book widely available? Has it gotten good reviews? Is it a high-quality piece of writing? Is it available in various formats? We look at the practical part of it…and then we think about how we might get great programming: is the theme we can draw from this book something that we will be able to create interesting and engaging programming that will draw people to our branches for five or six months of the year?” The list of suggestions is then narrowed down to a short list based on input from staff, discussed with CPL commissioner Brian Bannon (a 2009 LJ Mover & Shaker), and ultimately presented to Mayor Emanuel for final signoff. “Having the data, we’ll be able to make some better predictions on how the season might go ahead of time, which I think will help us to be agile in programming, marketing, ways of reaching more people,” Lizak noted.

While the identity of the fall 2016 choice is a secret to the public, the Reading Chicago Reading team knows what it will be and is getting ready to run the model on it against the OBOC historical data this summer.

A WIDER AUDIENCE

While NEH does not customarily fund this type of data-driven social science analysis, said Bobley, Reading Chicago Reading aligns well with NEH’s agency-wide initiative The Common Good: The Humanities in the Public Square, introduced in January 2015 by NEH chairman William D. Adams (for more Common Good–connected projects, see LJ’s coverage of Humanities Open Book and the Public Scholar Program). “The humanities is not just about scholars talking to other scholars,” said Bobley, “but rather about a much wider audience engaging the public in history, philosophy, literature, and all the other pieces of the humanities.”

Bobley added, “I think this is a step toward getting a better handle on how we reach a public audience. Libraries are some of the greatest humanities institutions in our country. We have an amazing library system with reach into communities in every state. Let’s invest a little bit more in figuring out more about the relationship between libraries and their patrons and the books that they’re interested in. From our perspective, that’s what interested us.”

Reading Chicago Reading will hold a small workshop in late fall/winter in conjunction with the annual Chicago Colloquium on Digital Humanities and Computer Science, focusing on the application of end-user data in library applications. According to the project’s grant application, “This will give us a chance to discuss our results and methodology, to identify scholars with similar interests, and to build community around this research area.” Shanahan also hopes to incorporate the process into the classroom, using it as “a teaching tool for students to think through how to quantify literature, to map it over a city.”

About Lisa Peet

Lisa Peet is Associate Editor, News for Library Journal.
