February 6, 2016

Navigating User-Generated Resources: A Q&A with Computer Scientist Brent Hecht

User-generated content (UGC), which includes tweets, reviews, Facebook posts, and Wikipedia articles, now plays a key role in the average person’s Internet experience. UGC is also becoming an indispensable resource for helping researchers make sense of big data. In his Wednesday keynote address “The Mining and Application of Diverse Cultural Perspectives in User-Generated Content” at the Electronic Resources and Libraries (ER&L) conference in Austin this week, Brent Hecht, assistant professor of computer science and engineering at the University of Minnesota, will discuss how “UGC reflects the cultural diversity of its contributors to a previously unidentified extent and that this diversity has important implications for Web users and existing UGC-based technologies.”

Prior to the event, LJ spoke with Hecht about the intersection of geography and computer science, the influence of UGC, and why librarians are needed to help patrons navigate popular UGC resources such as Wikipedia.

LJ: You have an M.A. in geography and a Ph.D. in computer science. How do those fields intersect?

Brent Hecht: I believe I was the first computer science/geography double major at my college in 2005, but I doubt that’s the case anymore. It used to be hard to explain, but these days, all you have to say is “Google Maps.”

I’m in a subfield of computer science called Human-Computer Interaction… it includes everything from [improving Google Maps] to understanding how information flows across space via social networks, to developing cool technologies that, for instance, let you take a picture of a publicly displayed local map and then use that for navigation instead of Google Maps—so it’s sort of in the augmented reality space. There are also people working really hard at the very challenging problem of figuring out when someone types “London” into the search bar, do they mean London, England, or London, Ontario, or any of the other Londons that are out there? There’s quite a wide-ranging set of usage questions at the intersection of those two fields.

How would you describe the role that user-generated content now plays in the average person’s Internet experience?

It’s all over the place. A ridiculous percentage of search queries have Wikipedia results in the top three—in Bing and in Google. Wikipedia is the sixth most popular website in the world. There’s also Amazon customer reviews… Twitter for news and social connections, Facebook obviously, YouTube—YouTube gets over an hour of video [uploaded] every second—I could go on forever.

A project in 2010 looked at how local user-generated content is. There was some disagreement in the literature.

What we pointed out in that paper was that it used to be that when people wanted to find out about a city, they would go to that city’s webpage. Now, typically, most folks go to the Wikipedia article about the city. The importance of it really can’t be overstated. A community of people with no credentials (classic user-generated content in context) is defining the way that people understand the spaces around them.

By and large the [facts] are accurate. This is sort of the miracle of Wikipedia—if you get a large enough group of people together, they will be able to find mistakes, for the most part.

Where I think the model has broken down so far is in areas of coverage. The English Wikipedia, for instance, covers very extensively cultural and geographic topics that English speakers are interested in. Same deal with the German Wikipedia, the Spanish Wikipedia, the French Wikipedia, and so on…. There’s this perception that the English Wikipedia is so big, that it’s a superset of all the other language editions. That is actually not the case…. If you read an English language article about a concept that also has articles in other language editions, you are, on average, missing out on about 28 percent of the content you would get if you could read all of the language editions. That’s based on a dataset of [the largest] 25 language editions…. That’s a new reason why we need librarians—to help understand the cultural context of the information we’re reading, and help us gain information from other cultural contexts.

Are there certain topics or areas of coverage that the English language Wikipedia is considered better for than others? One criticism seems to be that it can be heavy on pop culture and light on other topics, for example.

There’s a reasonable hypothesis that the opposite is actually true. We’re actually starting a research project here [at the University of Minnesota] and one graduate student in our lab has begun to look at articles according to a quadrant defined by a popularity axis and a quality axis. So the articles in the upper right quadrant would be both highly popular and high quality. That’s ideal. And then in the lower left-hand quadrant, you get not very popular, low quality. But the other two quadrants are interesting. The high popularity, low quality quadrant, to me, is most interesting. What are people accessing a lot but getting low quality information from?
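As a rough illustration of the quadrant idea Hecht describes, the sketch below sorts articles into four groups using a popularity score and a quality score. It is not the lab's method: the `Article` fields, the 0-to-1 scales, the 0.5 cutoffs, and the sample titles are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    popularity: float  # assumed: normalized to 0..1 (e.g. from page views)
    quality: float     # assumed: normalized to 0..1 (e.g. from a quality rating)


def quadrant(article: Article, pop_cut: float = 0.5, qual_cut: float = 0.5) -> str:
    """Label an article by which side of each (hypothetical) cutoff it falls on."""
    pop = "high" if article.popularity >= pop_cut else "low"
    qual = "high" if article.quality >= qual_cut else "low"
    return f"{pop} popularity, {qual} quality"


def most_interesting(articles: list[Article]) -> list[str]:
    """Titles in the quadrant Hecht singles out: widely read but low quality."""
    return [a.title for a in articles
            if quadrant(a) == "high popularity, low quality"]


# Hypothetical sample data, invented for illustration only.
sample = [
    Article("Article A", popularity=0.9, quality=0.8),
    Article("Article B", popularity=0.9, quality=0.2),
    Article("Article C", popularity=0.1, quality=0.1),
    Article("Article D", popularity=0.2, quality=0.9),
]

for a in sample:
    print(a.title, "->", quadrant(a))
print("Most interesting:", most_interesting(sample))
```

In practice the cutoffs would have to come from the data itself (a median, for instance), since where "high" begins on either axis is a judgment call.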

This article was featured in Library Journal's Academic Newswire enewsletter.

About Matt Enis

Matt Enis (menis@mediasourceinc.com; @matthewenis on Twitter) is Associate Editor, Technology for Library Journal.
