April 23, 2014

Promise and Problems of Big Data | From the Bell Tower

Academic librarians are exploring new roles in big data and how they can emerge as campus leaders in helping faculty to acquire, store, retrieve, analyze, and preserve it. Big data is being used in interesting ways, but let’s avoid becoming victims of “solutionism.”

Owing to their ability to quickly grasp and gain comfort with new technology trends, academic librarians are poised to emerge as leaders in a new world where data is the coin of the realm. Whether it is in performing service to their own faculty, whose research produces mass quantities of data, discovering ways to tap into repositories of data collected (legally) by others, or even generating internal data about library operations, this profession is definitely getting a handle on the business of Big Data. While this affords academic librarianship a great opportunity at a time when Big Data offers real promise for new discoveries, some critics are alarmed that becoming overly dependent on Big Data could lead to critical miscalculations and bad decisions. Academic librarians should consider these warnings as they engage in the world of Big Data.

Banking on big data

Big-time decisions are already being based on the analysis of Big Data, even though there is insufficient evidence pointing to it as a reliable indicator of patterns upon which makers of critical decisions can depend. A good case study was offered in the article “Giving Viewers What They Want,” which provides insight into how Netflix decided to place a big bet on the show House of Cards as its first major move in the production of original content for its subscribers. While the big news is that the show quickly became the most streamed piece of content in the United States, the real story is about how Netflix picked up the series based on the analysis of massive data sets produced by its subscribers. Think Venn diagram, and the intersection of Kevin Spacey films, David Fincher projects, and a successful UK version of the program. For Netflix, the sweet spot spelled big hit. The power of Big Data, reflecting on Netflix’s infatuation with it, has most experts on the fence. One industry expert claimed, “It is a little hysterical to say that Big Data will win the day now and forever, but it is clear that having a very molecular understanding of user data is going to have a big impact on how things happen in television.” Other programming successes owe it all to years of industry experience and gut instinct. Perhaps it was just a lucky guess, a low access price, or the novelty of making all the episodes available right away, but Netflix believes Big Data made the difference.

Obsessed with data

If, as a nation, we are developing our own infatuation with Big Data, perhaps some of the blame belongs to the movie Moneyball, for glorifying the power of collecting and crunching data over the virtues of intuitive experience. In the book and movie, the Oakland Athletics leverage Big Data to assemble a championship-caliber baseball team out of discarded, has-been players. Investing heavily in data may look appealing, particularly when a win or two leaves us feeling satisfied, but the peril is overconfident thinking. In his book To Save Everything, Click Here, Evgeny Morozov provides a cautionary tale of the dangers of relying too heavily on Big Data. He has coined a term to describe this problem. “Solutionism” is the belief that with enough data about many complex aspects of life (including not just politics but also crime, traffic, and health), we can fix problems of inefficiency. There are two big problems associated with Big Data. First, even big sets of data have holes of incompleteness. Observations based on this data go astray because what’s missing is vital to the full picture. Second, even when data sets are complete, those analyzing them can easily reach faulty conclusions. Morozov’s advice is that we be careful to avoid allowing technological hubris to crowd out the value of intuitive thinking. A beat cop’s hunches about ways to spot and prevent neighborhood crime may be more accurate than a decade’s worth of crunched crime data.

Approach with caution

Understandably, academic librarians are excited about a future where Big Data plays a big role in our decision-making. Researchers who create Big Data are in need of experts to advise them on managing and preserving their massively generated data, and academic librarians are well situated to fulfill that role. Heading into the future, we need new opportunities to demonstrate our value to higher education. Our interest in Big Data also crosses over into the management of our own enterprise, where we believe that amassing and analyzing big piles of data can improve our own operations or be applied to demonstrating our own value. For example, there is a growing interest in crunching circulation numbers, database logins, and other transactional data in order to connect it to student success, as evidenced by top grades or positive retention outcomes. As we become increasingly professionally enamored with Big Data, whether it is our own or someone else’s, it may benefit us to bear in mind that it is a double-edged sword that can help or hinder us, depending on how we use it. My belief is that higher education, in the drive to respond to calls for greater accountability, will look to Big Data to make the case that it is worth the cost. There is a place for academic librarians to help their institutions succeed in the effort to capture and capitalize on Big Data. That is all well and good, but let’s remember to bring to it the same sensibility that has made us wise evaluators and collectors of information in the past.

This article was featured in Library Journal's Academic Newswire enewsletter.

About Steven Bell

Steven Bell, Associate University Librarian, Temple University, Philadelphia, PA, is the current vice president/president-elect of ACRL. For more from Steven visit his blogs, Kept-Up Academic Librarian, ACRLog and Designing Better Libraries or visit his website.

Comments

  1. “even big sets of data have holes of incompleteness.”

    Very important to remember. New data sources are cropping up every day and it’s entirely possible that some of them disagree with each other. It’s probably impossible to capture every piece of data out there so look for trends but don’t take anything at face value. Data can be skewed to say almost anything.

  2. Lou Mazzucchelli says:
  3. Munesh Kumar says:

    …the concept of Big Data is a new one, but the data itself has existed for a long time, since before the concepts of documentation and information services, when those concepts were not yet part of library services. At that time data was very unorganized and unstructured. It was very difficult to handle data that was huge, unstructured, and unorganized, and this gave birth to Documentation and Information Services: adequate information matched to someone's interest, shaped by subject knowledge and limited in scope. The rest of the data, which no one attended to, had the same value as the portion in demand. Now, the concept of Big Data is nothing but that portion of data which went unattended for so long because the tools, and sometimes the need, were not available, and which is just as important as the organized data in relational and other databases. Basically, Big Data is like a diamond in a coal mine: it takes expert skill to locate, understand, polish, and put to use. It is a new ray of hope that opens up the span of research and development, and a new assignment for data modelers as well as librarians. (…as I think about Big Data)

    Munesh Kumar
    India
    mun_esh@hotmail.com