March 22, 2018

Data Scientist Training for Librarians | Not Dead Yet

If I had to name the one aspect of librarianship that has changed the most since I was a newbie librarian, I think it would be data: its use, and librarians’ support of it for their patrons. I’m fascinated by data, and frankly envious of those who are fluent in its use. I also suspect that there are many colleagues out there who would like to know more about how to handle data effectively as librarians. Which is why, when my friend and colleague Chris Erdmann, Head of the John G. Wolbach Library at the Harvard-Smithsonian Center for Astrophysics, told me about a new course at Wolbach called “Data Scientist Training for Librarians” (DST4L), which aims to “upgrade the skill sets of librarians so that they can better serve the data needs of their communities,” I was all ears.

Fortunately for both you and me, they’ve put together a web site for the course, and here’s its description of what is being covered and produced:

“The class will be provided with real world use cases and hands-on training covering data extraction, cleansing, sharing and presentation. Members of the course will concentrate on bibliographic data sources; learn web programming and database development to the extent that is possible. The group will also look into telling a story with data through various visualization approaches and be introduced to a variety of tools currently used by data scientists. At the end of the course, participants will develop solutions based on data problems presented at the start of the course and post their work to the site.”

The site includes a blog, among whose postings is one by Chris Erdmann showing the DST4L badge, about which he notes:

“When I was brainstorming the course with Louise Rubin, we thought it would be a fun idea to offer a badge to all the course participants. Besides working on group projects which all of us will present closer to the end of the course, the badge is something extra special that participants can post in various ways to demonstrate they’ve taken this course. A local artist, Dirk Tiede, created the badge based on our conversations. The librarian in the badge is loosely based on Nancy Pearl, who if you are a librarian, you will know who I am talking about. Every librarian should have a Nancy Pearl action figure on their desk, with automatic shooshing action and a GI Joe Kung Foo Grip for books. In the image, the heroine data scientist librarian is armed with a one as a spear and a zero as a shield, ready to do battle with data!”

I love the badge, and I also love the content of this course and the range of technologies they’re covering, which include: HTML, PHP/MySQL, Python, JavaScript, SQLShare, Google Refine, DataWrangler, DataExplorer, Excel, Fusion Tables, BibSoup, ScraperWiki, Dataverse, FigShare, R, Tableau, Fusion Charts, D3, Protovis, WordPress, Drupal, NoSQL / MongoDB, Solr, and Hadoop.

I asked Chris about how folks could get hold of some of the course materials, and he responded:

“Some material is available via the blog but we will post a more concise version on the site at a later date. We also plan on holding a Google Hangout later on in the course, to share our experience with other librarians that are interested. Finally, we will be posting the student projects to the site to showcase their work. This will all culminate in May. The course schedule is available through the course kit.”

They’re doing interesting things up at Wolbach these days; in addition to the DST4L course, they spearheaded the Liberact workshop held earlier this month, whose aim was to “bring librarians and developers together to discuss and brainstorm interactive, gesture-based systems for library settings.” Check out the featured presentations, as well as a report with more materials from the workshop and more images from the event.

Fascinating stuff? Yes, indeed. And very 21st-century librarianship.


About Cheryl LaGuardia

Cheryl LaGuardia always wanted to be a librarian, and has been one for more years than she's going to admit. She cracked open her first CPU to install a CD-ROM card in the mid-1980s, pioneered e-resource reviewing for Library Journal in the early '90s (picture calico bonnets and prairie schooners on the web...), won the Louis Shores / Oryx Press Award for Professional Reviewing, and has been working for truth, justice, and better electronic library resources ever since. She is a Research Librarian at Harvard University.



  1. You know what would be very 21st-century librarian? Looking to how programmers work on code. They have systems for logging problems, changes, and fixes. When multiple librarians may be touching data, you need some way to organize that too.