April 21, 2018

Linked Data in the Creases | Peer to Peer Review

Blinkered by BIBFRAME, have we missed the real story?

I keep you in the creases/ I hide you in the folds/ Protect you from the sunlight/ Shield you from the cold./ Everybody said they were glad to see you go/ But no one ever has to know.

—Amber Rubarth, “In the Creases”

American catalogers and systems librarians can be forgiven for thinking that all the linked-data action lies with the BIBFRAME development effort. BIBFRAME certainly represents the lion’s share of what I’ve bookmarked for next semester’s XML and linked-data course. All along, I’ve wondered where the digital librarians, metadata librarians, records managers, and archivists—information professionals who describe information resources but are at best peripheral to the MARC establishment—were hiding in the linked-data ferment, as BIBFRAME certainly isn’t paying them much attention. After attending Semantic Web in Libraries 2013 (acronym SWIB because the conference takes place in Germany, where the word for libraries is “bibliotheken”), I know where they are and what they’re making: linked data that lives in the creases, building bridges across boundaries and canals through liminal spaces.

Because linked data is designed to bridge diverse communities, vocabularies, and standards, it doesn’t show to best advantage in siloed, heavily standardized arenas such as the current MARC environment. If BIBFRAME sometimes feels uncompelling on its own, this is likely why! Linked data shines most where diverse sources and types of data are forced to rub elbows, an increasing pain point for many libraries trying to make one-stop-shopping discovery layers and portals. I first noticed an implementation that spoke to that truth in 2012, when the Missouri History Museum demonstrated its use of linked data as a translation layer among disparate digital collections with differing metadata schemes. SWIB13 offered plentiful examples of similar projects, including an important one from the U.S. side of the pond. In building the AgriVIVO disciplinary search portal, Cornell University walked away from traditional crosswalks, instead finding the pieces of information it needed from whatever metadata its partners could give it and expressing those in linked data. This just-in-time aggregation approach lets AgriVIVO welcome and enhance any available metadata, while avoiding tiresome and often fruitless arguments about standards and metadata quality.
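For readers who think better in code, here is a minimal sketch of that accept-anything aggregation pattern. It is not AgriVIVO's actual implementation: the partner records, the field names, and the example.org namespace are all invented for illustration, and the only real vocabulary used is the Dublin Core element set. The idea is simply to pull out the fields the portal recognizes, ignore the rest, and express the result as N-Triples.

```python
# Hypothetical partner records: every partner uses its own field names.
PARTNER_RECORDS = [
    {"id": "a1", "title": "Soil Health in Smallholder Farms", "creator": "Rivera, Ana"},
    {"id": "b7", "dc:title": "Maize Yield Survey", "author": ["Okoro, Chidi", "Lee, Min"]},
    {"id": "c3", "name": "Untitled dataset"},  # no creator at all: still welcome
]

DC = "http://purl.org/dc/elements/1.1/"    # Dublin Core element set
BASE = "http://example.org/resource/"      # placeholder namespace (assumption)

# Field names the aggregator is willing to recognize, mapped to a DC property.
FIELD_TO_PROPERTY = {
    "title": DC + "title",
    "dc:title": DC + "title",
    "name": DC + "title",
    "creator": DC + "creator",
    "author": DC + "creator",
}


def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def record_to_ntriples(record):
    """Turn one partner record into a list of N-Triples lines."""
    subject = "<" + BASE + record["id"] + ">"
    lines = []
    for field, value in record.items():
        prop = FIELD_TO_PROPERTY.get(field)
        if prop is None:
            continue  # unknown field: skip it rather than reject the record
        values = value if isinstance(value, list) else [value]
        for v in values:
            lines.append(f'{subject} <{prop}> "{escape_literal(str(v))}" .')
    return lines


if __name__ == "__main__":
    for rec in PARTNER_RECORDS:
        for line in record_to_ntriples(rec):
            print(line)
```

Note that nothing here rejects a record for being incomplete or nonstandard. Adding a new partner means adding a few entries to the field map, not negotiating a crosswalk.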

What interests me most about this design pattern is how it neatly bypasses problems that led earlier aggregation projects to fail. The ambitious National Science Digital Library project of the mid-2000s foundered, owing to the average science project’s inability to get to grips with XML, never mind setting up as an OAI-PMH provider. (Chapter 10 of Carl Lagoze’s dissertation offers the gory details, for those interested.) AgriVIVO, instead, takes Postel’s law to heart: it accepts whatever it is given and gives the cleanest linked data it can back to the web. As this design pattern catches on, we could see less friction and standards-squabbling among information communities, which will be free to describe their materials as they see fit, while still contributing to the growing interconnection of the cultural heritage web. Librarians, archivists, and museum and gallery curators meshing together on the web while doing their own thing—what an opportunity!

It should surprise no one that the premier conference for semantic web technologies in libraries is held in Europe; European libraries have led actual linked-data implementation all along. If I had to guess why, I would point to their small size, small numbers, and resulting agility, as well as their clear and unchallenged technology leadership within their countries’ libraries. European national libraries, from what I can see, tend not to bog down as much as American library communities do in grindingly political, perfectionistic, top-down standards processes. Instead, they eye possibilities critically and solve problems however they think best, unconstrained by one-true-standard thinking.

This lent a delightfully grounded ambition to several of the development projects I saw at SWIB13. I was taken rather aback at first by the notion of an entire e-resource management system predicated on linked data—it struck me as frighteningly complex and fraught—but on second thought, if developer Leander Seige is solving a real data-integration problem for his library with the tools he has to hand, why not? Similarly, the ontology- and vocabulary-mapping projects at the Plattner Institute, Stuttgart Media University, and Mannheim University Library are not random pie-in-the-sky experiments but active real-world problem-solving, where linked data is the best-fit solution rather than just a trendy buzzword.

The presentation that most refined my thinking about linked data was Martin Malmsten’s “Decentralisation, Distribution, Disintegration—Towards Linked Data as a First Class Citizen in Libraryland” (I would link to the video if I could, as the slides capture very little of Malmsten’s compelling arguments). Malmsten sold me at once when he related how the National Library of Sweden, sick of MARC behaving as a stumbling-block in many of their projects, declared “Linked Data or die!” and audaciously set about making it happen. Along the way, the Swedish developers discovered that serialization formats like MARC and XML, as well as standards like METS, constrain innovative thinking too much and invariably involve shoehorning data into forms and formats that don’t quite fit it.

What linked data let Malmsten and his compatriots do was express their data in the manner best befitting it, while “keep[ing] formats and monsters on the outside” by automating the reexpression of the data in older, staider standards as necessary—and only as necessary. If broadly adopted, the National Library of Sweden’s approach frees us from the eternal lipstick-on-pig question of how best to present eccentric, often inadequate, almost always expensively homegrown data to patrons. Instead, we will put the patron experience first, asking, “What data do patrons actually want to see or use, and before we go creating it, does it perhaps exist already in the vast existing web of data?”
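A rough sketch of what keeping formats on the outside can look like in practice follows. This is not the National Library of Sweden's code: the triples, the example.org identifier, and the property-to-field mapping are assumptions made up for illustration, and the output is a simplified MARC-like display, not a valid MARC record. The data lives as plain subject-property-value statements, and the older format is generated only when something asks for it.

```python
DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set

# The data as the library actually keeps it: simple statements about a work.
# The subject URI is a placeholder, not a real identifier.
TRIPLES = [
    ("http://example.org/work/1", DC + "title", "Pippi Långstrump"),
    ("http://example.org/work/1", DC + "creator", "Lindgren, Astrid"),
    ("http://example.org/work/1", DC + "language", "sv"),
]

# Illustrative mapping from properties to MARC tag and subfield code.
# Only the properties a legacy consumer needs are mapped.
PROPERTY_TO_MARC = {
    DC + "creator": ("100", "a"),  # main entry, personal name
    DC + "title": ("245", "a"),    # title statement
}


def to_marc_like(subject, triples):
    """Render one subject's statements as MARC-like display lines on demand."""
    fields = []
    for s, p, o in triples:
        if s != subject:
            continue
        mapping = PROPERTY_TO_MARC.get(p)
        if mapping is None:
            continue  # no legacy equivalent: the statement stays inside only
        tag, code = mapping
        fields.append(f"{tag} ${code} {o}")
    return sorted(fields)  # order by tag, as a MARC display would


if __name__ == "__main__":
    for line in to_marc_like("http://example.org/work/1", TRIPLES):
        print(line)
```

The design choice worth noticing is the direction of the dependency: the stored data knows nothing about MARC, so statements with no MARC equivalent are never shoehorned into one.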

Malmsten also made clear that “Linked Data + UX = actually useful data.” Linked data on the inside is a hard sell without obvious user experience benefits on the outside for both patrons and librarians, a point with which my rather eccentric keynote entirely agreed. For that reason, France’s OpenCat effort was my favorite linked-data project from SWIB13. Since the National Library of France has already done considerable linked-data authority control on names, subjects, and titles, it is now leveraging it to build lightweight, easy-maintenance, enriched OPACs for some of the smallest libraries in the country, libraries too small for MARC to be an easy or comfortable fit.

After SWIB13, I firmly believe that it isn’t the big standards-development efforts that will shape linked-data adoption in libraries. Linked data will grow in the creases, the folds, the cracks of our notoriously rickety metadata edifices. It will often grow in the dark unnoticed, shielded by its champions, as with a project I heard about informally (and won’t name) that nearly died by cold administrative fiat before its developer made it too amazing to kill off. As it quietly solves stubborn problems, empowers our smallest libraries, and connects institutions big and small with the larger web, linked data will remake more and more library data in its image—and if good interface design practices come along for the ride, no one ever has to know!

About Dorothea Salo

Dorothea Salo is a Faculty Associate in the School of Library and Information Studies at the University of Wisconsin-Madison, where she teaches digital curation, database design, XML and linked data, and organization of information.



  1. Barbara Fister says:

    That is a WONDERFUL keynote presentation that you gave! I also love the subversiveness of this message – and, for someone at a very small library with a very small staff, it’s encouraging. Most of what we do is in the creases. We’re like the Borrowers, behind the baseboards and in the walls. But you can make things work without a big to-do.

  2. This is wonderful, and I cannot tell you how long I have been struggling with Linked Data (despite quite a high knowledge of metadata, RDF, etc.), due to all the conflicting and flat-out wrong assertions people have made to me. I will take your presentation’s call to make Linked Data for a small place go WOW to heart.

  3. DS – your description of LibLinkedData, and especially of the Swedish National Library experience, re-jiggles a niggling thought I’ve had of late, which is that linked data does not make a real difference if we insist on using standard library cataloging. I can’t imagine that much will change if, on the other hand, everything has to stay the same. Are we brave enough to confront that?

  4. Karen Coyle says:

    Dorothea Salo was unable to post here (!?), so sent me this email, which I copy here with her permission:

    Karen – I think we’ll be forced to confront it. Linked data should cut
    down on redundant record rekeying, “local practice” record editing
    based on OPAC user-interface constraints (or sheer stubbornness), too
    much dinking around with edge cases, and so on. Mincing no words,
    these practices consume much too much time, money, and energy relative
    to their value. Libraries can’t afford them.

    Will there be a lot of denial and pushback? Sure. We’ve both seen
    plenty of denial on the BIBFRAME list, and it’s hardly limited to
    there. Will librarians feel bad that their work over the years has
    become obsolete? Sure; we’re seeing that already too, and while I’m
    sympathetic to that pain, I’m also an ex-IR-manager — I’m well-used
    to sucking it up, admitting that current processes are useless or
    obsolete, and moving the heck on.

    The key question in my head is when it becomes painfully, crystally
    clear that linked data saves money and time and produces a superior
    patron (and ideally, librarian) experience over MARC. France’s OpenCat
    is a step on the way there, and I look forward to seeing what Sweden
    does with Malmsten leading, and what happens to schema-bibex. But
    we’re not there yet, partly because (as is typical for us) we started
    from the Big Standards end, not the make-it-work-better end.

  5. I must take issue with a couple of points here.

    1) “Linked data should cut down on redundant record rekeying.”
    I have seen this sentiment expressed a number of times in various places, but I don’t see how linked data will change any amount of rekeying for a working cataloger. Ever since OPACs appeared, catalogers who find “copy” haven’t had to rekey anything; they derive their current record from the one that already exists and change whatever they need. I don’t see how linked data would save the cataloger any work at all, at least not in a practical way. The copy record already includes all relevant “work” and “expression” information now. (That is, so long as the record is well done.)

    In addition, in a linked data universe, catalogers will still be working with what they can understand, i.e. textual strings, and not the URIs. So, for instance, a human cataloger will still add:
    “Tchaikovsky, Peter Ilich, 1840-1893”
    “Чайковский, Петр Ильич, 1840-1893”
    “Čajkovskij, Petr Il’jič, 1840-1893”
    or whatever the meaningful form is instead of:

    which means nothing to any human being. The URI will be added automatically as an addition or in place of the textual form. Adding the linked data may help *users* but not the catalogers.

    2) “Will librarians feel bad that their work over the years has become obsolete?”

    Is it true that our work really is obsolete? I have thought about this a lot and concluded: both yes and no. It depends on whether we can figure out new ways to use the information we have been making. It is probably best shown in the movie “Other People’s Money” http://en.wikipedia.org/wiki/Other_People%27s_Money that depicts a hostile takeover of an older corporation that makes wire. The corporate raider (Danny DeVito) is buying up corporations that are considered “past it” and is selling off whatever he can. Now he has this wire company in his sights.

    DeVito’s speech to the shareholders is really memorable: he says that the company is already dead, compares it to horses and buggies, and delivers the line “You know, at one time there must’ve been dozens of companies making buggy whips. And I’ll bet the last company around was the one that made the best goddamn buggy whip you ever saw. Now how would you have liked to have been a stockholder in that company?” http://www.youtube.com/watch?v=62kxPyNZF3Q

    What was the answer in the movie? In essence, to find new uses for the “buggy whips/wire”. It turned out that other companies needed their wire to make airbags, so this terribly depressing movie actually ends on a hopeful note. It is a movie that I think librarians should watch.

    Taking this as an analogy for cataloging (and libraries as a whole), I think that our tools and methods and values could be translated and appreciated if they were used in different ways, but everything must be rethought. If we insist that people must use everything in the old ways, then we really are obsolete and dead, but I remain convinced there are other ways. I have suggested many ways in my own blog posts and papers.

    But finding these new and different ways would mean real change, and I don’t know if librarians can find it in themselves to do that.