April 29, 2017

Steering an Elephant | Peer to Peer Review

[Note: This column was posted just hours before Judge Baer ruled in HathiTrust’s favor, largely on Fair Use grounds; see LJ’s ongoing coverage for a summary, the full-text of the decision, and supporting links.—Ed.]

There are two things I know about elephants. First, they have long memories. Second, they are large, ponderous beasts, and getting an elephant to move where you want it to go takes care, patience, and agility.

The first of these two “facts” is only a myth that stems from the long lifespan of the elephant. But it is that legendary memory that caused the HathiTrust to name itself with the Hindi word for an elephant (and a character from Kipling’s Jungle Book). As for the second characteristic of an elephant, it came to mind as I listened last month to reports about Hathi and marveled at the careful and meticulous work that is being done there to make public domain works accessible to the public. The elephant that is the HathiTrust is indeed being directed with patience and agility.

Creating a System for Copyright Review

I was in Ann Arbor, MI, as a member of an advisory board formed as part of the IMLS-funded effort at Hathi to create a Copyright Review Management System (CRMS). This system has three aspects, each of which raises distinctive challenges and requires well-crafted approaches.

First, CRMS is trying to determine which works published between 1923 and 1963 are in the public domain in the U.S. because the then-required formalities of notice and renewal were not followed. About 300,000 books have been reviewed so far in this prong of the CRMS, and just over 50 percent have been found to be in the public domain. The review process is careful and includes multiple “fail-safe” checks. The Advisory Board was impressed, I think, and was able to help refine the system a little around the edges.

Even in the U.S., these determinations are not easy. One example that was described to us illustrates why multiple checks are needed. In the first level of review, reviewers (there are at least two at this level) concluded that Hemingway’s The Old Man and the Sea was in the public domain due to lack of a renewal of the copyright. Even though this conclusion proved incorrect, the process they followed was not flawed; at the second level of review it was discovered that the novella was originally published as a serial, and the renewal of the copyright was to be found in a different category of the Copyright Office’s confusing, Byzantine, and sometimes inaccessible records. Yet for every story like this of a mistake that is caught and corrected, the real story is the large number of books that no longer have copyright protection, and that Hathi has been able to make available to scholars and other readers.

The complexities of determining the public domain in the U.S. are minor, however, when compared with those involved in decisions about works of foreign origin. Since those determinations depend on the author’s date of death, involve inconsistent periods of protection, and can lead to different terms of protection in different countries, even deciding the date before which research is not necessary is problematic. Can we imagine a situation in which a work published in 1870, for example, might still be protected in the U.K. or Canada? That was a subject for many hypotheticals (such as the 15-year-old author who lives to be 95) during our day and a half of discussion, and it illustrates why even the most exacting research in this area still involves some consideration of probability and risk.
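The arithmetic behind the hypothetical above can be sketched in a few lines of Python. This is only an illustration, assuming a simple life-plus-N rule in which protection runs through the end of the calendar year N years after the author's death. The function name and the 50-year comparison term are assumptions made for the example, not a statement of any particular country's law.

```python
def public_domain_from(death_year: int, term_years: int) -> int:
    """First calendar year a work is free of copyright under a
    life-plus-N rule (assumed: protection lasts through December 31
    of death_year + term_years)."""
    return death_year + term_years + 1


# The hypothetical from the discussion: a 15-year-old publishes in 1870
# and lives to be 95.
publication_year = 1870
age_at_publication = 15
age_at_death = 95
death_year = publication_year + (age_at_death - age_at_publication)  # 1950

# Compare two illustrative term lengths (assumed values).
for term in (50, 70):
    print(term, public_domain_from(death_year, term))
```

Under these assumptions, a work published in 1870 would stay protected through 2020 on a 70-year term, which is why no single publication cutoff can safely replace research into the author's date of death.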

Lessons for Librarians

Two lessons for librarians and others concerned about the public domain are implicit in this.

First, it is possible for a book to have different terms, and hence enter the public domain at different times, in different countries. James Joyce’s Ulysses is an obvious example; it passed into the public domain in the U.K. on January 1 of this year because Joyce died over 70 years ago, but is probably still protected in the U.S. HathiTrust is able to deal with this anomaly by restricting access based on geographical IP addresses, and anyone researching the public domain needs to consider it.
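A rough sketch of how per-country access might be decided is shown here. It is not HathiTrust's actual implementation: the rights table, the country codes, and the function name are all hypothetical, and the lookup from IP address to country is assumed to happen elsewhere. The U.K. year follows from Joyce's death in 1941 plus a 70-year term; the U.S. entry is simply left as "not known to be public domain."

```python
from typing import Dict, Optional

# Hypothetical rights record for one volume: the first year it is in the
# public domain in each country, or None if it should be treated as
# still protected there.
ULYSSES_RIGHTS: Dict[str, Optional[int]] = {
    "GB": 1941 + 70 + 1,  # Joyce died in 1941; life-plus-70 ends with 2011
    "US": None,           # probably still protected, so restrict access
}


def can_show_full_text(rights: Dict[str, Optional[int]],
                       country: str, year: int) -> bool:
    """Allow full view only where the volume is known to be public domain.

    Countries missing from the record are treated as restricted, which is
    the cautious default.
    """
    pd_from = rights.get(country)
    return pd_from is not None and year >= pd_from


print(can_show_full_text(ULYSSES_RIGHTS, "GB", 2012))
print(can_show_full_text(ULYSSES_RIGHTS, "US", 2012))
```

Defaulting to "restricted" for unknown countries reflects the point that these determinations always carry some element of probability and risk.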

The second lesson is that the work Hathi is doing with its CRMS project is truly librarians’ work, involving meticulous research and resulting in an important public benefit—greater access to our literary culture. It is being done largely by volunteers at partner libraries, and it is about as worthwhile as anything a librarian can do with a few extra hours in the work day.

Which brings me to the third aspect of the CRMS project, the effort to create a database of rights information uncovered by all this research. The work of the HathiTrust has shifted focus in this regard, from identifying orphan works to collecting and making accessible information about rights holders. This shift is the result of the way the orphan works project unfolded; as researchers set out to establish a negative (that a rights holder could not be located), they were discovering a lot of positive information about who the rights holders were and where they could be contacted. As librarians know, even the negative data about where useful information cannot be found is important to preserve and share. So a database of evidence about potential rights holders is in an early stage at Hathi.

As difficult as public domain determinations can be, this last aspect of the CRMS project is probably the most difficult. Other organizations have tried to create such a registry of information about rights holders without noticeable success. HathiTrust, however, may be in a unique position to make headway; I certainly would not bet against the smart, dedicated people we met during our visit. It would be nice, of course, if this aspect of the project were carried on with the cooperation of groups that represent large sets of rights holders. Instead of getting on board, the Authors Guild chose to sue Hathi, filing what is, in my opinion, a silly lawsuit that does not serve anyone’s interests, including those of the authors that the AG represents. But Hathi is making progress anyway, and it is doing it in a way that should not appear threatening to any legitimate copyright holder; the reality, which focuses on supporting a continuing public interest in books even after the statutory rights have expired, is very different from the picture painted by that ill-advised lawsuit.

Indeed, it is this steady, patient approach that Hathi is taking to the whole issue of rights and access that most impressed me. There was nothing combative or defensive about the staff and volunteers we met who are working on these projects, just a determination to do important work in a careful and accurate way. For both writers who want their books read, and librarians who want to facilitate that reading, HathiTrust remains an important and trustworthy ally. For authors whose books have come out of copyright protection due to age or neglect, Hathi offers a new way to put your writings before the public and to stave off obscurity. For writers and publishers who still hold copyright, Hathi is working to make it easier for you to claim those rights and to grant permission for those uses that you think are in your best interest, something the Authors Guild is not willing or able to facilitate. And for librarians who want to connect readers with just the right book for their needs, HathiTrust promises to be a trusted friend, offering a unique and vital service that moves us toward a richer, more literate, digital future.

About Kevin L. Smith

As Duke University’s first Director of Scholarly Communications, Kevin Smith’s (kevin.l.smith@duke.edu) principal role is to teach and advise faculty, administrators and students about copyright, intellectual property licensing and scholarly publishing. He is a librarian and an attorney (admitted to the bar in Ohio and North Carolina) and also holds a graduate degree in religion from Yale University. Smith serves on Duke’s Intellectual Property Board, Digital Futures Task Force and Open Access Advisory Panel. He is also currently the vice chair of the ACRL’s Scholarly Communications Committee. His highly regarded blog on scholarly communications discusses copyright and publication in academia, and he is a frequent speaker on those topics.

