By putting distribution and storage of papers and datasets in the hands of their authors, Academic Torrents brings even more DIY ethos to the world of academic publishing, and may help solve a few problems in the field in the bargain. While libraries and colleges disintermediate scholarly publishing by hosting their own institutional repositories and backing up to offsite services like LOCKSS and Portico, Academic Torrents goes a step further: it lets researchers distribute the hosting of their papers and datasets among authors and readers, providing easy access to scholarly works while simultaneously backing them up on computers around the world.
The brainchild of Joseph Paul Cohen and Henry Lo, graduate students in the computer science department at the University of Massachusetts-Boston (UMB), Academic Torrents puts torrenting technology—long employed by users of sites like The Pirate Bay to download music and movies—to work making scholarly content more easily available while also offering authors, libraries, and other common hosts a new way to store their work that doesn’t take up server space.
The common single-server model, in which a link points to the download for a file stored on one server somewhere, is simple but far from ideal, according to Cohen. “You have a single point of failure because the storage is in just one place,” Cohen told Library Journal. “And the limited bandwidth of one server means that model doesn’t scale well.” If only one person is downloading from a single server, for example, the server’s full bandwidth is available for a fast download. If 100 people are downloading from that link, though, each gets only 1 percent of the available bandwidth, slowing the process.
Torrents differ from direct downloads in that they use “seeds” of a file drawn from computers around the Internet, each of which is hosting an identical copy of the file in question. Rather than directly downloading a file from one host computer, torrent services let interested parties download bits from each copy, or “mirror,” of the file, which are reassembled on the downloading computer by torrent clients like uTorrent. “People become mirrors for the file while they’re downloading it,” Cohen said. “The more people are downloading something, the faster the download speed will be.”
To host material on Academic Torrents, authors each keep a copy of their paper or data on a desktop computer. People interested in reading the paper or studying the dataset can then download the torrent, pulling data from each copy. Large datasets, which can be a problem for information hosting services, are where torrents offer a real opportunity to distribute the load: a team of people can each host a portion of the data as a seed, allowing users to download parts from each host and then assemble them into a whole.
That kind of distributed hosting makes Academic Torrents a good place for libraries to participate by hosting data, without having to worry that they’re the only source for it. “Libraries can … host papers from their own campus without becoming the only source of the data,” says a statement on the Academic Torrents website. “So even if a library’s system is broken, other universities can participate in getting that data into the hands of researchers.”
Of course, allowing researchers to share whatever they want, from datasets to published articles, over the service opens the door to some of the same problems that have plagued more general interest torrent sites—namely, that users may begin sharing items they lack the publishing rights to. The site is careful to note that uploaders must have the legal right to share and distribute files they make available over the service. A warning to that effect is emblazoned on the upload page, and users must note what use license—General Public License, Creative Commons, etc.—the files are uploaded under. Cohen told LJ that he and Lo are interested in being a repository for information, not a home for illegal downloads, and intend to comply with reasonable takedown requests from copyright holders, though they haven’t had to deal with such problems in the site’s brief history.
In the same form where they note the work’s license, authors and article hosts note the title of the paper or dataset and provide a brief abstract or description, tags to improve searchability, and links to where users can download the file directly if it is hosted elsewhere. Those direct downloads are still simpler for most individual papers; the service is geared toward improving the speed and reliability of delivering datasets. To date, Academic Torrents is hosting 1.68 terabytes (TB) of material from researchers and organizations around the globe, ranging from the official offline edition of Wikipedia to 2011 weather data collected by the National Oceanic and Atmospheric Administration (NOAA).