June 19, 2018

Disappearing Databases

by Carol Tenopir

Just before midnight, September 30, 2002, 19 databases disappeared without a trace from the Dialog online system. The reason? CSA (Cambridge Scientific Abstracts) and Dialog had failed to reach a renewal agreement. On October 1, Dialog searchers received a message “file does not exist” when they entered a CSA file number. A logon message listed the file numbers that were removed. (Admittedly, a notice was sent to subscribers, and a logon warning message ran for a couple of weeks prior to October 1.) This is just the latest case in a disturbing trend of disappearing databases.

No fair deal

CSA is especially strong in materials and environmental sciences, and its departure leaves some notable holes for Dialog searchers. Databases like METADEX, Pollution Abstracts, and ASFA (Aquatic Sciences and Fisheries Abstracts) are widely used. In the last few years, CSA has also added popular social science databases, so users will need to find other means to access Sociological Abstracts, LISA (Library and Information Science Abstracts), and Linguistics and Language Behavior Abstracts. Dialog recommends that librarians substitute Information Science Abstracts and ERIC for LISA.

The missing databases are all on Internet Database Service (IDS), CSA’s own online system, used mostly in academic libraries. Some CSA databases are still available through Ovid, FIZ Karlsruhe’s STN, and OCLC FirstSearch. Most Dialog users will go with FIZ Karlsruhe’s STN because, like Dialog, it offers transactional pricing and is widely used in special libraries. Academic users will likely rely on Ovid, FirstSearch, or IDS.

According to Matt Dunie, president of CSA, the reason CSA files are no longer available through Dialog is that “the two companies…couldn’t come up with a deal we both thought was fair.” He called it a chairman-level decision, saying, “We decided it was time to move in a different direction.” Dunie denied the widespread speculation that Dialog’s lack of broad linking was the cause for CSA’s departure.

CSA pulls out of EBSCOhost

Dunie did affirm that lack of linking is the cause for CSA’s defection from EBSCO Publishing. As of December, CSA files will be pulled because EBSCOhost does not support the broad-based linking that CSA demands.

In September, CSA and its parent, Cambridge Information Group, issued an “open memo to information industry professionals” that said that broad-based linking is essential. Dunie and Cambridge Information Group President James McGinty announced that CSA “is terminating relationships with aggregators and content providers who refuse to allow mutual customers to link to data which they own or license.”

Dunie translated that for me into CSA’s philosophy of linking:

“If I produce abstracting and indexing content that points to full text that a customer subscribes to, the user should be able to go to that full text. Likewise, the user should be able to go to the [abstracting and indexing] database from the full text.”

EBSCO Publishing is the first victim of this basic philosophical difference. According to Dunie, CSA has seen the results of linking and is willing to sacrifice in the short run for benefits in the long term. He believes there are advantages for the content provider, aggregator, and customer. These include increased usage, higher renewal rates, lower costs per search, and a better return on investment for the customer.

Linking benefits

CSA/Cambridge is both a bibliographic database producer and a publisher of primary journals. Linking benefits the company from both sides—better functionality in its bibliographic files and more usage of its full texts. But CSA/Cambridge is also an online service, providing access to its own and others’ databases. The removal of the materials from competing online services may bring more customers to its own online service, but it also forces users to switch online services. Dunie told me CSA has “no intention of terminating all distributors.” I hope this is true.

Most librarians would agree that linking has many benefits. Users want to get the full text quickly and easily when they search a bibliographic database and often need a bibliographic citation when they start with an article. Moving back and forth among sources from many different producers is rapidly becoming the expectation.

The disappearance of valuable content from an aggregator’s service is immediately inconvenient. At the University of Tennessee, Knoxville, our serials budget was already depleted when the CSA announcements were made, so we’ll have to do without valuable content for some time. But if the issue truly is linking, we will all benefit in time. If the concern becomes disaggregation, with each content creator forcing use of its own online service, the benefits disappear.


In the interest of providing access to the most journals with limited funds, libraries sometimes cancel subscriptions if they receive the articles through full-text aggregators. Aggregators provide the same benefits subscription agents always have, plus offer a limited number of interfaces for accessing a wide range of journals.

But what benefits libraries can be disastrous to primary publishers. Sage Publications became the first major scholarly publisher to remove its journals from aggregator services. In July, Sage announced that its journals would no longer be available on EBSCOhost and ProQuest.

According to Carol Richman, director of licensing and electronic publishing, and Blaise Simqu, executive vice president, Sage Publications, this withdrawal is a direct result of library cancellations. They claim that there is a clear relationship between the availability of Sage content in aggregated databases and subscription cancellations. “The royalties earned from the database aggregators is very small, and Sage journals cannot sustain themselves without subscription revenue,” they said. “Content in the aggregated databases literally threatens the long-term future of every journal we publish.”

Both EBSCOhost and ProQuest carried all Sage titles. Other aggregator services, such as Gale and JSTOR, have a few Sage titles still available. Sage is now evaluating those licenses. Third parties that provide full-text access only if a customer has a subscription to the journal are a different story. Sage journals are still available through services of that kind, such as Ingenta and Minerva.

It should come as no surprise that Sage is building its own online full-text service, called Sage Full-Text Collections, and would, of course, like customers to purchase access to this service. Richman and Simqu acknowledge the many comments from librarians who “are understandably concerned about their own collection development and serving their patrons.”

Sage journals won’t disappear from EBSCOhost and ProQuest until December 2003. After that time, ProQuest will continue to provide bibliographic information for articles from Sage publications. Both EBSCOhost and ProQuest will actively add titles from other publishers to make up for the losses.

Newspapers first to opt out

Sage is the first major scholarly publisher to disappear from aggregators, but several newspaper publishers have also recently removed content from aggregator services.

Tribune newspapers, including the Los Angeles Times and Baltimore Sun, were removed (at least in part) from LexisNexis Academic and from Dialog in September, both with less than two weeks’ notice. LexisNexis Academic has removed the full-text archive of these papers, maintaining just a six months’ “rolling” file. Dialog removed the complete files for the Chicago Tribune, Fort Lauderdale Sun Sentinel, Orlando Sentinel, and Newport News Daily Press, among others.

Perhaps the publishers hope to keep online customers to themselves by forcing users to come directly to their web sites or online services. With many academic libraries already subscribing to hundreds of online systems, this could make life difficult.

The desire to keep customers to themselves is not the only reason why newspaper content is disappearing from aggregator services; the 2001 Supreme Court ruling in the Tasini case is another factor.

Tasini and parts of files

When an entire newspaper disappears, at least users know what they aren’t getting. After the Supreme Court ruling in the Tasini case favored freelance authors’ rights to be compensated for electronic republication by publishers and aggregators, online back files or individual articles in newspapers and newsmagazines began to disappear. Gannett and Knight-Ridder both removed back files to examine them for Tasini violations. Factiva listed newspaper files that were no longer available and others that could no longer be considered complete.

Some content providers chose to remove everything that preceded their electronic publishing agreements with authors. Others removed just specific articles and reloaded the somewhat depleted file. Sometimes files reappear; sometimes they disappear without a trace. Users never quite know what they will find from day to day.

Full-text services such as ProQuest, Gale, and LexisNexis try to leave a bibliographic citation in place when a full-text article is removed. The citation may include an abstract, provided the abstract was not written by the article’s author.

Government content gone, too

Government databases are also disappearing. The most publicized case is the U.S. Department of Energy’s (DOE) PubSCIENCE (it may be gone by the time you read this). Although at the end of October it could still be found at its web site, there is no longer a link to it from DOE’s Office of Scientific and Technical Information web site. A warning on the PubSCIENCE site states, “The U.S. Department of Energy proposes to discontinue PubSCIENCE” and “a decision will be announced in the near future.”

Perhaps more telling, the PubSCIENCE web site directs users to two commercial services, which it calls “other scientific resources”: Elsevier Science’s Scirus and Infotrieve Article Finder.

Although the quality of PubSCIENCE has been questioned (see Peter Jacso’s column in Information Today, 10/1/02), it is troubling to have entire major systems disappear. Many librarians have built links to PubSCIENCE or recommended it to their users. Although it had not achieved the size or quality of PubMed, some hoped that PubSCIENCE would eventually reach that stature.

Who loses?

There are many reasons why online content can disappear. Some may be temporary, others permanent. Some hit libraries squarely in their budgets. Occasionally we find other sources or services that may serve just as well.

Still, this year’s rash of disappearances is disturbing. If content providers all decide to go their own way and abandon aggregator services, or if good content disappears without a trace, libraries and their users will be the ultimate losers.

Author Information
Carol Tenopir (ctenopir@utk.edu) is Professor at the School of Information Sciences, University of Tennessee, Knoxville.

