October 23, 2017

Endangered Data Week Highlights Need for Digital Preservation of Government Data

Aiming to raise awareness and maintain momentum for preservation efforts focused on publicly administered data, the inaugural “Endangered Data Week” kicked off on April 17 and ultimately featured more than 50 presentations, panels, and projects in the United States, Spain, and Australia.

The initiative was led by Brandon Locke, digital social science and humanities specialist with Michigan State University’s (MSU) history department, in collaboration with Jason A. Heppler, academic technology specialist for Stanford University’s department of history; Bethany Nowviskie, director of the Council on Library and Information Resources’ (CLIR) Digital Library Federation (DLF); and Wayne Graham, technical director for CLIR.

The project was spurred into action by recent political events—the Trump administration ordering the U.S. Environmental Protection Agency to remove climate change information from its website, and the passage of H.Res.5 by the House of Representatives, excluding changes to the Affordable Care Act from mandatory long-term cost data analysis, to name two.

Locke said that recent Data Rescue events—organized by groups such as DataRefuge and the Environmental Data & Governance Initiative (EDGI) to back up data sets, documents, images, and other content from government websites believed to be vulnerable to takedown by the Trump administration—got him thinking about ways to help. The project was inspired by events such as Banned Books Week and International Open Access Week.

Banned Books Week “shines light on books that are endangered almost entirely for political reasons…. That’s the way that librarians, particularly, have tried to fight against censorship, by having open readings, by putting up displays of books that are often challenged, and by talking about it,” he told LJ.

Locke hopes that Endangered Data Week, like Banned Books Week, will direct attention each year to an issue of ongoing concern for librarians, regardless of the political climate.

“It really goes beyond the Trump administration,” Locke explained. “There’s a lot of public data out there, and there are a lot of ways in which it could be lost, whether it’s political silencing, or budget cuts, or mismanagement. It’s really a matter of rethinking the infrastructure for all of this digital information, be it data or government documents.”

Locke tweeted the idea in early February, stating that “We need Banned Data Week. Like Banned Books [Week], draw attn. to HUD, EPA data being suppressed & encourage data literacy, engaged scholarship.” Respondents began refining the concept almost immediately, offering to collaborate and suggesting “Endangered Data” as a more proactive name that would resonate with archivists, and would encompass the broader variety of issues that can lead to the loss of publicly administered data.

Quick turnaround

Although there’s no official headcount of attendees, the idea clearly resonated within the library community, getting off to a strong start with a varied group of 57 events organized and listed on Endangered Data Week’s site. There was an Endangered Data Week/DataRefuge “dine around” at the Research Data Access & Preservation (RDAP) Summit in Seattle. The University of Virginia Library hosted a special weeklong series of events including workshops and presentations on topics ranging from Web Scraping in R to Cultural Heritage Informatics. The University of Nebraska–Lincoln Libraries hosted a weeklong, student-targeted exhibit on personal digital archiving and an introductory presentation titled “Endangered Data: What is it and how can I help?” Several university libraries hosted viewing parties for a DLF-sponsored webinar on the “Freedom of Information Act, Government Data, and Transparency”; several others used the occasion to raise awareness by hosting data rescue events or letter writing campaigns regarding existing or proposed policies that would have an impact on publicly created data.

“It varied a lot, and we ran this on a really short timeframe, just given the urgency of the issue, and also the fact that the semester was ending [at most academic institutions] and the energy would be gone after that,” Locke said, crediting Nowviskie and Rachel Mattson, leader of DLF’s group on Government Records Transparency/Accountability, with playing a crucial role in getting the word out quickly through their network of members and contacts.

“A lot of people were just coming up with stuff on a couple of weeks’ notice, so it’s really exciting to me that we had as many [affiliated] events as we did, given that short time frame. Here at Michigan State, we had eight events, and that was something we were really dealing with—locking things down, getting abstracts up, promoting it, and then hosting it all in a really short time frame.”

In addition to Nowviskie, Mattson, and other contributors listed above, other key collaborators included Sarah Melton, head of digital scholarship at Boston College (BC); Anna Kijas, senior digital scholarship librarian, BC; Purdom Lindblad, assistant director of innovation and learning at the Maryland Institute for Technology in the Humanities; Kristen Mapes, digital humanities coordinator for MSU’s College of Arts and Letters; and at DLF, Katherine Kim and Becca Quon. DataRefuge, Mozilla Science Lab, the National Digital Stewardship Alliance, and CLIR joined DLF as project sponsors.

Preserving Energy

Many challenges lie ahead for the budding data rescue and endangered data movements. In addition to issues such as storage and ongoing maintenance that would be typical for any digital preservation project, this data exists on sites with established credibility and public familiarity. A K–12 teacher, for example, might direct students to epa.gov/climatechange as a trustworthy source of free information on the topic. Saving and migrating this type of content to domains not affiliated with the government could pose problems with discoverability and raise questions of provenance.

“Those are questions that the Libraries+ Network, particularly, has been working with,” Locke said. “They’re closely associated with DataRefuge…. One of the big steps for the Libraries+ Network is to rethink a sustainable model for backing up or providing alternative access to these types of materials.”

This will be a major, long-term undertaking, and Locke plans for Endangered Data Week to continue next year, drawing attention to these preservation efforts and the work that’s being done.

“There was a lot of energy immediately after the election, but this is something that, again, is not necessarily tied to the Trump administration, and [the need for it] is not going to go away,” Locke said.

About Matt Enis

Matt Enis (menis@mediasourceinc.com; @matthewenis on Twitter) is Senior Editor, Technology for Library Journal.

Comments

  1. Carol Waggoner-Angleton says:

    Locke is correct that there has been an endangered data problem for a while. However, if the beginnings of the Trump administration were what spurred this effort, there needs to be some honest soul searching about motivation. We have seen the government erect paywalls around public data with the sale of Statistical Abstracts to EBSCO, restrict access to data, i.e., the shutdown of the American Memory sites in the last governmental “shutdown,” and make efforts to present truncated data in the discussions about the collection of the next census data. While we know that paper materials are much easier to preserve than electronic, many of our libraries, especially academic libraries with government repository status, are headed by directors with an “all government publications are online” mentality. This attitude threatens materials created before 1996, since we don’t have good information on how much of the backlog has been digitized.
