The Association of Research Libraries (ARL), in cooperation with the Association of American Universities (AAU) and the Association of Public and Land-Grant Universities (APLU), offered a proposed solution to the open access mandate of the recent Office of Science and Technology Policy (OSTP) memo. The plan, called the Shared Access Research Ecosystem (SHARE), posits a network of cross-institutional digital repositories based in research universities as the digital home for both the finished papers and the underlying data sets resulting from research produced with federal funds.
If CHORUS, the publisher-based alternative offered by the Association of American Publishers (AAP) last week, stakes its claim on publishers’ expertise and existing platforms for paper delivery, SHARE bases the universities’ claim on their own ownership of key pieces of the existing infrastructure, such as digital institutional repositories, Internet2, and the Digital Preservation Network (DPN), and on the digital data management plans many universities already have in development. (This is potentially a competitive advantage, since the CHORUS plan left the data portion of the mandate unaddressed, at least so far.) Prue Adler, associate executive director of ARL, also pointed to substantial investments by ARL and member universities in workforce development around handling data over the past two years.
John C. Vaughn, AAU’s executive vice president, told the Chronicle of Higher Education, “If we’re going to be building these repositories anyway and want to interconnect them for our own purposes, we’ve got the framework of a system that could manage the content and provide the access that the OSTP directive is calling for.”
How It Would Work
In essence, the SHARE plan works by “adopting a common, brief set of metadata requirements and exposing that metadata to search engines and other discovery tools” to create a “federated, consensus-based system” of existing university-based digital repositories. As with CHORUS, this approach positions SHARE to claim cost savings (over creating a central digital repository). Discipline-based repositories would be included too, whether they’re housed at universities or not, and agencies could still choose to develop their own or work with PubMed Central or other existing repositories by adopting the same metadata fields and practices to become a linked node.
The minimum standard metadata fields would include author, article title, journal title, abstract, award number, principal investigator ID (ORCID or ISNI), and designated repository number. More could be added over time. The award number essentially mirrors the work of FundRef in the CHORUS plan, tying together funding awards with research output, including data and data management plans, as well as publications.
SHARE would also require information on copyright license, designated repository, and preservation rights to be supplied in a machine-readable format. (Preservation rights would also include what would happen to the final published version if a publisher goes out of business.)
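As an illustration only, the minimum fields and the machine-readable rights information described above could be modeled as a simple record. This is a minimal sketch, not SHARE's actual schema: the class name `ShareRecord`, the field names, the helper functions, and the choice of JSON as the machine-readable format are all assumptions made for the example, since the proposal as described here names the fields but not an encoding.

```python
import json
from dataclasses import dataclass, asdict, fields


@dataclass
class ShareRecord:
    """Hypothetical record holding the fields the SHARE proposal lists."""
    # Minimum standard metadata fields named in the proposal
    author: str = ""
    article_title: str = ""
    journal_title: str = ""
    abstract: str = ""
    award_number: str = ""          # ties the output to a funding award
    pi_id: str = ""                 # principal investigator ID (ORCID or ISNI)
    repository_number: str = ""     # designated repository number
    # Rights information SHARE would require in machine-readable form
    copyright_license: str = ""
    designated_repository: str = ""
    preservation_rights: str = ""


def missing_fields(record: ShareRecord) -> list:
    """Return the names of fields that are empty or whitespace-only."""
    return [f.name for f in fields(record)
            if not getattr(record, f.name).strip()]


def to_json(record: ShareRecord) -> str:
    """Serialize the record to JSON (an assumed machine-readable format)."""
    return json.dumps(asdict(record), indent=2, sort_keys=True)


if __name__ == "__main__":
    # Placeholder values for illustration; not real data.
    draft = ShareRecord(author="A. Researcher",
                        article_title="An Example Article",
                        journal_title="Journal of Examples")
    print("Still missing:", missing_fields(draft))
    print(to_json(draft))
```

A repository acting as a linked node could run a check like `missing_fields` at deposit time, before exposing the record to search engines and other discovery tools.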
Because not all universities that take federal research funds have their own repository, those that don’t would be able to designate a repository to hold their research. (For the rare state whose university does not have a suitable repository, an institution will partner with another state-funded university, to make sure SHARE is functional for all principal investigators when the federal agency policies go into effect.)
SHARE would have a policy advisory board including representatives of all stakeholders, including federal agencies, to ensure interoperability and a single point of contact for the agencies.
The SHARE proposal document says that workflow can be fully automated using existing protocols. If adopted, it would roll out in a four-stage process, but would be operational for article deposit and access as soon as the first phase is completed, which is projected to be within 12 to 18 months. Phase two would be completed six to 12 months after phase one. Much of the deeper functionality, including text and data mining, APIs, and open annotation, comes in phase three, and linked data in phase four. Neither of those phases yet has a time frame, but Adler noted that the agencies may require some of that functionality to be moved up to day one, after the agency review process.
More than we can chew?
Dorothea Salo, an LJ columnist who ran institutional repositories before becoming an LIS educator, expressed skepticism that SHARE could realize even “a relatively simple networking of the existing ragtag gaggle of institutional repositories,” let alone “a highly complex re-visioning of how the entire research academy deals with digital materials,” on its proposed timeframe, given current IR software and staffing. “I’m very concerned that ARL has run a good way ahead of library reality with SHARE,” she told LJ.
While the cumulative investment in repositories may be considerable, she said, “I know hardly any individual IRs, even in research libraries, with more than three dedicated FTE, and a great many get by on one FTE or less,” a level of investment that she felt was not sufficient to manage SHARE as outlined, particularly since “just the supposed Phase One desiderata are well beyond a lot of IRs because of their software.” Salo called out SWORD, OpenURL, and copyright licensing as particular pain points, and said, “These and other ‘Phase One’ technological lacunae will be fixed in twelve to eighteen months, according to SHARE? I want to think that’s possible. I just can’t, especially given that a lot of IRs are not well-known or well-regarded enough in their libraries to be likely targets of additional investment.”
SHARE versus CHORUS, or both?
The draft SHARE document claims that, compared to “the current publishing structure,” it “fully embodies the spirit of the OSTP directive” in maximizing access to federally funded research. And it situates that claim in a business case for maximizing the value of funding.
AAU’s Vaughn told the Chronicle that publishers and universities are likely to disagree about how much should be made available in a federated system and what could be done with that material. “I think we have to keep working with publishers,” he said (though he added that many colleagues did not agree), “but I think publishers need to change the way they operate to become a trusted partner in that system.”
Adler, too, told LJ that working with publishers is very much a possibility. “I don’t see it as an either/or, because I can very much see [CHORUS] partnering within SHARE if they choose to,” she said. But if agencies feel they must choose one or the other, she makes a key point: agencies’ available incentive to induce compliance is via their control of funding, which has a direct impact on the principal investigators and their universities, not the publishers. So if CHORUS fails to meet the requirements, such as for text mining access or for the embargo period, “it puts the PI at risk; there is no leverage that the agency has over the publisher.” She also pointed out that such conflicts have already arisen in the context of the preexisting NIH requirement: “there has been a lot of friction there around deposit issues.”
Although both proposals are preliminary and still open to amendment and elaboration, it is notable that the SHARE document is substantially more detailed than the information offered by CHORUS so far, and more open about how those details can be shared as well. More detailed information about SHARE is offered on infoDOCKET.com, as well as on the ARL website; LJ was asked not to reproduce the CHORUS fact sheet verbatim.
SHARE also offers suggestions for agency guidelines which reflect core library preoccupations: stimulating the development of new tools and services and making sure no entity or group secures rights that would inhibit or prohibit the access and use of them; providing access without login, credentialing, or individual tracking; access for persons with disabilities; and bulk downloads.
If a key point in CHORUS is that publishers would shoulder the cost, payment did not come up in the SHARE proposal. But when asked, ARL’s Adler told LJ that higher education institutions would be prepared to step up and pay the difference as well, and stressed that the additional cost over existing investment would be manageable. “A key part here is, since 2004 we have been assisting campuses with compliance with the NIH policy… Given that we’re looking at approximately 50 percent more articles than NIH, with libraries and higher ed already investing in the IRs and all these other strategies, it will be an incremental cost on top” of what institutions are already spending, she said.
Internally, the next steps for SHARE involve being considered by the executive committees of ARL and AAU on June 18, and APLU on June 20. At that time, if the organizations give it the green light to go further, “that would entail doing a much deeper dive into the technology…and also go into governance and financial issues,” Adler explained.
Externally, SHARE has met with National Science Foundation representatives, and it is now up to the White House whether it wants to meet with ARL, AAU, and APLU to discuss the proposal further.
However, Adler makes clear that she plans to move ahead with SHARE whether or not the agencies decide this is a workable solution to the OSTP mandate, because the institutions themselves “have to have access at this level… in order to do this kind of research [text mining and computational analysis].”