June 18, 2018

ARL Launches Library-Led Solution to Federal Open Access Requirements

The Association of Research Libraries (ARL), in cooperation with the Association of American Universities (AAU) and the Association of Public and Land-Grant Universities (APLU), offered a proposed solution to the open access mandate of the recent Office of Science and Technology Policy (OSTP) memo. The plan, called the Shared Access Research Ecosystem (SHARE), posits a network of cross-institutional digital repositories based in research universities as the digital home for both the finished papers and the underlying data sets resulting from research produced with federal funds.

If CHORUS, the publisher-based alternative offered by the Association of American Publishers (AAP) last week, stakes its claim on publishers’ expertise and existing platforms for paper delivery, SHARE bases the universities’ claim on their own ownership of key pieces of the existing infrastructure, such as digital institutional repositories, Internet2, and the Digital Preservation Network (DPN), and on the digital data management plans many universities already have in development. (This is potentially a competitive advantage, since the CHORUS plan left the data portion of the mandate unaddressed, at least so far.) Prue Adler, associate executive director of ARL, also pointed to substantial investments by ARL and member universities in workforce development around handling data over the past two years.

John C. Vaughn, AAU’s executive vice president, told the Chronicle of Higher Education, “If we’re going to be building these repositories anyway and want to interconnect them for our own purposes, we’ve got the framework of a system that could manage the content and provide the access that the OSTP directive is calling for.”

How It Would Work

In essence, the SHARE plan works by “adopting a common, brief set of metadata requirements and exposing that metadata to search engines and other discovery tools” to create a “federated, consensus-based system” of existing university-based digital repositories. Like CHORUS, this positions SHARE to claim cost savings (over creating a central digital repository). Discipline-based repositories would be included too, whether they’re housed at universities or not, and agencies could still choose to develop their own or work with PubMed Central or other existing repositories by adopting the same metadata fields and practices to become a linked node.

The minimum standard metadata fields would include author, article title, journal title, abstract, award number, principal investigator ID (ORCID or ISNI), and designated repository number. More could be added over time. The award number essentially mirrors the work of FundRef in the CHORUS plan, tying together funding awards with research output, including data and data management plans, as well as publications.
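
As a rough illustration of what such a minimum record might look like, here is a minimal Python sketch. The class name, field names, and sample values are hypothetical and are not drawn from the SHARE proposal; only the list of fields comes from the plan as described above.

```python
from dataclasses import dataclass, fields


@dataclass
class ShareRecord:
    """Hypothetical record holding the minimum fields named in the SHARE plan."""
    author: str
    article_title: str
    journal_title: str
    abstract: str
    award_number: str
    principal_investigator_id: str      # an ORCID or ISNI identifier
    designated_repository_number: str


def missing_fields(record: ShareRecord) -> list[str]:
    """Return the names of any minimum fields left empty or blank."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Placeholder values only; none of these identify a real paper or award.
example = ShareRecord(
    author="A. Researcher",
    article_title="An Example Article",
    journal_title="Journal of Examples",
    abstract="",                         # deliberately left blank
    award_number="AWARD-0000",
    principal_investigator_id="0000-0000-0000-0000",
    designated_repository_number="REPO-0000",
)

print(missing_fields(example))
```

A repository could run a check of this kind before exposing a record to search engines, so that incomplete records are caught at deposit time rather than at discovery time.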

SHARE would also require information on copyright license, designated repository, and preservation rights to be supplied in a machine-readable format. (Preservation rights would also include what would happen to the final published version if a publisher goes out of business.)
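
The proposal as described here does not say what form "machine-readable" would take. One possible shape, sketched in Python purely as an assumption, is a small JSON document; the key names and values below are invented for illustration and are not part of SHARE.

```python
import json

# Hypothetical rights statement; keys and values are illustrative placeholders.
rights = {
    "copyright_license": "CC-BY",
    "designated_repository": "example-university-ir",
    "preservation_rights": {
        "repository_may_preserve": True,
        # Placeholder for whatever the parties agree should happen
        # to the final published version if the publisher closes.
        "if_publisher_ceases": "repository-retains-published-version",
    },
}


def to_machine_readable(info: dict) -> str:
    """Serialize the rights statement as JSON with stable key order."""
    return json.dumps(info, sort_keys=True, indent=2)


def from_machine_readable(text: str) -> dict:
    """Parse a JSON rights statement back into a dictionary."""
    return json.loads(text)


serialized = to_machine_readable(rights)
print(serialized)
```

Any structured format that harvesters can parse without human interpretation would serve the same purpose; JSON is used here only because it is easy to read.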

Because not all universities that take federal research funds have their own repository, those that don’t would be able to designate a repository to hold their research. (For the rare state whose university does not have a suitable repository, an institution will partner with another state-funded university, to make sure SHARE is functional for all principal investigators when the federal agency policies go into effect.)

SHARE would have a policy advisory board with representatives of all stakeholders, including federal agencies, to ensure interoperability and provide a single point of contact for the agencies.

The SHARE proposal document says that workflow can be fully automated using existing protocols. If adopted, it would roll out in a four-stage process, but would be operational for article deposit and access as soon as the first phase is completed, which is projected to be within 12 to 18 months. Phase two would be completed six to 12 months after phase one. Much of the deeper functionality, including text and data mining, APIs, and open annotation, comes in phase three, and linked data in phase four. Neither of those phases has a time frame yet, but Adler noted that after the agency review process, the agencies may require some of that functionality to be moved up to day one.

[Figure: Phases of SHARE]

More than we can chew?

Dorothea Salo, an LJ columnist who ran institutional repositories before becoming an LIS educator, expressed skepticism that SHARE could realize even “a relatively simple networking of the existing ragtag gaggle of institutional repositories,” let alone “a highly complex re-visioning of how the entire research academy deals with digital materials,” on its proposed timeframe, given current IR software and staffing. “I’m very concerned that ARL has run a good way ahead of library reality with SHARE,” she told LJ.

While the cumulative investment in repositories may be considerable, she said, “I know hardly any individual IRs, even in research libraries, with more than three dedicated FTE, and a great many get by on one FTE or less,” a level of investment which she felt was not sufficient to manage SHARE as outlined, particularly since “just the supposed Phase One desiderata are well beyond a lot of IRs because of their software.” Salo called out SWORD, OpenURL, and copyright licensing as particular pain points, and said, “These and other ‘Phase One’ technological lacunae will be fixed in twelve to eighteen months, according to SHARE? I want to think that’s possible. I just can’t, especially given that a lot of IRs are not well-known or well-regarded enough in their libraries to be likely targets of additional investment.”

SHARE versus CHORUS, or both?

The draft SHARE document claims that, compared to “the current publishing structure,” SHARE “fully embodies the spirit of the OSTP directive” in maximizing access to federally funded research. It situates that claim in a business case for maximizing the value of funding.

AAU’s Vaughn told the Chronicle that publishers and universities are likely to disagree about how much should be made available in a federated system and what could be done with that material. “I think we have to keep working with publishers,” he said (adding that many colleagues did not agree), “but I think publishers need to change the way they operate to become a trusted partner in that system.”

Adler, too, told LJ that working with publishers is very much a possibility. “I don’t see it as an either/or, because I can very much see [CHORUS] partnering within SHARE if they choose to,” she said. But if agencies feel they must choose one or the other, she makes a key point: agencies’ available incentive to induce compliance is via their control of funding, which has a direct impact on the principal investigators and their universities, not the publishers. So if CHORUS fails to meet the requirements, such as for text mining access or for the embargo period, “it puts the PI at risk; there is no leverage that the agency has over the publisher.” She also pointed out that such conflicts have already arisen in the context of the preexisting NIH requirement: “there has been a lot of friction there around deposit issues.”

Although both proposals are preliminary and still open to amendment and elaboration, it is notable that the SHARE document is substantially more detailed than the information offered by CHORUS so far, and more open about how those details can be shared as well. More detailed information about SHARE is offered on infoDOCKET.com, as well as on the ARL website; LJ was asked not to reproduce the CHORUS fact sheet verbatim.

SHARE also offers suggestions for agency guidelines which reflect core library preoccupations: stimulating the development of new tools and services and making sure no entity or group secures rights that would inhibit or prohibit the access and use of them; providing access without login, credentialing, or individual tracking; access for persons with disabilities; and bulk downloads.

If a key point in CHORUS is that publishers would shoulder the cost, payment did not come up in the SHARE proposal. But when asked, ARL’s Adler told LJ that higher education institutions would be prepared to step up and pay the difference as well, and stressed that the additional cost over existing investment would be manageable. “A key part here is, since 2004 we have been assisting campuses with compliance with the NIH policy… Given that we’re looking at approximately 50 percent more articles than NIH, with libraries and higher ed already investing in the IRs and all these other strategies, it will be an incremental cost on top” of what institutions are already spending, she said.

What’s Next?

Internally, the next steps for SHARE involve being considered by the executive committees of ARL and AAU on June 18, and APLU on June 20. At that time, if the organizations give it the green light to go further, “that would entail doing a much deeper dive into the technology…and also go into governance and financial issues,” Adler explained.

Externally, SHARE has met with National Science Foundation representatives, and it is now up to the White House to decide whether to meet with ARL, AAU, and APLU to discuss the proposal further.

However, Adler makes clear that she plans to move ahead with SHARE whether or not the agencies decide this is a workable solution to the OSTP mandate, because the institutions themselves “have to have access at this level… in order to do this kind of research [text mining and computational analysis].”

About Meredith Schwartz

Meredith Schwartz (mschwartz@mediasourceinc.com) is Executive Editor of Library Journal.




  1. Institutions and their libraries should be working on implementing the federal funder Green OA self-archiving mandates as well as on adopting complementary institutional Green OA self-archiving mandates for all of their research output, not just the federally funded fraction.

    But on no account make common cause with CHORUS. Make sure everything is deposited in your institutional repositories and then interoperability, federation, harvesting, import, export will all take care of themselves quite naturally.

    But not if you get in league with publishers, whose interest is in delaying access and retaining their hold on the infrastructure hosting the content.

    Federation is fine, but a red herring if it distracts libraries from the need to adopt and implement effective funder and institutional deposit mandates and tempts them to accept the publishers’ Trojan Horse.

    Harnad, S. (2013) “CHORUS”: Yet Another Trojan Horse from the Publishing Industry. Open Access Archivangelism 1009

  2. In principle I like the approach that SHARE is taking, that of leveraging the existing network of institutional repositories, and the amazingly decentralized thing that is the Internet and the World Wide Web. Simply getting article content out on the Web, where it can be crawled, as Harnad suggests, has bootstrapped incredibly useful services like Google Scholar. Scholar works with the Web we have, not some future Web where we all share metadata perfectly using formats that will be preserved for the ages. They don’t use OpenURL, OAI-ORE, SWORD, etc. They do have lots o’ crawlers, and some magical PDF parsing code that can locate citations. I would like to see a plan that’s a bit scruffier and less neat.

    Like Dorothea, I have big doubts about building what looks to be a centralized system that will then push out to IRs using SWORD, and support some kind of federated search with OpenURL. Most IRs seem more like research experiments than real applications oriented around access that could sustain the kind of usage you might see if mainstream media or a MOOC happened to reference their content. Rather than a 4 phase plan with digital library acronym soup, I’d rather see some very simple things that could be done to make sure that federally funded research *is* deposited in an IR, and that it can be traced back to the grant that funded it. Of course, I can’t resist throwing out a straw man.

    Requiring funding agencies to have a URL for each grant, which can be used in IRs seems like it would be the first logical step. Pinging that URL (kind of like a trackback) when there is a resource (article, dataset, etc) associated with the grant would allow the granting institution to know when something was published that referenced that URL. The granting organization could then look at its grants and see which ones lacked a deposit, and follow up with the grantees. They could also examine pingbacks to see which ones are legit or not. Perhaps further on down the line these resources could be integrated into web archiving efforts, but I digress.

    There would probably be a bit of curation of these pingbacks, but nothing a big Federal Agency can’t handle, right? I think putting data curation first, instead of last, as the icing on the 4 phase cake is important. I don’t underestimate the challenge in requiring a URL for every grant; perhaps some agencies already have them. I think this would put the onus on the Federal agencies to make this work, rather than the publishers (who, like it or not, have a commercial incentive to not make it too easy to provide open access) and universities (who must have a way of referencing grants if any of their plan is to work). This would be putting Linked Data first, rather than as sprinkles on the cake.

    Sorry if this comes off as a bit ranty or incomprehensible. I wish Aaron were here to help guide us… It is truly remarkable that the OSTP memo was issued, and that we have seen responses from the ARL and the AAP. I hope we’ll see responses from the federal agencies that the memo was actually directed at.