Pirate Website Unlocks Access to Scholarly Literature

Science on the Hill

Saint Joseph's University

Landmark Americana City Line

6:00 PM Thursday, January 18, 2018

Online at slides.com/dhimmel/sju

Greene Lab

I'm a data scientist

http://www.greenelab.com/

Event details:

Dr. Daniel Himmelstein, a data scientist currently working out of the University of Pennsylvania's Greene Lab, will be sharing his story about breaking down the "toll access" publication model, which prevents the public from accessing academic articles. Using computer and data science, Dr. Himmelstein will show you how websites like Sci-Hub are already making primary science articles easier to access than ever before! Join us Thursday January 18th 6:00-8:00PM at Landmark Americana for some delicious food, inspiring science, and post-holiday merriment!

Background: Most academic research is funded by the public or private philanthropy. Nonetheless, even in 2017, the majority of new scholarly articles are paywalled. However, a controversial alternative to purchasing article access is emerging. The website Sci-Hub provides access to scholarly literature via fulltext PDF downloads. The site enables users to access articles that would otherwise be paywalled. However, Sci-Hub's hosting of articles is often copyright infringement, and two publishers have already won suits against Sci-Hub.

In March, Sci-Hub tweeted the identifiers (DOIs) for all articles in their repository. By integrating this dataset with a catalog of scholarly literature, we assessed Sci-Hub's coverage and found that Sci-Hub contains 86% of articles in toll-access journals. This number rises to 96% for recently-cited articles.

We suggest the ubiquity of Sci-Hub will disrupt scholarly publishing. Specifically, toll access publishing will no longer be a viable business model. We provide evidence that the transition is already underway and urge scholars to adopt libre open access as an alternative. This study was performed openly online at https://github.com/greenelab/scihub and can be read at https://doi.org/b9s5.

Himmelstein DS, Romero AR, McLaughlin SR, Greshake Tzovaras B, Greene CS. (2017) Sci-Hub provides access to nearly all scholarly literature. PeerJ Preprints DOI: 10.7287/peerj.preprints.3100

Sci-Hub is available at:

  • https://sci-hub.hk
    Hong Kong
  • https://sci-hub.la
    Laos
  • https://sci-hub.mn
    Mongolia
  • https://sci-hub.name
    Generic
  • https://sci-hub.tv
    Polynesian island nation of Tuvalu
  • https://sci-hub.tw
    Taiwan
  • scihub22266oqcxt.onion
    Tor Hidden Service (dark web)

Sci-Hub is available at:

  • https://sci-hub.cc
    Territory of Cocos (Keeling) Islands
  • https://sci-hub.io
    British Indian Ocean Territory
  • https://sci-hub.ac
    Saint Helena, Ascension and Tristan da Cunha
  • https://sci-hub.bz
    Belize
  • scihub22266oqcxt.onion
    Tor Hidden Service (dark web)

Ⓐ 2011-09-05: created by Alexandra Elbakyan, the Sci-Hub website goes live


2013-03-20: Sci-Hub switches to using LibGen as a repository to cache articles.

Ⓑ 2015-01-04: LibGen domain name registrations expire after site administrator dies from cancer.

Ⓒ 2015-06-03: Elsevier files a civil suit against Sci-Hub and LibGen in the U.S. District Court for the Southern District of New York.


2015-10-30: Elsevier is granted a preliminary injunction to suspend domain names. Bye sci-hub.org

2016-02-10: “Meet the Robin Hood of Science” by Simon Oxenham

The New York Times:

Should All Research Papers Be Free?

Alexandra Elbakyan

Ⓕ 2016-04-29: “Who’s downloading pirated papers? Everyone” by John Bohannon in Science

https://doi.org/bf37

Ⓗ 2017-06-21: Elsevier wins a default judgement ordering defendants to pay Elsevier $15 million.

Representative work #28

Ⓘ 2017-06-23: The American Chemical Society files suit against Sci-Hub in the Eastern District of Virginia.

Ⓚ 2017-09-05: Sci-Hub blocks access to Russian IP addresses due to disputes with the scientific establishment.

Idiogramma elbakyanae

2017-11-03: ACS wins suit against Sci-Hub

  • Ordered that any person or entity in active concert or participation with Defendant Sci-Hub and with notice of the injunction, including any Internet search engines, web hosting and Internet service providers, domain name registrars, and domain name registries, cease facilitating access to any or all domain names and websites through which Sci-Hub engages in unlawful access to, use, reproduction, and distribution of ACS’s trademarks or copyrighted works.
  • The Computer and Communications Industry Association (CCIA) filed an amicus brief (rejected) regarding the suit’s targeting of "Neutral Service Providers"
  • ACS Mission: To advance the broader chemistry enterprise and its practitioners for the benefit of Earth and its people.

Ⓛ December 2017: Search interest spikes as domains are suspended after ACS judgement.

  • https://github.com/greenelab/scihub
  • https://github.com/greenelab/scihub-manuscript
  • https://github.com/greenelab/crossref
  • https://github.com/dhimmel/scopus
  • https://github.com/greenelab/scihub-browser-data

But what scholarly articles are not in Sci-Hub?

  • There are 10 DOI Registration Agencies
  • Crossref has registered 67% of all DOIs in existence
  • In March 2015, 99.9% of English Wikipedia DOI links were registered via Crossref
  • 90% of newly published articles in the sciences have DOIs
  • Catalog of 87,542,370 DOIs
  • cAsE InSENSITive
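Because DOIs are case insensitive, two catalogs can only be joined reliably after the identifiers are normalized. The following is a minimal sketch of one way to do that, not the study's actual code: the function name `normalize_doi` and the list of stripped prefixes are assumptions made for illustration.

```python
def normalize_doi(doi: str) -> str:
    """Return a DOI in lowercase with common resolver prefixes removed.

    Hypothetical helper for illustration; the prefix list is an assumption.
    """
    doi = doi.strip()
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        # Compare in lowercase so "DOI:" and "doi:" are treated alike.
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    return doi.lower()


# Three spellings that refer to only two distinct articles.
raw_dois = [
    "10.7287/PeerJ.Preprints.3100",
    "doi:10.7287/peerj.preprints.3100",
    "https://doi.org/10.1101/142760",
]
catalog = {normalize_doi(doi) for doi in raw_dois}
```

Storing the normalized form in a set collapses differently cased spellings of the same DOI into a single entry.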

Metadata for porn from the Entertainment Identifier Registry

Study at https://doi.org/b9s5

49% of 2.8 million articles

85% of 54 million articles

Currently, the Sci-Hub does not store books, for books users are redirected to LibGen, but not for research papers. In future, I also want to expand the Sci-Hub repository and add books too.

Elbakyan (2017)

Data from "The State of OA" Study https://doi.org/gbqtxd


  • Extracted DOI citations from OpenCitations
  • Recent studies (since 2015) had 6,252,279 outgoing citations to articles in toll access journals
  • 96.2% in Sci-Hub
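The citation-weighted coverage figure above is, in essence, the share of outgoing citations whose target DOI appears in Sci-Hub's list. A rough sketch of that calculation follows, using invented toy DOIs under the `10.1000/` prefix; the function name `citation_coverage` is hypothetical and this is not the code from the study's repository.

```python
def citation_coverage(cited_dois, scihub_dois):
    """Fraction of citations whose cited DOI is present in Sci-Hub.

    cited_dois may repeat a DOI: each repeat is a separate citation,
    so heavily cited articles carry more weight.
    """
    scihub = {doi.lower() for doi in scihub_dois}  # DOIs are case insensitive
    cited = [doi.lower() for doi in cited_dois]
    if not cited:
        return 0.0
    covered = sum(1 for doi in cited if doi in scihub)
    return covered / len(cited)


# Toy data (made up): four citations, three of which point to covered articles.
toy_citations = ["10.1000/a", "10.1000/A", "10.1000/b", "10.1000/c"]
toy_scihub = {"10.1000/a", "10.1000/b"}
toy_coverage = citation_coverage(toy_citations, toy_scihub)
```

Counting per citation rather than per unique article is what lets frequently cited papers pull the figure upward.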

Coverage of cited articles

https://github.com/greenelab/library-access

How do oaDOI & Sci-Hub compare with the access provided by the University of Pennsylvania?

Jacob Levernier

Monthly Bitcoin Donations

As of December 31, 2017:

  • Three known bitcoin addresses
  • Received 1,232 donations, totaling ₿94.494
  • Worth $69,224 US at time of donation
  • Worth $421,272 US at time of withdrawal, with ₿9.027 remaining
  • Sci-Hub tweeted: “the information on donations … is not very accurate, but I cannot correct it: that is confidential.”

While this study had a number of interesting aspects, its virtual lack of success as a tool for reducing the library's journal budget was largely due to the fact that the overall problem was seen by everyone concerned as a library problem. As such, the only solution available to the library in 1981 was to use monograph and binding funds to help offset the shortfall in the serials and journals budget. While the biology and chemistry libraries were spared drastic cuts because of very generous support from divisional funds, Caltech's engineering libraries were extremely hard hit, and only now after nearly seven years have they recovered (just in time for the current crisis). It should be pointed out here that from 1974 to 1983 the materials budgets for the departmental libraries were the responsibility of appropriate divisions.

Serials Crisis

Dana Roth (1990) "The Serials Crisis Revisited"

The Serials Librarian. https://doi.org/dvwb7f


Source: Association of Research Libraries. Expenditure Trends in ARL Libraries, 1986–2015

Prices 1986–2015

  1. Inflation: 118%
  2. Library expenditures: 197%
  3. Journal subscriptions: 521%
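To put these figures on a common footing, one can deflate each nominal increase by inflation. The sketch below assumes the three percentages are cumulative increases over 1986–2015 (an interpretation, not something the slide states); the function name `real_increase` is hypothetical.

```python
def real_increase(nominal_pct: float, inflation_pct: float) -> float:
    """Inflation-adjusted percent increase, given cumulative percent increases."""
    nominal_factor = 1 + nominal_pct / 100      # e.g. +521% means 6.21x
    inflation_factor = 1 + inflation_pct / 100  # e.g. +118% means 2.18x
    return (nominal_factor / inflation_factor - 1) * 100


# Assumption: 118%, 197%, and 521% are cumulative increases since 1986.
subscriptions_real = real_increase(521, 118)
expenditures_real = real_increase(197, 118)
print(f"Journal subscriptions, real increase: {subscriptions_real:.1f}%")
print(f"Library expenditures, real increase: {expenditures_real:.1f}%")
```

Under that assumption, subscription spending grew by roughly 185% in real terms, against roughly 36% for overall library expenditures.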

Libre Open Access

Headlines:

  • Science: Sci-Hub’s cache of pirated papers is so big, subscription journals are doomed, data analyst suggests
  • Inside Higher Ed: Inevitably Open
  • Quartz: A pirating service for academic journal articles could bring down the whole establishment

https://doi.org/b9s5

Sci-Hub ⇒ open scholarly literature?

What library will continue to subscribe if a growing proportion of articles is available for free elsewhere?
Tom Reller (2013) Vice President, Elsevier

Defendants’ actions also threaten imminent irreparable harm to Elsevier because it appears that the Library Genesis Project repository may be approaching (or will eventually approach) a level of “completeness” where it can serve as a functionally equivalent, although patently illegal, replacement for ScienceDirect.

DeMarco, Hirschberg & Sen (2015) Attorneys for Elsevier

100 articles every ecologist should read

Courchamp & Bradshaw (2017) Nature Ecology & Evolution https://doi.org/cf8f

Changing times? Thus far in 2017

  • University of Montreal cut 2,231 journal subscriptions from Taylor & Francis (93%)
  • Universities in the Netherlands dropped their Oxford University Press subscription
  • Germany, Peru, and Taiwan entered 2017 without Elsevier deals after negotiations reached impasses
  • Preprint growth

Libre Open Access

https://greenelab.github.io/scihub-manuscript

Manubot

powering the next generation of scholarly manuscripts

Get started at tiny.cc/manubot

 

https://github.com/greenelab/manubot-rootstock

The Manubot project began with the [Deep Review](https://github.com/greenelab/deep-review),
where it was used to compose a highly-collaborative review article [@doi:10.1101/142760].
Other manuscripts that were created with Manubot include:

+ The Sci-Hub Coverage Study
  ([GitHub](https://github.com/greenelab/scihub-manuscript), [HTML manuscript](https://greenelab.github.io/scihub-manuscript/)) 
  [@doi:10.7287/peerj.preprints.3100]
+ Michael Zietz's Report for the Vagelos Scholars Program
  ([GitHub](https://github.com/zietzm/Vagelos2017), [HTML manuscript](https://zietzm.github.io/Vagelos2017/)) 
  [@doi:10.6084/m9.figshare.5346577]

The Manubot project began with the Deep Review, where it was used to compose a highly-collaborative review article [1]. Other manuscripts that were created with Manubot include:

1. Opportunities And Obstacles For Deep Learning In Biology And Medicine
Travers Ching, Daniel S. Himmelstein, Brett K. Beaulieu-Jones, Alexandr A. Kalinin, Brian T. Do, Gregory P. Way, Enrico Ferrero, Paul-Michael Agapow, Wei Xie, Gail L. Rosen, … Casey S. Greene
Cold Spring Harbor Laboratory (2017-05-28) https://doi.org/10.1101/142760

2. Sci-Hub provides access to nearly all scholarly literature
Daniel S Himmelstein, Ariel R Romero, Stephen R McLaughlin, Bastian Greshake Tzovaras, Casey S Greene
PeerJ Preprints (2017-07-20) https://doi.org/10.7287/peerj.preprints.3100

3. Vagelos Report Summer 2017
Michael Zietz
Figshare (2017) https://doi.org/10.6084/m9.figshare.5346577

1. Write markdown

Automatically converted to rich text

Automatic bibliographic metadata

[@doi:10.7287/peerj.preprints.3100]
[@arxiv:1407.3561v1]
[@pmid:24159271]
[@url:http://blog.dhimmel.com/biorxiv-licenses/]
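The citation keys above share one shape: a source type, a colon, and an identifier, wrapped in `[@…]`. As an illustration only (this is not Manubot's actual parser, and the regular expression is an assumption), they could be pulled out of manuscript text like this:

```python
import re

# Assumed pattern: [@<source>:<identifier>] for the four source types shown.
CITATION_PATTERN = re.compile(r"\[@(doi|arxiv|pmid|url):([^\]\s]+)\]")


def extract_citations(text: str) -> list[tuple[str, str]]:
    """Return (source, identifier) pairs for each citation key in the text."""
    return CITATION_PATTERN.findall(text)


sample = """
Sci-Hub coverage [@doi:10.7287/peerj.preprints.3100] and a preprint
[@arxiv:1407.3561v1], a PubMed record [@pmid:24159271], and a blog post
[@url:http://blog.dhimmel.com/biorxiv-licenses/].
An ordinary [link](https://github.com/greenelab/scihub) is ignored.
"""
citations = extract_citations(sample)
```

Each extracted pair identifies the source type and identifier that a tool would then need to look up bibliographic metadata.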

2. Continuous integration rebuilds the manuscript

Timestamped on the Bitcoin blockchain via OpenTimestamps

3. Continuous deployment back to GitHub

Pull requests for manuscript collaboration

the future: living but versioned

“Finally, we estimate that over a six-month period in 2015–2016, Sci-Hub provided access for 99.3% of valid incoming requests.”

— DOI: 10.7287/peerj.preprints.3100v1

“In the first version of this study, we mistakenly treated the log events as requests rather than downloads. Fortunately, Sci-Hub reviewed the preprint in a series of tweets, and pointed out the error…”

— DOI: 10.7287/peerj.preprints.3100v2

The Deep Review

  • review article on deep learning in precision medicine
  • 27 authors from 20 different institutions
  • readers appreciate the breadth of perspectives

Science on the Hill: Pirate Website Unlocks Access to Scholarly Literature

By Daniel Himmelstein


Slides for Science on the Hill, hosted by Saint Joseph's University on 2018-01-18. Released under a CC-BY 4.0 License.
