Daniel Himmelstein
Head of Data Integration at Related Sciences. Digital craftsman of the biodata revolution.
Science on the Hill
Saint Joseph's University
Landmark Americana City Line
6:00 PM Thursday, January 18, 2018
Online at slides.com/dhimmel/sju
http://www.greenelab.com/
Dr. Daniel Himmelstein, a data scientist currently working out of the University of Pennsylvania's Greene Lab, will be sharing his story about breaking down the "toll access" publication model, which prevents the public from accessing academic articles. Using computer and data science, Dr. Himmelstein will show you how websites like Sci-Hub are already making primary science articles easier to access than ever before! Join us Thursday, January 18th, 6:00-8:00 PM at Landmark Americana for some delicious food, inspiring science, and post-holiday merriment!
Background: Most academic research is funded by the public or private philanthropy. Nonetheless, even in 2017, the majority of new scholarly articles are paywalled. However, a controversial alternative to purchasing article access is emerging. The website Sci-Hub provides access to scholarly literature via fulltext PDF downloads. The site enables users to access articles that would otherwise be paywalled. However, Sci-Hub's hosting of articles is often copyright infringement, and two publishers have already won suits against Sci-Hub.
In March 2017, Sci-Hub tweeted the identifiers (DOIs) for all articles in its repository. By integrating this dataset with a catalog of scholarly literature, we assessed Sci-Hub's coverage and found that Sci-Hub contains 86% of articles in toll-access journals. This number rises to 96% for recently cited articles.
We suggest the ubiquity of Sci-Hub will disrupt scholarly publishing. Specifically, toll access publishing will no longer be a viable business model. We provide evidence that the transition is already underway and urge scholars to adopt libre open access as an alternative. This study was performed openly online at https://github.com/greenelab/scihub and can be read at https://doi.org/b9s5.
Himmelstein DS, Romero AR, McLaughlin SR, Greshake Tzovaras B, Greene CS. (2017) Sci-Hub provides access to nearly all scholarly literature. PeerJ Preprints DOI: 10.7287/peerj.preprints.3100
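The coverage figures in the abstract above come from comparing Sci-Hub's list of DOIs against a catalog of scholarly literature. The following is a minimal sketch of that kind of calculation, not the study's actual code (which lives in the greenelab/scihub repository). The function names and the toy DOIs are hypothetical and chosen only for illustration.

```python
def normalize_doi(doi: str) -> str:
    """Trim and lowercase a DOI, since DOIs are case-insensitive."""
    return doi.strip().lower()


def coverage(catalog_dois, scihub_dois) -> float:
    """Return the fraction of catalog DOIs that also appear in the Sci-Hub list."""
    catalog = {normalize_doi(d) for d in catalog_dois}
    scihub = {normalize_doi(d) for d in scihub_dois}
    if not catalog:
        raise ValueError("catalog is empty")
    return len(catalog & scihub) / len(catalog)


# Toy data invented for illustration; these are not DOIs from the study.
catalog = ["10.1000/a", "10.1000/b", "10.1000/c", "10.1000/d"]
scihub = ["10.1000/A", "10.1000/c", "10.9999/x"]

print(f"Coverage: {coverage(catalog, scihub):.0%}")
```

DOIs outside the catalog (here `10.9999/x`) do not affect the result, because coverage is measured relative to the catalog only.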
Sci-Hub is available at:
The New York Times:
Should All Research Papers Be Free?
Alexandra Elbakyan
https://doi.org/bf37
Representative work #28
Idiogramma elbakyanae
Metadata for porn from the Entertainment Identifier Registry
Study at https://doi.org/b9s5
49% of 2.8 million articles
85% of 54 million articles
Currently, the Sci-Hub does not store books, for books users are redirected to LibGen, but not for research papers. In future, I also want to expand the Sci-Hub repository and add books too.
Data from "The State of OA" Study https://doi.org/gbqtxd
https://github.com/greenelab/library-access
Jacob Levernier
Monthly Bitcoin Donations
As of December 31, 2017:
While this study had a number of interesting aspects, its virtual lack of success as a tool for reducing the library's journal budget was largely due to the fact that the overall problem was seen by everyone concerned as a library problem. As such, the only solution available to the library in 1981 was to use monograph and binding funds to help offset the shortfall in the serials and journals budget. While the biology and chemistry libraries were spared drastic cuts because of very generous support from divisional funds, Caltech's engineering libraries were extremely hard hit, and only now after nearly seven years have they recovered (just in time for the current crisis). It should be pointed out here that from 1974 to 1983 the materials budgets for the departmental libraries were the responsibility of appropriate divisions.
Dana Roth (1990) "The Serials Crisis Revisited"
The Serials Librarian. https://doi.org/dvwb7f
Source: Association of Research Libraries. Expenditure Trends in ARL Libraries, 1986–2015
Headlines:
https://doi.org/b9s5
What library will continue to subscribe if a growing proportion of articles is available for free elsewhere?
—Tom Reller (2013) Vice President, Elsevier
Defendants’ actions also threaten imminent irreparable harm to Elsevier because it appears that the Library Genesis Project repository may be approaching (or will eventually approach) a level of “completeness” where it can serve as a functionally equivalent, although patently illegal, replacement for ScienceDirect.
—DeMarco, Hirschberg & Sen (2015) Attorneys for Elsevier
Courchamp & Bradshaw (2017) Nature Ecology & Evolution https://doi.org/cf8f
https://greenelab.github.io/scihub-manuscript
powering the next generation of scholarly manuscripts
Get started at tiny.cc/manubot
https://github.com/greenelab/manubot-rootstock
The Manubot project began with the [Deep Review](https://github.com/greenelab/deep-review),
where it was used to compose a highly-collaborative review article [@doi:10.1101/142760].
Other manuscripts that were created with Manubot include:
+ The Sci-Hub Coverage Study
([GitHub](https://github.com/greenelab/scihub-manuscript), [HTML manuscript](https://greenelab.github.io/scihub-manuscript/))
[@doi:10.7287/peerj.preprints.3100]
+ Michael Zietz's Report for the Vagelos Scholars Program
([GitHub](https://github.com/zietzm/Vagelos2017), [HTML manuscript](https://zietzm.github.io/Vagelos2017/))
[@doi:10.6084/m9.figshare.5346577]
The Manubot project began with the Deep Review, where it was used to compose a highly-collaborative review article [1]. Other manuscripts that were created with Manubot include:
1. Opportunities And Obstacles For Deep Learning In Biology And Medicine
Travers Ching, Daniel S. Himmelstein, Brett K. Beaulieu-Jones, Alexandr A. Kalinin, Brian T. Do, Gregory P. Way, Enrico Ferrero, Paul-Michael Agapow, Wei Xie, Gail L. Rosen, … Casey S. Greene
Cold Spring Harbor Laboratory (2017-05-28) https://doi.org/10.1101/142760
2. Sci-Hub provides access to nearly all scholarly literature
Daniel S Himmelstein, Ariel R Romero, Stephen R McLaughlin, Bastian Greshake Tzovaras, Casey S Greene
PeerJ Preprints (2017-07-20) https://doi.org/10.7287/peerj.preprints.3100
3. Vagelos Report Summer 2017
Michael Zietz
Figshare (2017) https://doi.org/10.6084/m9.figshare.5346577
1. Write markdown
Automatically converted to rich text
Automatic bibliographic metadata
[@doi:10.7287/peerj.preprints.3100]
[@arxiv:1407.3561v1]
[@pmid:24159271]
[@url:http://blog.dhimmel.com/biorxiv-licenses/]
2. Continuous integration rebuilds the manuscript
Timestamped on the Bitcoin blockchain via OpenTimestamps
3. Continuous deployment back to GitHub
Pull requests for manuscript collaboration
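The citation keys shown above, such as `[@doi:10.7287/peerj.preprints.3100]`, pair a source type with a persistent identifier. As a rough illustration of how such keys could be pulled out of markdown text, here is a small sketch. It is not Manubot's implementation: the regular expression, the function name, and the restriction to the four prefixes seen on the slide are all assumptions.

```python
import re

# Assumed pattern: "[@", one of the four prefixes shown on the slide, ":",
# then an identifier containing no whitespace or closing bracket, then "]".
CITATION_PATTERN = re.compile(r"\[@(doi|arxiv|pmid|url):([^\]\s]+)\]")


def extract_citations(text: str) -> list[tuple[str, str]]:
    """Return (source, identifier) pairs for each citation key found in text."""
    return [(m.group(1), m.group(2)) for m in CITATION_PATTERN.finditer(text)]


markdown = (
    "Sci-Hub coverage [@doi:10.7287/peerj.preprints.3100], "
    "a preprint [@arxiv:1407.3561v1], "
    "a PubMed record [@pmid:24159271], and "
    "a blog post [@url:http://blog.dhimmel.com/biorxiv-licenses/]."
)

for source, identifier in extract_citations(markdown):
    print(source, identifier)
```

Each extracted pair could then be passed to a metadata lookup for that source type, which is the step the slide calls automatic bibliographic metadata.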
“Finally, we estimate that over a six-month period in 2015–2016, Sci-Hub provided access for 99.3% of valid incoming requests.”
— DOI: 10.7287/peerj.preprints.3100v1
“In the first version of this study, we mistakenly treated the log events as requests rather than downloads. Fortunately, Sci-Hub reviewed the preprint in a series of tweets, and pointed out the error…”
— DOI: 10.7287/peerj.preprints.3100v2
The Deep Review
By Daniel Himmelstein
Slides for Science on the Hill, hosted by Saint Joseph's University on 2018-01-18. Released under a CC-BY 4.0 License.