Electronic Lab Notebooks:
Recording Computational Work

Daniel Himmelstein (@dhimmel)

Biomedical Graduate Studies Orientation

University of Pennsylvania

BRB Auditorium

August 22, 2019 at 2:00 PM


slides released under CC BY 4.0

Greene Lab



  • files (especially text files) and directories

  • computational notebooks (e.g. Jupyter or R Markdown)

  • version control

  • code review on GitLab / GitHub

  • open source

  • Manubot


files & directories

ISO 8601



how to name files?
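
One possible answer, sketched in Python: prefix each file name with an ISO 8601 date (YYYY-MM-DD) so that plain alphabetical sorting is also chronological. The `dated_filename` helper, the slug rule, and the example names are hypothetical illustrations, not a convention prescribed by these slides.

```python
import re
from datetime import date


def dated_filename(day: date, description: str, extension: str) -> str:
    """Build a name like 2019-08-22-bgs-orientation.md (hypothetical scheme)."""
    # Lowercase the description and collapse anything non-alphanumeric to "-"
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    # date.isoformat() returns the ISO 8601 calendar date, YYYY-MM-DD
    return f"{day.isoformat()}-{slug}.{extension}"


names = [
    dated_filename(date(2019, 8, 22), "BGS orientation", "md"),
    dated_filename(date(2019, 1, 5), "pilot analysis", "md"),
    dated_filename(date(2018, 12, 31), "year-end summary", "md"),
]

# Zero-padded year-month-day prefixes sort chronologically as plain strings
for name in sorted(names):
    print(name)
```

Because the most significant unit (the year) comes first and every field is zero-padded, no date parsing is needed to list the files in order.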


100% recordable

Why is computational research unique?


  • restart & run all

  • single script to run entire pipeline
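
A minimal sketch of the "single script" idea, assuming the pipeline is a list of commands to run in order. The two placeholder steps are hypothetical; a real project would list its own scripts or notebook executions here.

```python
import subprocess
import sys

# Hypothetical steps: each entry is one command of the pipeline, in order.
STEPS = [
    [sys.executable, "-c", "print('step 1: download data')"],
    [sys.executable, "-c", "print('step 2: run analysis')"],
]


def run_pipeline(steps):
    """Run each command in order and stop at the first failure."""
    completed = []
    for command in steps:
        # check=True raises CalledProcessError when a step exits non-zero,
        # so later steps never run on top of a failed one
        subprocess.run(command, check=True)
        completed.append(command)
    return completed


finished = run_pipeline(STEPS)
print(f"{len(finished)} of {len(STEPS)} steps completed")
```

Running everything from one entry point is the script-level counterpart of "restart & run all" in a notebook: the results cannot depend on steps that were run by hand and never written down.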


version control

git log \
  --pretty=short \

code review

versioned environments

dhimmel/elevcan: repository for "Lung cancer incidence decreases with elevation: evidence for oxygen as an inhaled carcinogen"

Error due to glmnet 2.0-2 versus 1.9-5

conquer your environment


Control packages

Control OS + packages
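
The glmnet mismatch above is the kind of error that a recorded environment makes easier to diagnose. Below is a rough Python sketch that saves the interpreter version, the operating system, and the installed package versions next to the results. The `environment_snapshot` name and the output layout are assumptions for illustration; dedicated tools for controlling packages, or the OS plus packages, go further than this.

```python
import json
import platform
from importlib import metadata


def environment_snapshot():
    """Collect Python, OS, and installed package versions (illustrative helper)."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata lacks a name
            packages[name] = dist.version
    return {
        "python": platform.python_version(),
        "platform": platform.platform(),
        "packages": dict(sorted(packages.items())),
    }


snapshot = environment_snapshot()
# Printing as JSON makes the snapshot easy to save and put under version control
print(json.dumps(snapshot, indent=2)[:300])
```

A snapshot like this only records what was installed. Recreating the environment later still requires a tool that can install those exact versions.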

open source

convert rms-fsf-slide-propreitary.png -channel RGB -negate -transparent black rms-fsf-slide-propreitary-negated.png

FreeSoftware TEDx slides. (2014) Reused under CC BY 3.0

proprietary software:
the software controls the science

FreeSoftware TEDx slides. (2014) Reused under CC BY 3.0

convert rms-fsf-slide.png -channel RGB -negate -transparent black rms-fsf-slide-negated.png

open source software:
the scientist controls the software

by default, scientific outputs are subject to copyright

sometimes universities place additional legal barriers to reuse 


  1. release data under an open license
  2. University researchers: commit to openness in your resource sharing plan


Beyond the PDF First Day Notes

By De Jongens van de Tekeningen

Licensed under CC BY 3.0

Modified to invert colors

citation by persistent identifier

  1. Reproducibility of computational workflows is automated using continuous analysis
    Brett K Beaulieu-Jones, Casey S Greene
    Nature Biotechnology (2017-03-13) https://doi.org/f9ttx6
    DOI: 10.1038/nbt.3780 · PMID: 28288103 · PMCID: PMC6103790
  2. Sci-Hub provides access to nearly all scholarly literature.
    Daniel S Himmelstein, Ariel Rodriguez Romero, Jacob G Levernier, Thomas Anthony Munro, Stephen Reid McLaughlin, Bastian Greshake Tzovaras, Casey S Greene
    eLife (2018-03-01) https://www.ncbi.nlm.nih.gov/pubmed/29424689
    DOI: 10.7554/elife.32822 · PMID: 29424689 · PMCID: PMC5832410
  3. Opportunities and obstacles for deep learning in biology and medicine
    Travers Ching, Daniel S. Himmelstein, Brett K. Beaulieu-Jones, Alexandr A. Kalinin, Brian T. Do, Gregory P. Way, Enrico Ferrero, Paul-Michael Agapow, Michael Zietz, Michael M. Hoffman, … Casey S. Greene
    Journal of the Royal Society Interface (2018-04) https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5938574/
    DOI: 10.1098/rsif.2017.0387 · PMID: 29618526 · PMCID: PMC5938574
  4. IPFS - Content Addressed, Versioned, P2P File System
    Juan Benet
    arXiv (2014-07-14) https://arxiv.org/abs/1407.3561v1
  5. Open collaborative writing with Manubot
    Daniel S. Himmelstein, David R. Slochower, Venkat S. Malladi, Casey S. Greene, Anthony Gitter
    (2018-08-03) https://greenelab.github.io/meta-review/
This is a sentence with 5 citations [1,2,3,4,5].
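
To illustrate what citing by persistent identifier means in practice, here is a small Python sketch that turns a prefixed identifier into a resolvable URL. The URL patterns are the ones that appear in the reference list above. The `prefix:identifier` key format and the `resolve` function are assumptions made for this example and are not presented as Manubot's own implementation.

```python
# URL patterns as they appear in the reference list above
RESOLVERS = {
    "doi": "https://doi.org/{}",
    "pmid": "https://www.ncbi.nlm.nih.gov/pubmed/{}",
    "pmcid": "https://www.ncbi.nlm.nih.gov/pmc/articles/{}/",
    "arxiv": "https://arxiv.org/abs/{}",
}


def resolve(citation_key: str) -> str:
    """Map a hypothetical 'prefix:identifier' key to a URL."""
    # Split on the first colon only, so URLs keep their own colons
    prefix, _, identifier = citation_key.partition(":")
    if prefix == "url":
        return identifier
    if prefix not in RESOLVERS or not identifier:
        raise ValueError(f"unsupported citation key: {citation_key!r}")
    return RESOLVERS[prefix].format(identifier)


keys = [
    "doi:10.1038/nbt.3780",
    "pmid:29424689",
    "pmcid:PMC5938574",
    "arxiv:1407.3561v1",
    "url:https://greenelab.github.io/meta-review/",
]
for key in keys:
    print(key, "->", resolve(key))
```

The point of the design is that the author writes only the identifier, and the bibliographic details (title, authors, journal, date) can be looked up from it rather than typed by hand.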






BGS Orientation: Electronic Lab Notebooks: special considerations for computational research

By Daniel Himmelstein

Presentation by Daniel Himmelstein on 2019-08-22 as part of the electronic lab notebook orientation module for the Biomedical Graduate Studies program at University of Pennsylvania. This presentation is released under a CC BY 4.0 License.