Becoming an R librarian

Worth the effort?

sharing

no need to send / copy files

# If not already added permanently: add PIK-CRAN repository
options(repos = c(CRAN = "@CRAN@", pik = "http://rse.pik-potsdam.de/r/packages"))

# If not already installed: install package
install.packages("lucode")

# load package
library(lucode)

# run function
modelstats()

retrieve code within R

"can you send me that function that shows statistics about MAgPIE and REMIND runs?"

not sure how to use it?

# load help file
?modelstats

Analyze Model Statistics

Description:

     Shiny app to analyze statistics collected with 'runstatistics' and
     merged with 'mergestatistics'

Usage:

     modelstats(files = c("https://www.pik-potsdam.de/rd3mod/magpie.rds",
       "https://www.pik-potsdam.de/rd3mod/remind.rds"), resultsfolder = NULL)
     
Arguments:

   files: path to rds-files from which statistics should be read

resultsfolder: path to a folder containing model results of the
          corresponding runs

Author(s):

     Jan Philipp Dietrich

every function comes with a help file

need more information?

# list existing vignettes
vignette()

# want to know about the magclass concept?
vignette("magclass-concept")

# how to use magclass?
vignette("magclass")

# what is madrat?
vignette("madrat")

you can add a vignette to a package
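
As a rough sketch of what adding a vignette involves: an R Markdown vignette is a file in the package's vignettes/ folder with a small metadata header. The helper below is hypothetical (not part of lucode or any other package) and the package name "mypackage" is a placeholder; it only writes a skeleton into a temporary directory. For the vignette to be built, the DESCRIPTION file would also need "VignetteBuilder: knitr" and knitr and rmarkdown listed under Suggests.

```r
# Hypothetical helper: writes a minimal R Markdown vignette skeleton
# to <pkg_dir>/vignettes/<name>.Rmd and returns the file path invisibly.
write_vignette_skeleton <- function(pkg_dir, name, title) {
  vignette_dir <- file.path(pkg_dir, "vignettes")
  dir.create(vignette_dir, recursive = TRUE, showWarnings = FALSE)
  target <- file.path(vignette_dir, paste0(name, ".Rmd"))
  lines <- c(
    "---",
    paste0("title: \"", title, "\""),
    "output: rmarkdown::html_vignette",
    "vignette: >",
    paste0("  %\\VignetteIndexEntry{", title, "}"),
    "  %\\VignetteEngine{knitr::rmarkdown}",
    "  %\\VignetteEncoding{UTF-8}",
    "---",
    "",
    "Describe the concept here and show a short usage example."
  )
  writeLines(lines, target)
  invisible(target)
}

# "mypackage" is a placeholder package living in a temporary directory
pkg_dir <- file.path(tempdir(), "mypackage")
vignette_file <- write_vignette_skeleton(pkg_dir, "mypackage-concept",
                                         "The mypackage concept")
cat(readLines(vignette_file), sep = "\n")
```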

let your work be cited

> citation("madrat")

To cite package ‘madrat’ in publications use:

Dietrich J, Baumstark L and Giannousakis A (2018). _madrat: May All Data 
be Reproducible and Transparent (MADRaT)_. doi: 10.5281/zenodo.1115490 
(URL: http://doi.org/10.5281/zenodo.1115490), R package version 1.44.0,
<URL: https://github.com/pik-piam/madrat>.

A BibTeX entry for LaTeX users is

  @Manual{,
    title = {madrat: May All Data be Reproducible and Transparent (MADRaT)},
    author = {Jan Philipp Dietrich and Lavinia Baumstark and Anastasis Giannousakis},
    year = {2018},
    note = {R package version 1.44.0},
    doi = {10.5281/zenodo.1115490},
    url = {https://github.com/pik-piam/madrat},
  }

every package comes with citation information
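
By default citation() derives this entry from the package's DESCRIPTION file; a package can provide its own entry in a file inst/CITATION. The sketch below rebuilds the entry shown above with utils::bibentry(), using only the values from that output. Whether madrat itself ships such a file is not stated here, so treat this as an illustration of the mechanism only.

```r
# Rebuild the citation entry shown above; the same call could be placed
# in a package's inst/CITATION file (assumption: illustration only).
entry <- utils::bibentry(
  bibtype = "Manual",
  title   = "madrat: May All Data be Reproducible and Transparent (MADRaT)",
  author  = c(
    utils::person("Jan Philipp", "Dietrich"),
    utils::person("Lavinia", "Baumstark"),
    utils::person("Anastasis", "Giannousakis")
  ),
  year    = "2018",
  note    = "R package version 1.44.0",
  doi     = "10.5281/zenodo.1115490",
  url     = "https://github.com/pik-piam/madrat"
)

# Convert to BibTeX, as citation() does for LaTeX users
bibtex_lines <- utils::toBibtex(entry)
print(bibtex_lines)
```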

quality

Basic package checks

Building madrat ---------------------------------------------------------------
'/usr/lib/R/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet  \
  CMD build '/home/jpd/Dokumente/PIK/libraries/git/madrat' --no-resave-data  \
  --no-manual 

* checking for file ‘/home/jpd/Dokumente/PIK/libraries/git/madrat/DESCRIPTION’ ... OK
* preparing ‘madrat’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* looking to see if a ‘data/datalist’ file should be added
* building ‘madrat_1.42.0.tar.gz’

Setting env vars --------------------------------------------------------------
_R_CHECK_CRAN_INCOMING_USE_ASPELL_: TRUE
_R_CHECK_CRAN_INCOMING_           : FALSE
_R_CHECK_FORCE_SUGGESTS_          : FALSE
Checking madrat ---------------------------------------------------------------
'/usr/lib/R/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet  \
  CMD check '/tmp/RtmpYroa4c/madrat_1.42.0.tar.gz' --as-cran --timings  \
  --no-manual 

* using log directory ‘/home/jpd/Dokumente/PIK/libraries/git/madrat.Rcheck’
* using R version 3.3.1 (2016-06-21)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* using options ‘--no-manual --as-cran’
* checking for file ‘madrat/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘madrat’ version ‘1.42.0’
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘madrat’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking ‘build’ directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking R/sysdata.rda ... OK
* checking installed files from ‘inst/doc’ ... OK
* checking files in ‘vignettes’ ... OK
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘testthat.R’ OK
* checking for unstated dependencies in vignettes ...
 OK
* checking package vignettes in ‘inst/doc’ ... OK
* checking re-building of vignette outputs ... OK
* DONE
Status: OK



R CMD check results
0 errors | 0 warnings | 0 notes

R comes with a collection of checks for packages

Extended package checks

context("Data aggregation")

data("population_magpie")
w <- pm <- population_magpie
w[,,] <- NA
map  <- data.frame(from=getRegions(pm),to=rep(c("REG1","REG2"),5))
map2 <- data.frame(from=getRegions(pm),to=getRegions(pm))

test_that("Identity mapping is not changing the data", {
  expect_equivalent(toolAggregate(pm,map2),pm)
})

test_that("NA columns in weight are summed up", {
  expect_equivalent(toolAggregate(pm,map),toolAggregate(pm,map, weight=w, 
                                          mixed_aggregation=TRUE))
})

test_that("NA in weight leads to summation and other weight to weighting", {
  w[,,1] <- 1
  w[,,2] <- NA
  mix <- toolAggregate(pm,map, weight=w, mixed_aggregation=TRUE)
  mean <- toolAggregate(pm[,,1],map, weight=w[,,1])
  sum <- toolAggregate(pm[,,2],map)
  expect_equivalent(mix,mbind(mean,sum))
})

test_that("NAs in weight for mixed_aggregation=FALSE throw an error", {
  w[,,] <- NA
  expect_error(toolAggregate(pm,map, weight=w))
})

test_that("Random NAs in weight and mixed_aggregation=TRUE throw an error", {
  w[,,] <- 1
  w[3,1,1]  <- NA
  expect_error(toolAggregate(pm,map, weight=w, mixed_aggregation=TRUE))
})

test_that("partrel=TRUE works in combination with weights",{
  w[,,] <- NA
  map3 <- map[1:5,]
  expect_equivalent(toolAggregate(pm,map3,partrel=TRUE, verbosity=10),
                    toolAggregate(pm, map3, partrel = TRUE, weight=w[1:5,,], 
                    mixed_aggregation = TRUE, verbosity=10))
})

You can add your own tests
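
A minimal, self-contained sketch of such a test, assuming the testthat package is installed. The function sum_by_group() is a hypothetical stand-in for a package function and is unrelated to toolAggregate. In a package, test files like this usually live in tests/testthat/ and are run through tests/testthat.R, which is the file that shows up in the check log above.

```r
# Hypothetical function under test: sums values within each group.
sum_by_group <- function(values, groups) {
  stopifnot(length(values) == length(groups))
  vapply(split(as.numeric(values), groups), sum, numeric(1))
}

# Only run the tests if testthat is available
if (requireNamespace("testthat", quietly = TRUE)) {
  library(testthat)

  test_that("values are summed within each group", {
    result <- sum_by_group(c(1, 2, 3, 4), c("REG1", "REG2", "REG1", "REG2"))
    expect_equal(result, c(REG1 = 4, REG2 = 6))
  })

  test_that("inputs of different length throw an error", {
    expect_error(sum_by_group(c(1, 2, 3), c("REG1", "REG2")))
  })
}
```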

x-eyes principle

additional testing through sharing

  • more users = fewer hidden bugs
  • other users might fix your bugs
  • other users might add features to your function

effort?

Write a function

#' head
#' 
#' head method for MAgPIE objects to extract the head of an
#' object
#' 
#' @aliases head.magpie
#' @param x MAgPIE object
#' @param n1,n2,n3 number of lines in first, second and third dimension that
#' should be returned. If the given number is higher than the length of the
#' dimension all entries in this dimension will be returned.
#' @param ... arguments to be passed to or from other methods.
#' @return head returns the first n1 x n2 x n3 entries
#' @author Jan Philipp Dietrich
#' @seealso \code{\link[utils]{head}}, \code{\link[utils]{tail}}
#' @examples
#' 
#'  data(population_magpie)
#'  head(population_magpie)
#' 
#' @importFrom utils head
#' @export 
head.magpie <- function(x, n1=3L, n2=6L, n3=2L, ...) {
  if(dim(x)[1]<n1) n1 <- dim(x)[1]
  if(dim(x)[2]<n2) n2 <- dim(x)[2]
  if(dim(x)[3]<n3) n3 <- dim(x)[3]
  return(x[1:n1,1:n2,1:n3])  
}

Add function, document, mention dependencies
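
The function above is an S3 method: head() dispatches on the class of its argument. In a package, roxygen2 turns the @export and @importFrom tags into the matching NAMESPACE entries. The sketch below shows the same pattern outside of a package on a toy class "cube", which is an assumption made for illustration and not part of magclass. It uses seq_len() so that a requested length of 0 also behaves sensibly.

```r
# Toy class (hypothetical): a 3-dimensional array tagged with class "cube"
new_cube <- function(data) {
  stopifnot(length(dim(data)) == 3)
  structure(data, class = "cube")
}

# head method for the toy class: returns the first n1 x n2 x n3 entries,
# capped at the actual size of each dimension
head.cube <- function(x, n1 = 3L, n2 = 6L, n3 = 2L, ...) {
  n <- pmin(c(n1, n2, n3), dim(x))
  out <- unclass(x)[seq_len(n[1]), seq_len(n[2]), seq_len(n[3]), drop = FALSE]
  new_cube(out)
}

cube  <- new_cube(array(seq_len(4 * 10 * 3), dim = c(4, 10, 3)))
small <- head(cube)             # dispatches to head.cube
dim(small)                      # 3 6 2
dim(head(cube, n1 = 10L))       # first dimension is capped at 4
```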

Check package

> lucode::buildLibrary()
Updating madrat documentation
Loading madrat
Loading required package: magclass

Attaching package: ‘magclass’

The following object is masked from ‘package:grid’:

    getNames

Updating vignettes
Updating madrat documentation
Loading madrat
Setting env vars --------------------------------------------------------------------------------------------------------------
CFLAGS  : -Wall -pedantic
CXXFLAGS: -Wall -pedantic
Building madrat ---------------------------------------------------------------------------------------------------------------
'/usr/lib/R/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet CMD build  \
  '/home/jpd/Dokumente/PIK/libraries/git/madrat' --no-resave-data --no-manual 

* checking for file ‘/home/jpd/Dokumente/PIK/libraries/git/madrat/DESCRIPTION’ ... OK
* preparing ‘madrat’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... OK
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* looking to see if a ‘data/datalist’ file should be added
* building ‘madrat_1.42.0.tar.gz’

Setting env vars --------------------------------------------------------------------------------------------------------------
_R_CHECK_CRAN_INCOMING_USE_ASPELL_: TRUE
_R_CHECK_CRAN_INCOMING_           : FALSE
_R_CHECK_FORCE_SUGGESTS_          : FALSE
Checking madrat ---------------------------------------------------------------------------------------------------------------
'/usr/lib/R/bin/R' --no-site-file --no-environ --no-save --no-restore --quiet CMD check  \
  '/tmp/RtmpYlkgbr/madrat_1.42.0.tar.gz' --as-cran --timings --no-manual 

* using log directory ‘/tmp/RtmpYlkgbr/madrat.Rcheck’
* using R version 3.3.1 (2016-06-21)
* using platform: x86_64-pc-linux-gnu (64-bit)
* using session charset: UTF-8
* using options ‘--no-manual --as-cran’
* checking for file ‘madrat/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘madrat’ version ‘1.42.0’
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘madrat’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking ‘build’ directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking R/sysdata.rda ... OK
* checking installed files from ‘inst/doc’ ... OK
* checking files in ‘vignettes’ ... OK
* checking examples ... OK
* checking for unstated dependencies in ‘tests’ ... OK
* checking tests ...
  Running ‘testthat.R’
 OK
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in ‘inst/doc’ ... OK
* checking re-building of vignette outputs ... OK
* DONE

Status: OK

R CMD check results
0 errors | 0 warnings | 0 notes

Package check successful! Please choose an update type :
1: major revision (for major rewrite of the whole package)
2: minor revision (for new features or improvements)
3: patch (for bugfixes and corrections)
4: only for packages in development stage
0: no version increment (only to use if version is already incremented!)

Number: 
2
done

Fix notes, warnings and errors & commit your changes
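
The update types 1 to 3 in the prompt above follow the usual major.minor.patch scheme. The function below is a hypothetical sketch of that rule in base R; it is not the code used by lucode::buildLibrary(), and it does not cover the development-stage option (any fourth version component is dropped).

```r
# Hypothetical helper: increment a "major.minor.patch" version string.
bump_version <- function(version, type = c("major", "minor", "patch")) {
  type  <- match.arg(type)
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  parts <- c(parts, integer(3))[1:3]   # pad or truncate to three components
  parts <- switch(type,
    major = c(parts[1] + 1L, 0L, 0L),
    minor = c(parts[1], parts[2] + 1L, 0L),
    patch = c(parts[1], parts[2], parts[3] + 1L)
  )
  paste(parts, collapse = ".")
}

bump_version("1.42.0", "minor")  # "1.43.0"
bump_version("1.42.0", "patch")  # "1.42.1"
bump_version("1.42.0", "major")  # "2.0.0"
```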

Done

# retrieve updates of installed packages
update.packages()

Your update is now available to everyone!

what else?

Data class standard

# in the magclass universe all functions
# are compatible with each other

library(magclass)
a <- dimSums(population_magpie,dim = 1)

library(luplot)
scratch_plot(a)

Compatibility between functions increases collaboration

Could magclass be an option?

PIK-CRAN server

PIK-CRAN - R package repository

Service provided by Model Operations

To use this repo run options(repos = c(CRAN = "@CRAN@", pik = 
"http://rse.pik-potsdam.de/r/packages")) in R. The repository can be 
made permanently available by adding the line above to .Rprofile in 
your home directory.
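
A small sketch of how that line could be added without creating duplicates. The helper add_line_once() is hypothetical, and the example writes to a temporary file; pointing it at "~/.Rprofile" would apply the change for real.

```r
# Hypothetical helper: append a line to a file unless it is already there.
add_line_once <- function(file, line) {
  existing <- if (file.exists(file)) readLines(file, warn = FALSE) else character(0)
  if (!(line %in% existing)) {
    writeLines(c(existing, line), file)
  }
  invisible(file)
}

repo_line <- 'options(repos = c(CRAN = "@CRAN@", pik = "http://rse.pik-potsdam.de/r/packages"))'

# Demonstrated on a temporary file instead of the real ~/.Rprofile
profile <- tempfile(fileext = ".Rprofile")
add_line_once(profile, repo_line)
add_line_once(profile, repo_line)  # second call leaves the file unchanged
readLines(profile)
```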

Global .Rprofile loaded!

Tue Apr 17 20:20:01 2018 
.:: ar5data      1.7.1      ok ::.
.:: faodata      1.09       ok ::.
.:: gdx          1.48.0     ok ::.
.:: gdxrrw       1.0.2      ok ::.
.:: geodata      1.59       ok ::.
.:: givemeall    0.02       ok ::.
.:: goxygen      0.6.2      ok ::.
.:: iamc         0.24.0     ok ::.
.:: limes        0.3.1 -> 0.3.16 invalid commit ::.
.:: lpjclass     1.13       ok ::.
.:: lubase       1.06       ok ::.
.:: lucode       2.121.0.9001 ok ::.
.:: ludata       1.43.3     ok ::.
.:: luplot       3.38.6     ok ::.
.:: luscale      2.12.0     ok ::.
.:: lusweave     1.45.0     ok ::.
.:: madrat       1.44.0     ok ::.
.:: magclass     4.83.1     ok ::.
.:: magpie       0.2266.1   ok ::.
.:: magpie4      1.13.4     ok ::.
.:: magpiesets   0.33.0     ok ::.
.:: mip          0.101.4    ok ::.
.:: moinput      9.55.1     ok ::.
.:: mrfood       0.7.3      ok ::.
.:: mrregression 3.6.0      ok ::.
.:: mrvalidation 1.24.0     ok ::.
.:: nitrogen     1.0.3      ok ::.
.:: piam         0.8.2      ok ::.
.:: quitte       0.3068.0   ok ::.
.:: remind       36.40.1    ok ::.
.:: trafficlight 1.11.1     ok ::.
.:: trefoil      0.01       ok ::.
.:: validation   1.195      ok ::.
done.

low-barrier option to share code

By Jan Dietrich
