Advanced R programming

Lecture 3

Måns Magnusson

Statistics and Machine learning

Department of computer and information science

Since last time?

Best practices for scientific computing

based on the article

1. Write code for people

Advanced R Programming

Måns Magnusson

1.1 A program should not require its readers to hold more than a handful of facts in memory at once

1.2 Make names consistent, distinctive, and meaningful

1.3 Make code style and formatting consistent

2. Let the computer do the work


2.1 Make the computer repeat tasks

2.2 Save recent commands in a file for re-use

2.3 Use a build tool to automate workflows
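A minimal sketch of rule 2.1, assuming the built-in `iris` data as stand-in measurements. The names `measurements` and `col_means` are made up for illustration.

```r
# Built-in example data; any data frame with numeric columns would do.
measurements <- datasets::iris[, 1:4]

# Let R repeat the task for every column instead of typing four mean() calls.
col_means <- vapply(measurements, mean, FUN.VALUE = numeric(1))
print(round(col_means, 2))
```

`vapply()` also checks that every result is a single number, so a surprising column fails loudly.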

3. Make incremental changes


3.1 Work in small steps with frequent feedback and course correction

3.2 Use a version control system

3.3 Put everything that has been created manually in version control

4. Don’t repeat yourself (or others).


4.1 Every piece of data must have a single authoritative representation in the system

4.2 Modularize code rather than copying and pasting

4.3 Re-use code instead of rewriting it 
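One way rule 4.2 could look in R. The helper `rescale01` is a hypothetical name, and the built-in `mtcars` data is used only as an example.

```r
# Hypothetical helper: rescale a numeric vector to the range [0, 1].
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# One definition, reused for every column, instead of three pasted copies.
scaled <- lapply(datasets::mtcars[c("mpg", "hp", "wt")], rescale01)
str(scaled)
```

A bug in the rescaling now has to be fixed in one place only.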

5. Plan for mistakes


5.1 Add assertions to programs to check their operation

5.2 Use an off-the-shelf unit testing library

5.3 Turn bugs into test cases

5.4 Use a symbolic debugger
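A small sketch of rule 5.1 using base R's `stopifnot()`. The function `geometric_mean` is a hypothetical example, not something from the lecture.

```r
# Hypothetical function with assertions on its input.
geometric_mean <- function(x) {
  stopifnot(is.numeric(x), length(x) > 0, all(x > 0))
  exp(mean(log(x)))
}

print(geometric_mean(c(1, 10, 100)))

# A violated assertion stops with an error instead of returning NaN
# with only a warning.
result <- tryCatch(
  geometric_mean(c(1, -1)),
  error = function(e) "assertion failed"
)
print(result)
```

Following rule 5.3, an input that once exposed a bug (here the negative value) is a natural candidate for a permanent test case.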

6. Optimize software only after it works correctly


6.1 Use a profiler to identify bottlenecks

6.2 Write code in the highest-level language possible
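A rough sketch of "correct first, fast second". It uses `system.time()` for simple timing, which is an assumption made to keep the example short; R's built-in sampling profiler is `Rprof()` together with `summaryRprof()`. The functions `slow_sum` and `fast_sum` are hypothetical.

```r
# Explicit loop: low-level style.
slow_sum <- function(x) {
  total <- 0
  for (v in x) total <- total + v^2
  total
}

# Vectorised: the same computation at a higher level.
fast_sum <- function(x) sum(x^2)

x <- as.numeric(1:1e5)

# First confirm both versions agree, then measure.
stopifnot(isTRUE(all.equal(slow_sum(x), fast_sum(x))))

print(system.time(for (i in 1:20) slow_sum(x)))
print(system.time(for (i in 1:20) fast_sum(x)))
```

Timings vary by machine, so none are shown here.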

7. Document design and purpose, not mechanics


7.1 Document interfaces and reasons, not implementations

7.2 Refactor code in preference to explaining how it works

7.3 Embed the documentation for a piece of software in that software 

8. Collaborate


8.1 Use pre-merge code reviews

8.2 Use pair programming when bringing someone new up to speed and when tackling particularly tricky problems

8.3 Use an issue tracking tool

R packages



An environment with functions and/or data

 

The way to share code and data

 

~4 000 developers

> 7000 packages

Package basics


Installation

Usage

library()
::
:::
install.packages()
devtools::install_github()
devtools::install_local()
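The calls listed above could be used roughly as follows. The installation lines are commented out because they need network access, and the repository and path in them are placeholders, not real locations. The `stats` package stands in for any installed package.

```r
# Installation (placeholders, left commented out):
# install.packages("devtools")
# devtools::install_github("user/repo")
# devtools::install_local("path/to/pkg")

# library() attaches a package: exported functions are callable by name.
library(stats)
print(sd(c(1, 2, 3)))

# :: calls an exported function without attaching the package.
print(stats::sd(c(1, 2, 3)))

# ::: reaches objects that are not exported. print.lm is an S3 method
# that stats registers but does not export.
internal_fn <- stats:::print.lm
print(is.function(internal_fn))
```

Relying on `:::` in package code is fragile, since non-exported objects can change without notice.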

Semantic versioning


"Dependency hell"

[MAJOR].[MINOR].[PATCH]
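A short sketch of why version numbers should be compared component by component rather than as text, using base R's `package_version()`. The version numbers themselves are made up.

```r
installed <- package_version("1.4.2")   # made-up version numbers
required  <- package_version("1.10.0")

# Component-wise comparison: MINOR 4 is less than MINOR 10.
print(installed < required)    # TRUE

# Plain string comparison gets this wrong.
print("1.4.2" < "1.10.0")      # FALSE

# MAJOR, MINOR, PATCH as integers.
print(unlist(installed))

# Version of an installed package.
print(utils::packageVersion("stats"))
```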

Package namespace


Examine packages


1. Who?

 

2. When updated?


3. In development?
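The first two questions can partly be answered from a package's DESCRIPTION file, as sketched below for the `stats` package (chosen only because it is always installed). Whether a package is still in development is usually judged from its source repository instead, which this sketch does not cover.

```r
# Who maintains it, which version is it, and when was it built?
# Fields missing from DESCRIPTION come back as NA.
info <- utils::packageDescription(
  "stats",
  fields = c("Version", "Maintainer", "Built")
)
print(info)

# Shortcut for the maintainer field.
print(utils::maintainer("stats"))
```

Packages installed from CRAN usually also carry a `Date/Publication` field.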

Git and GitHub

Version control?


Why version control?


  1. Collaboration
  2. Storing versions (properly)
  3. Restoring versions
  4. Understanding what happens
  5. Backup

Why git?


  1. Simple to use
  2. Distributed
  3. Fast
  4. Common in practice
  5. R packages use GitHub
  6. Integrated with RStudio

Basic git


GitHub


  1. Remote (push/pull)
  2. Bare-bones homepage (using Markdown)
  3. Collaborations
  4. Issue tracker / Wiki / discussions

Free for public repos

Private repos cost

Student accounts

Creating R packages

Why part of the course?


Writing performant code (best practice)

The way to collaborate (R ecosystem)

Combine code, data and analysis

Easy to distribute and reuse (public api)

 

Learn how to reuse code from other packages

Package structure


R/
man/
vignettes/
tests/
data/
src/
inst/
DESCRIPTION
NAMESPACE
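A minimal sketch of the smallest pieces of this layout, written by hand into a temporary directory. The package name `mypkg`, the function `add_one` and all DESCRIPTION values are placeholders. In practice NAMESPACE is usually generated rather than typed.

```r
pkg_dir <- file.path(tempdir(), "mypkg")   # hypothetical package name
dir.create(file.path(pkg_dir, "R"), recursive = TRUE, showWarnings = FALSE)

# DESCRIPTION: package metadata (placeholder values).
writeLines(c(
  "Package: mypkg",
  "Title: What the Package Does",
  "Version: 0.1.0",
  "Description: A placeholder description for illustration.",
  "License: GPL-3"
), file.path(pkg_dir, "DESCRIPTION"))

# NAMESPACE: what the package exports.
writeLines("export(add_one)", file.path(pkg_dir, "NAMESPACE"))

# R/: the code itself.
writeLines("add_one <- function(x) x + 1",
           file.path(pkg_dir, "R", "add_one.R"))

print(list.files(pkg_dir, recursive = TRUE))
```

The other directories (man/, vignettes/, tests/, data/, src/, inst/) are added only when a package needs them.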

Documentation with roxygen2

Why roxygen2?


  1. Performant code (docs close to code)
  2. Automatically generates all man files
  3. Simple to use
  4. Handles NAMESPACE
  5. Similar to Javadoc and Doxygen

roxygen2 syntax


[example]
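The original example on this slide is not preserved. As a stand-in, here is a minimal sketch of roxygen2 comments on a hypothetical function `add_one`.

```r
#' Add one to a number
#'
#' @param x A numeric vector.
#' @return A numeric vector of the same length as `x`.
#' @examples
#' add_one(1:3)
#' @export
add_one <- function(x) {
  x + 1
}

print(add_one(1:3))
```

Inside a package, running `roxygen2::roxygenise()` or `devtools::document()` turns these comments into a file under man/ and an `export(add_one)` entry in NAMESPACE.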

 

Full support in RStudio

Unit testing with testthat

Why unit testing?


Fewer bugs

Better code structure

Faster restarts

Robust code - correct a bug only once

 

A must in complicated projects!

Types of testing


  1. White box testing
  2. Black box testing
  3. Probabilistic testing

testthat


Unit testing in R

 

Integrated with RStudio

 

[example]
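The original example on this slide is not preserved. A minimal sketch of a testthat test could look like this, assuming the testthat package is installed (the guard skips the test otherwise). The function `add_one` is hypothetical.

```r
add_one <- function(x) x + 1   # hypothetical function under test

if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("add_one increments each element", {
    testthat::expect_equal(add_one(1), 2)
    testthat::expect_equal(add_one(c(-1, 0)), c(0, 1))
    # Non-numeric input should fail rather than return nonsense.
    testthat::expect_error(add_one("a"))
  })
}
```

In a package, tests like this normally live in files under tests/testthat/ and are run for the whole package at once.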

RStudio debugger


Advanced R - Lecture 3

By monsmagn


Lecture 3 in the course Advanced R programming at Linköping University.
