rminiconda
Clean Python management
when using R as an interface to Python
DSC 2019
Interfaces should be convenient
- Programming a package to use an interface should be straightforward
- The users of the application should be largely unaware of the interface
From John Chambers (Extending R)
reticulate accomplishes this very nicely
Except for one thing

166 / 429 issues (~39%)
Why is this a problem?
There are plenty of interfaces with system dependencies
- C, C++, Fortran
- Misc system requirements
- SQL databases, Spark, etc.
Usually not a problem on Mac/Linux, CRAN binaries
Motivated user / installed by system admin
Often easy to install / configure (brew, apt, yum, etc.)
What's special about Python?
reticulate enables R to be not just an interface to Python, but many interfaces to many Python packages
- Different versions of Python
- Many Python packages with different versions and dependencies
- Many environment management approaches
- Python environments are often already configured for other uses outside of your particular R interface use case
Inevitably, at some point a user will need to do something manually on their system, outside of R, to get their Python environment installed or configured properly
rminiconda
- Install miniconda Python in an isolated, "namespaced" location that can be fully customized for your particular use case
- Provides utilities for making this installation and configuration part of an R package setup
- These installations do not interfere with any other Python installation on your system
- Works on Linux, MacOS, and Windows
Goal: Give R users access to anything in Python without them knowing they are using Python
Why miniconda?
- Relatively small
- Self-contained install option
- Fast install
- Easy to install on major platforms
"Miniconda is a free minimal installer for conda. It is a small, bootstrap version of Anaconda that includes only conda, Python, the packages they depend on, and a small number of other useful packages."
Another (messier) way: https://github.com/Sage-Bionetworks/PythonEmbedInR
Install
# install.packages("remotes")
remotes::install_github("hafen/rminiconda")
Use case 1:
I'm a data scientist using reticulate and I want a "clean" separate Python installation
install_miniconda()
Places an isolated miniconda installation in a subdirectory of a base directory housing all installations made through rminiconda
| OS | Base Directory Location | 
|---|---|
| Windows | %APPDATA%\rminiconda | 
| Linux | ~/.rminiconda | 
| MacOS | ~/Library/rminiconda | 
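The table above can be read as a small lookup from operating system to path. The following is a minimal sketch of that lookup in base R; `base_dir_sketch()` is a hypothetical name used for illustration and is not rminiconda's actual implementation.

```r
# Hypothetical helper (not part of rminiconda): map the OS name reported by
# Sys.info() to the base directory listed in the table above.
base_dir_sketch <- function(sysname = Sys.info()[["sysname"]],
                            home = path.expand("~"),
                            appdata = Sys.getenv("APPDATA")) {
  switch(sysname,
    Windows = file.path(appdata, "rminiconda"),
    Linux   = file.path(home, ".rminiconda"),
    Darwin  = file.path(home, "Library", "rminiconda"),  # macOS reports "Darwin"
    stop("Unsupported OS: ", sysname)
  )
}

# Each named installation lives in its own subdirectory of the base directory
file.path(base_dir_sketch("Darwin", home = "/Users/hafen"), "my_python")
# "/Users/hafen/Library/rminiconda/my_python"
```

Keeping every installation under one base directory is what makes the "namespaced" idea work: removing or replacing one named installation does not touch the others.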
Installing miniconda
rminiconda::install_miniconda(version = 2, name = "my_python")
# Using path for conda installation:
#   /Users/hafen/Library/rminiconda/my_python
# Downloading miniconda installer...
# Source: https://repo.anaconda.com/miniconda/Miniconda2-latest-MacOSX-x86_64.sh
# Destination: /Users/hafen/Library/rminiconda/my_python
# trying URL 'https://repo.anaconda.com/miniconda/Miniconda2-latest-MacOSX-x86_64.sh'
# Content type 'application/x-sh' length 44325091 bytes (42.3 MB)
# ==================================================
# downloaded 42.3 MB
#
# By installing, you accept the Conda license:
#   https://conda.io/en/latest/license.html
# Installing isolated miniconda distribution...
# ...
# ...
# miniconda installation successful!
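The log shows that `version = 2` selects a `Miniconda2-latest-...` installer, so `version` is the Miniconda major version rather than an exact Python version. A rough sketch of how such a URL could be assembled follows. `installer_url_sketch()` is a hypothetical helper, and the Linux and Windows file names and the x86_64 architecture are assumptions that do not appear in the slides.

```r
# Hypothetical helper (not rminiconda's actual code): build the installer URL
# from the Miniconda major version and the platform.
installer_url_sketch <- function(version = 3,
                                 sysname = Sys.info()[["sysname"]]) {
  stopifnot(version %in% c(2, 3))
  # File-name suffix per platform; only the MacOSX one appears in the log above
  suffix <- switch(sysname,
    Darwin  = "MacOSX-x86_64.sh",
    Linux   = "Linux-x86_64.sh",      # assumption
    Windows = "Windows-x86_64.exe",   # assumption
    stop("Unsupported OS: ", sysname)
  )
  sprintf("https://repo.anaconda.com/miniconda/Miniconda%d-latest-%s",
          as.integer(version), suffix)
}

installer_url_sketch(2, "Darwin")
# "https://repo.anaconda.com/miniconda/Miniconda2-latest-MacOSX-x86_64.sh"
```

Because the URL points at a "latest" installer, the Python minor version you end up with depends on when the installer is downloaded.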
Using miniconda
# get the path to the binary and set reticulate to use it
py <- rminiconda::find_miniconda_python("my_python")
reticulate::use_python(py, required = TRUE)
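For everyday use, the install and lookup steps can be folded into one idempotent call. This is a sketch only: `use_isolated_python()` is a hypothetical wrapper, built from the rminiconda and reticulate calls shown in these slides, and it assumes both packages are installed.

```r
# Hypothetical convenience wrapper (not part of rminiconda)
use_isolated_python <- function(name, version = 3) {
  # Install only on first use; later calls reuse the existing installation
  if (!rminiconda::is_miniconda_installed(name)) {
    rminiconda::install_miniconda(version = version, name = name)
  }
  # Point reticulate at the isolated interpreter
  py <- rminiconda::find_miniconda_python(name)
  reticulate::use_python(py, required = TRUE)
  invisible(py)
}

# Usage (downloads the installer on the first call):
# use_isolated_python("my_python", version = 2)
```

Call it before anything initializes Python in the session, since reticulate binds to one interpreter per R session.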
Use case 2:
I'm an R package developer and I want my package users to not worry about anything related to Python
Case study: kitools
- Utilities for data scientists working in the "knowledge integration" (ki) group at a large non-profit
- Mostly R users but need to support Python as well
- Complex logic -- don't want to maintain two independent codebases
- Build the package in Python and use reticulate to port it to R
- Our users won't use it if it's not easy to install
zzz.R
#' @import rminiconda
.onLoad <- function(libname, pkgname) {
  # clearing these is a side effect (bad!), but leaving them set can lead to trouble
  Sys.setenv(PYTHONHOME = "")
  Sys.setenv(PYTHONPATH = "")
  # make sure Python is configured for this package
  is_configured(msg = packageStartupMessage)
}
#' Check to see if the kitools Python environment has been configured
#' @param msg What function to use for messages (could be called at package startup or elsewhere in the package)
is_configured <- function(msg = message) {
  # should also check that the required packages are installed
  if (!rminiconda::is_miniconda_installed("kitools")) {
    msg("It appears that kitools has not been configured...")
    msg("Run 'kitools_configure()' for a one-time setup.")
    return (FALSE)
  } else {
    py <- rminiconda::find_miniconda_python("kitools")
    reticulate::use_python(py, required = TRUE)
    return (TRUE)
  }
}
kitools_configure()
#' One-time configuration of environment for kitools
#'
#' @details This installs an isolated Python distribution along with required dependencies so that the kitools R package can seamlessly wrap the kitools Python package.
#' @export
kitools_configure <- function() {
  # install isolated miniconda
  if (!rminiconda::is_miniconda_installed("kitools"))
    rminiconda::install_miniconda(version = 3, name = "kitools")
  # install python packages
  py <- rminiconda::find_miniconda_python("kitools")
  rminiconda::rminiconda_pip_install("beautifultable", "kitools")
  rminiconda::rminiconda_pip_install("synapseclient", "kitools")
  rminiconda::rminiconda_pip_install("kitools", "kitools",
    "-i https://test.pypi.org/simple/ kitools")
  reticulate::use_python(py, required = TRUE)
}
User's Experience
> library(kitools)
# It appears that kitools has not been configured...
# Run 'kitools_configure()' for a one-time setup.
#
# Attaching package: ‘kitools’
>
> kitools_configure()
# Using path for conda installation:
#   /Users/hafen/Library/rminiconda/kitools
# Downloading miniconda installer...
# Source: https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
# Destination: /Users/hafen/Library/rminiconda/kitools
# trying URL 'https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh'
# Content type 'application/x-sh' length 54554851 bytes (52.0 MB)
# ==================================================
# downloaded 52.0 MB
#
# By installing, you accept the Conda license:
#   https://conda.io/en/latest/license.html
# Installing isolated miniconda distribution...
# ...
Considerations
- Disk space: a basic Miniconda install with a few additional packages is ~250 MB
- If developing many packages with a common theme, use and maintain the same rminiconda "namespace"
- Versions: support specific versions of Python? (Miniconda versions do not match Python versions)
- Other convenience functions?
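One candidate convenience function follows from the comment in `is_configured()` that the required packages should also be checked. Below is a minimal sketch, assuming reticulate's `py_module_available()` is the check you want; `missing_py_modules()` is a hypothetical name, and it is an assumption that each module's import name matches its pip name.

```r
# Hypothetical helper: return the required Python modules that cannot be imported.
# `available` defaults to reticulate::py_module_available() and can be replaced
# with any function(module) -> TRUE/FALSE, e.g. for testing without Python.
missing_py_modules <- function(modules,
                               available = reticulate::py_module_available) {
  ok <- vapply(modules, available, logical(1))
  modules[!ok]
}

# Usage, after reticulate::use_python() has been pointed at the isolated install:
# missing_py_modules(c("beautifultable", "synapseclient", "kitools"))
```

An empty result means every module imported; anything else could be reported through the same `msg` function that `is_configured()` already uses.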
Thoughts on reticulate interfaces
- Build wrapped R-natural interfaces on Python packages? Or import classes and use methods?
	- Class methods are a bit unnatural for R users
		- obj$method(...) <-> method(obj, ...)
	- How to document / discover obj$method()?
	- Wrapping is probably always best
	- A best practices document might help set standards for quality
- A few notes on reticulate
	- Issues with interactive input in RStudio IDE
	- Handling Python errors / print methods
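To make the obj$method(...) <-> method(obj, ...) point concrete, here is a small sketch of a wrapper. It uses a plain R list as a stand-in for a Python object imported through reticulate, so `fake_obj` and `describe()` are hypothetical names for illustration, and the error handling shown is one possible approach rather than a recommended standard.

```r
# Stand-in for a Python object: a list whose elements are functions, so
# obj$describe(...) has the same call syntax as a reticulate object.
fake_obj <- list(
  describe = function(prefix = "") paste0(prefix, "a dataset")
)

#' Describe an object
#'
#' A wrapper like this can carry ordinary R documentation, which helps with
#' discovering what obj$describe() does.
#' @param obj An object with a `describe` method
#' @param ... Arguments passed on to the method
describe <- function(obj, ...) {
  tryCatch(
    obj$describe(...),
    error = function(e) {
      # Re-signal with a message phrased in terms of the R function
      stop("describe() failed: ", conditionMessage(e), call. = FALSE)
    }
  )
}

describe(fake_obj, prefix = "This is ")
# "This is a dataset"
```

The wrapper gives R users a function they can find with help and autocomplete, and it gives the package author one place to translate Python errors into R conditions.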
 
Thank You
rminiconda
By Ryan Hafen