Python in RMarkdown

The reticulate package logo.

Using Python in RMarkdown

In order to write blog posts using Python code, I wanted to figure out a way to include Python code chunks in RMarkdowns. When you insert a code chunk in RMarkdown, you have the option of specifying the language of that chunk: the default is R, but you can also insert a Bash, SQL, Python, etc. code chunk.

When I attempted to insert a Python code chunk and import libraries, however, I kept getting the error:

Error in py_run_string_impl(code, local, convert) : ImportError: No module named sklearn.cluster

From running Python in Atom, I knew I had the sklearn.cluster module installed, so the problem must be in the connection between R and Python.

reticulate

The reticulate package in R (website here allows R to interact with Python. I installed the package from RStudio.

# install.packages("reticulate")
library(reticulate)

Changing Python versions

Installing reticulate still didn’t allow me to knit the RMarkdown with a Python code chunk, however. I followed the instructions in this post by Pablo Franco to check the Python version that reticulate was using:

py_discover_config()

I ended up with the following output:

python:         /usr/bin/python
libpython:      /System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/config/libpython2.7.dylib
pythonhome:     /System/Library/Frameworks/Python.framework/Versions/2.7:/System/Library/Frameworks/Python.framework/Versions/2.7
version:        2.7.10 (default, Aug 17 2018, 19:45:58)  [GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.0.42)]
numpy:          /System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/numpy
numpy_version:  1.8.0

I wanted to be running Python version 3.6, which was the version I had installed using Anaconda, so I needed to change the path.

Set-up chunk

I discovered that you can set the path to a different installation of Python by modifying the setup chunk at the start of the RMarkdown. According to the bookdown website, the default used is Python 2.

My default version of this set-up chunk looks like this:

{r setup, include=FALSE}
knitr::opts_chunk$set(collapse = TRUE)

You can set the chunk option engine.path to specify the path to the engine interpreter and change it from the default Python 2.

Finding Python path

I now needed to find the actual path to Python that I wanted to use. I did this by opening up Python separately from RStudio (I used Atom for this) and running the following (I got the code for this from here):

import sys
for p in sys.path:
    print(p)
## /Applications/anaconda3/bin
## /Applications/anaconda3/lib/python37.zip
## /Applications/anaconda3/lib/python3.7
## /Applications/anaconda3/lib/python3.7/lib-dynload
## /Applications/anaconda3/lib/python3.7/site-packages
## /Applications/anaconda3/lib/python3.7/site-packages/aeosa
## /Library/Frameworks/R.framework/Versions/3.5/Resources/library/reticulate/python

From this information, I could tell I wanted to use the path /anaconda3/lib/python3.6, rather than /usr/bin/python, which is what RMarkdown had originally been using. I modified by set-up chunk to look like this:

{r setup, include=FALSE}
knitr::opts_chunk$set(collapse = TRUE, engine.path = list(python = '/anaconda3/bin/python3.6'))

Other options

This solution enabled me to knit RMarkdowns with Python code chunks! It changes the engine interpreter globally, which you could do for multiple engines simultaneously, like Python and Ruby, for example:

knitr::opts_chunk$set(engine.path = list(
  python = '/anaconda3/bin/python3.6',
  ruby = '/usr/local/bin/ruby'
))

Alternatively, you can specify the engine interpreter locally in each code chunk by starting the chunk with {python, engine.path = '/anaconda3/bin/python3.6}, for example.

Avatar
Cecina Babich Morrow
Compass PhD Program

My research interests range from harnessing data to improve mental healthcare to understanding global patterns of macroecology.

comments powered by Disqus

Related