
Converting RMarkdown files to SoS notebooks

  • Difficulty level: easy
  • Time needed to learn: 10 minutes or less
  • Key points:
    • sos convert file.Rmd file.ipynb converts an Rmarkdown file to a SoS notebook. A markdown kernel is used to render markdown text with inline expressions.
    • sos convert file.Rmd file.ipynb --execute also executes the resulting SoS notebook
    • sos convert file.Rmd file.html --execute --template converts an Rmarkdown file to a SoS notebook, executes it, and converts the resulting notebook to HTML format

The RMarkdown format is a markdown format with embedded R expressions and code blocks, and is extremely popular among R users. SoS Notebook provides a utility that converts Rmarkdown files to SoS notebooks with the command

sos convert input.Rmd output.ipynb

with the option to execute the resulting notebook

sos convert input.Rmd output.ipynb --execute

Example files and commands:

Converting R Markdown to SoS Notebook

Although there are already a number of Rmd to Jupyter converters (e.g. notedown, RMD-to-Jupyter (uses rpy2)), they lack support for some Rmarkdown features because of limitations of the Jupyter notebook platform. SoS Notebook, especially its Jupyter Lab extension, addresses most of these limitations and offers an almost perfect conversion from R markdown to Jupyter notebook.

The first Rmarkdown feature that is difficult to convert is inline expressions, which are R expressions embedded in markdown text. Jupyter cannot handle embedded expressions in its markdown cells because markdown cells are handled in its frontend and do not interact with the computing kernel. SoS Notebook addresses this problem with the use of a markdown kernel, which renders markdown text with inline expressions.

For example, the following Rmarkdown text

I counted `r sum(c(1,2,3))` blue cars on the highway.

is converted to a markdown cell that is evaluated in an R kernel as follows

In [1]:

I counted 6 blue cars on the highway.
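
To make the idea concrete, here is a minimal Python sketch of how inline R expressions could be located in a line of Rmarkdown text before being handed to a kernel. It assumes inline expressions are written as `r expr` between single backticks. The helper name find_inline_r is hypothetical, and this is an illustration rather than the code SoS Notebook actually uses.

```python
import re

# Assumption: an inline R expression is written as `r <expr>` between single backticks.
INLINE_R = re.compile(r"`r\s+([^`]+)`")


def find_inline_r(text):
    """Return the R expressions embedded in a markdown string (hypothetical helper)."""
    return [match.strip() for match in INLINE_R.findall(text)]


line = "I counted `r sum(c(1,2,3))` blue cars on the highway."
print(find_inline_r(line))  # ['sum(c(1,2,3))']
```

Ordinary inline code such as `code` is left alone because the pattern requires the leading r followed by whitespace.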

The second Rmarkdown feature is its support for multiple languages, which allows a document to contain code blocks in a number of languages. A Jupyter notebook with an ir kernel can only evaluate R scripts, but a SoS Notebook can include multiple kernels in one notebook.

For example, code blocks such as

{python}
def f(x):
  return x + 2
f(2)

and

{r
def f(x):
  return x + 2
f(2)

are converted to cells with appropriate kernels such as

In [2]:
Out[2]:
4
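
As a rough illustration of how a converter might decide which language a code block belongs to, the Python sketch below reads the language from a chunk header. It assumes headers of the form {lang}, {lang label, key=value}, or a chunk headed by r that selects another language through knitr's engine option. The helper name chunk_language is hypothetical and the mapping from language to a specific Jupyter kernel is deliberately left out, since that is handled by SoS Notebook itself.

```python
import re

# Assumption: a chunk header looks like {lang}, {lang label, key=value},
# or {r engine="python"}.
HEADER = re.compile(r"^\{\s*(\w+)(.*)\}\s*$")
ENGINE = re.compile(r"""engine\s*=\s*["'](\w+)["']""")


def chunk_language(header):
    """Return the language named by an Rmarkdown chunk header (hypothetical helper)."""
    match = HEADER.match(header.strip())
    if match is None:
        raise ValueError(f"not a chunk header: {header!r}")
    language, rest = match.group(1), match.group(2)
    engine = ENGINE.search(rest)
    # An explicit engine option overrides the leading language name.
    return engine.group(1) if engine else language


print(chunk_language("{python}"))
print(chunk_language('{r engine="python"}'))
```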

The last feature that is not properly supported is chunk options such as echo=FALSE and include=FALSE for Rmarkdown code blocks. The classic Jupyter Notebook has no corresponding features, but Jupyter Lab supports hiding the input and/or output of cells. Using these features, code blocks such as the following are converted to cells with collapsed inputs and/or outputs:

{r
arr <- rnorm(5)
cat(arr)
In [3]:
-2.237341 0.1291919 1.126049 0.006253894 0.204086
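
The correspondence between chunk options and cell visibility can be sketched in Python as follows. The sketch relies on the knitr convention that echo=FALSE hides the source code while include=FALSE hides both the code and its output. The helper name cell_visibility and the flag names hide_input and hide_output are hypothetical; they are not the metadata keys that SoS Notebook or Jupyter Lab actually write.

```python
def cell_visibility(options):
    """Map knitr-style chunk options to show/hide flags (hypothetical helper).

    `options` maps option names to their raw string values, e.g. {"echo": "FALSE"}.
    """

    def is_false(name):
        # Options that are absent default to TRUE, as in knitr.
        return options.get(name, "TRUE").strip().upper() in ("FALSE", "F")

    hide_everything = is_false("include")
    return {
        "hide_input": hide_everything or is_false("echo"),
        "hide_output": hide_everything,
    }


print(cell_visibility({"echo": "FALSE"}))
print(cell_visibility({"include": "FALSE"}))
```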

A related problem is that jupyter nbconvert does not respect the collapsed status of cells and renders the input and output of all cells. SoS Notebook addresses this problem by providing templates that honor the show/hide status of cells. For example, the template sos-report-toc-v2 outputs all cells but hides collapsed inputs and outputs by default. The hidden content can be displayed by selecting it from a dropdown box at the top right corner of the document.

Option --execute

Rmarkdown files do not contain outputs from inline expressions and code blocks, so the output.ipynb generated by the command

sos convert input.Rmd output.ipynb

only contains inputs. To obtain a notebook with embedded output, you can add option --execute to the convert command

sos convert input.Rmd output.ipynb --execute

This command converts input.Rmd to a SoS notebook and executes it to generate output.ipynb. It is essentially a shortcut for the commands

sos convert input.Rmd temp_output.ipynb
papermill --engine sos temp_output.ipynb output.ipynb
rm -f temp_output.ipynb

Generate an HTML report from an Rmarkdown file

Command

sos convert input.Rmd output.html --execute

converts input.Rmd to a SoS notebook, executes it, and generates an HTML report (using a template if one is specified). It is essentially a shortcut for the commands

sos convert input.Rmd temp_output.ipynb
papermill --engine sos temp_output.ipynb temp_executed.ipynb
sos convert temp_executed.ipynb output.html
rm -rf temp_output.ipynb temp_executed.ipynb

Note that SoS provides a few templates for generating reports that hide the input and/or output of code blocks, corresponding to the echo=FALSE and include=FALSE options of Rmd code blocks. You can specify a template with options such as --template sos-report-toc-v2. You can see a list of templates provided by SoS here.
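
For scripting the multi-step route, the Python sketch below builds the three commands listed above as argument lists and prints them. It only assembles the commands and does not run anything. The helper name report_commands is hypothetical, and attaching --template to the final sos convert step is an assumption based on the description above.

```python
import shlex


def report_commands(rmd, html, template=None):
    """Build the convert/execute/render pipeline as argument lists (hypothetical helper)."""
    converted = "temp_output.ipynb"
    executed = "temp_executed.ipynb"
    render = ["sos", "convert", executed, html]
    if template:
        # Assumption: the template option belongs to the HTML rendering step.
        render += ["--template", template]
    return [
        ["sos", "convert", rmd, converted],
        ["papermill", "--engine", "sos", converted, executed],
        render,
    ]


for command in report_commands("input.Rmd", "output.html", "sos-report-toc-v2"):
    print(shlex.join(command))
```

Each list can be passed to subprocess.run(command, check=True) to execute the step, after which the two temporary notebooks can be removed.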