Meta RMarkdown - Taxonomy and Use cases

rmarkdown tidyverse

A meta collection of all things R Markdown.

Thomas Mock
07-25-2020
Set of tools laid flat

NYR Presentation

My slides on this topic for the NYC R Conference are at bit.ly/marvelRMD.

PDF Version for folks who want to try out the code-chunks.

How Alison Hill teaches R Markdown

If you haven’t read it already, make sure to read Dr. Alison Hill’s fantastic blogpost:
How I teach R Markdown

Alison is an R Markdown superstar on the RStudio Education team. Her blogpost lays out her well-informed approach to teaching R Markdown.

She has taught:
- College students as a professor across a semester
- In-person professional learners at rstudio::conf in 1-2 day workshops
- Digital Learners in Pharma/Finance/etc via shorter online workshops

To summarize her post:

  1. Make it. Make it again. - Show how knitting works throughout the process.
  2. Make it pretty - Engage your learners with visuals, tables, etc - motivation is key!
  3. Make it snappy - Get a shareable link in the first 20 min (usually via Netlify Drop).
  4. Make it real - “Teach folks what they need to know to actually use the tool productively in real life.”
  5. Make it easy - “People will only keep using R Markdown if they see it making their life easier. So show them how. For example, the RStudio IDE has some very nice built-in features that make it much easier to be an R Markdown user.”

Again - GO READ her blogpost for the additional guides and resources she links to.

My blogpost below is meant to be a sister article to hers. It is framed with a similar approach that we use in Customer Success, but differs in that we’re not doing as much long-form education. Alison’s approach is well-informed and very useful in the context of direct teaching, which is why I wanted to share it as well!

How I share knowledge around R Markdown

I work on a different team than Alison at RStudio: I’m a Customer Success Manager. This means that I work with existing RStudio Pro Product customers, most often people who have RStudio Connect. I work exclusively with High Tech/Software customers, meaning that they are typically already doing very sophisticated work with R in production, and I’m helping them further eliminate friction or empower their data science teams to do more with R.

A core part of my job is knowledge sharing around how to use open-source software like R Markdown with or without our Pro Products. Thus most of my work is Strategic in nature, although I do often give shorter 30-60 min training sessions that are Tactical.

A strategy is a set of guidelines used to achieve an overall objective, whereas tactics are the specific actions aimed at adhering to those guidelines. Source: Wikipedia

Thus my usual framing is covering topics that inform the learner of new strategies (ways of solving a problem) without necessarily having to teach all the tactics (nuts and bolts of how it all works).

This post will focus on 4 core strategies that show why R Markdown is SO useful and absolutely worth learning, with links to external tactics/guides/write-ups on how to accomplish the various tasks.


Cat reading a military strategy book

R Markdown for Literate Programming

Goal: Capture code, text/comments, and output in a single document

This is the most common use of R Markdown, and is often how it is taught in University coursework. R Markdown is a tool for Literate Programming, which in summary is:

A programming paradigm introduced by Donald Knuth in which a computer program is given an explanation of its logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which compilable source code can be generated.

Not just for R

R Markdown obviously has rich support for R-based code and data products, but did you know it also supports:
- Native Python or calling Python from R via reticulate
- SQL - Blog post by Irene Steves
- CSS or JavaScript for all sorts of customization
- As well as Bash, Rcpp, Stan, and other formats
- Altogether there are 52(!) possible language engines coming from knitr
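
To see which engines are available on a given machine, the engine registry can be listed from the R console. This is a minimal sketch that assumes the knitr package is installed; the count it reports depends on the installed knitr version, so it may not match the number quoted above.

```r
# List the language engines registered with the installed knitr version.
library(knitr)

# knit_engines$get() returns a named list with one entry per engine.
engines <- sort(names(knitr::knit_engines$get()))

length(engines)   # how many engines this knitr version registers

# Spot-check a few of the engines mentioned in the list above.
c("python", "sql", "bash", "stan") %in% engines
```

Any name in `engines` can be used in place of `r` in a chunk header, for example a chunk that starts with `{python}` or `{sql}`.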

MVP of Reproducibility

Whether you talk about Minimum Viable Product or Most Valuable Player, it works! Since R Markdown is a form of Literate Programming, you can write all of your comments and notes, and execute your code, within it.

Exploratory Data Analysis

An example here is Dave Robinson’s #TidyTuesday screencasts + code


Man plotting an upward curve

R Markdown as a Data Product

Goal: Generate output natively in R for consumption

This is typically the second most common use of R Markdown. Since R Markdown can knit to all sorts of different formats, it is a powerful tool for creating data products like:

Presentations

Dashboards with flexdashboard

Reports

Entire Websites

Most importantly these formats are created with code, so you get the benefit of reproducibility, automation, etc while still generating data products in the format your non-coder colleagues expect.
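
As a rough illustration of one source producing several deliverables, the sketch below renders a single document to both HTML and Word. It assumes the rmarkdown package and pandoc are installed; the file contents and title are made up for the example.

```r
library(rmarkdown)

# Hypothetical one-paragraph source document, written to a temp file
# so the sketch is self-contained.
src <- tempfile(fileext = ".Rmd")
writeLines(c(
  "---",
  'title: "Quarterly summary"',
  "---",
  "",
  "The mean of 1 through 10 is `r mean(1:10)`."
), src)

# The same source, two formats that non-coder colleagues tend to expect.
formats <- c("html_document", "word_document")

# render() accepts a vector of format names and returns the output paths.
# Rendering needs pandoc, so the call is guarded.
if (rmarkdown::pandoc_available()) {
  outputs <- rmarkdown::render(src, output_format = formats, quiet = TRUE)
  basename(outputs)
}
```

Because the numbers in the text come from inline code, re-rendering after the data changes updates every format at once.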


Child operating mission control

R Markdown as a Control Document

Goal: Scale data science tasks, automate the boring stuff, create robust pipelines

Less widely known, but just as important, is the idea of R Markdown as a meta-document that lets you bring in other code or automate processes.

As it’s much larger in scope than a single bullet point, I’d recommend reading Emily Riederer’s blog post on RMarkdown Driven Development. It’s “an approach of using R Markdown within the larger scope of the analysis engineering concept” presented by Hilary Parker.

A brief summary of her blogpost:

I tend to think of each RMarkdown as having a “data product” (an analytical engine calibrated to answer some specific question) nestled within it. Surfacing this tool just requires a touch of forethought before beginning an analysis and a bit of clean-up afterwards.

In this post, I describe RMarkdown Driven Development: a progression of stages between a single ad-hoc RMarkdown script and more advanced and reusable data products like R projects and packages. This approach has numerous benefits.

Automation w/ parameters

Child Documents

RMarkdown for Emails w/ blastula

R Markdown + RStudio Connect as an execution engine


Gif of a machine creating jelly rolls

RMarkdown for Templating

Goal: Don’t repeat yourself, generate input templates or output documents from code.

Using R Markdown for templating is normally thought of in terms of rmarkdown::render() + parameters, but there are additional techniques for solving specific problems that don’t fit neatly into parameterized reports.

Knitting w/ rmarkdown::render()
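
A minimal sketch of the render + parameters pattern, assuming the rmarkdown package and pandoc are installed. The report text, the `species` parameter, and the `render_species()` helper are hypothetical names chosen for illustration, not code from the slides.

```r
library(rmarkdown)

# Hypothetical parameterized report, written to a temp file so the
# sketch is self-contained. The YAML `params` field sets the default.
report_lines <- c(
  "---",
  'title: "Penguin report"',
  "output: html_document",
  "params:",
  '  species: "Adelie"',
  "---",
  "",
  "This report covers `r params$species` penguins."
)
report_path <- tempfile(fileext = ".Rmd")
writeLines(report_lines, report_path)

# Hypothetical helper: render the report once for a given species,
# overriding the default parameter and naming the output after it.
render_species <- function(species, input = report_path, out_dir = tempdir()) {
  rmarkdown::render(
    input       = input,
    params      = list(species = species),
    output_file = paste0("penguins-", tolower(species), ".html"),
    output_dir  = out_dir,
    quiet       = TRUE
  )
}

# One template, one output file per input value (requires pandoc).
if (rmarkdown::pandoc_available()) {
  outputs <- vapply(c("Adelie", "Chinstrap", "Gentoo"), render_species, character(1))
}
```

The template stays untouched between runs; only the list passed to `params` changes.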

Looping outputs

Minimal example below with the palmerpenguins dataset. Full copy-pastable code at: https://git.io/JJBcC.

Note that I’m writing one function and calling it n times, so it loops across all the data based on the different inputs.


---
output: html_document
---
  
  
```{r penguin function, echo=FALSE, message=FALSE}
library(dplyr)
library(ggplot2)
library(palmerpenguins)
library(glue)
penguins <- palmerpenguins::penguins
multiplot <- function(penguin_name){
  cat(glue::glue("  \n### {penguin_name}  \n  \n"))
  df_pen <- penguins %>% 
    filter(as.character(species) == penguin_name) %>% 
    na.omit()
  
  flipper_len <- df_pen %>% 
    summarize(mean = mean(flipper_length_mm)) %>% 
    pull(mean) %>% 
    round(digits = 1)
  
  bill_len <- df_pen %>% 
    summarize(mean = mean(bill_length_mm)) %>% 
    pull(mean) %>% 
    round(digits = 1)
  
  cat(
    glue::glue("There are {nrow(df_pen)} observations of {penguin_name} penguins. The average flipper length is {flipper_len} and the average bill length is {bill_len}.  \n")
  )
  
  plot_out <- df_pen %>% 
    ggplot(aes(x = bill_length_mm, y = flipper_length_mm)) +
    geom_point() +
    labs(x = "Bill Length", y = "Flipper Length", title = penguin_name)
  
  print(plot_out)
  
  cat("  \n  \n")
}
```

```{r loop output,fig.width=6,echo=FALSE,message=FALSE,results="asis"}
purrr::walk(unique(as.character(penguins$species)), multiplot)
```


Which generates the following document:

Loop Output in RMD

whisker

Minimal whisker example below:

First, some input data:


data <- list(
  name = "Chris", 
  value = 10000, 
  taxed_value = 10000 - (10000 * 0.4), 
  in_ca = TRUE
)

Then a template:


template <-
'Hello {{name}}
You have just won ${{value}}!
{{#in_ca}}
Well, ${{taxed_value}}, after taxes.
{{/in_ca}}'

Now, fill the template!


text <- whisker::whisker.render(template, data)
cat(text)

# Output
Hello Chris
You have just won $10000!
Well, $6000, after taxes.

I use whisker natively to generate the readme files for each week’s #TidyTuesday submission. Separate blog-post to come for that!
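
Until that post exists, here is a rough sketch of how one template could be filled for several inputs. It assumes the whisker package is installed; the template text and the week entries are placeholders made up for illustration, not my actual #TidyTuesday readme template.

```r
library(whisker)

# Hypothetical readme template with three placeholders.
readme_template <- "# TidyTuesday {{year}}, week {{week}}\n\nDataset: {{dataset}}\n"

# Placeholder inputs, one list per readme to generate.
weeks <- list(
  list(year = 2020, week = 30, dataset = "Example dataset A"),
  list(year = 2020, week = 31, dataset = "Example dataset B")
)

# Fill the same template once per input.
readmes <- vapply(
  weeks,
  function(x) whisker.render(readme_template, x),
  character(1)
)

cat(readmes, sep = "\n")
```

Each element of `readmes` could then be written to its own file with `writeLines()`.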

usethis::use_template()

Sharla Gelfand, the “Queen of Reproducible Reporting”, put together lots of material on the usethis::use_template() workflow they use in their work.

Fin

So that’s an overview of my approach to sharing knowledge around R Markdown, and like Alison said:

But remember: there is no one way to learn R Markdown, and no one way to teach it either. I love seeing the creativity of the community when introducing the R Markdown family - so keep them coming!

Citation

For attribution, please cite this work as

Mock (2020, July 25). The Mockup Blog: Meta RMarkdown - Taxonomy and Use cases. Retrieved from https://themockup.blog/posts/2020-07-25-meta-rmarkdown/

BibTeX citation

@misc{mock2020meta,
  author = {Mock, Thomas},
  title = {The Mockup Blog: Meta RMarkdown - Taxonomy and Use cases},
  url = {https://themockup.blog/posts/2020-07-25-meta-rmarkdown/},
  year = {2020}
}