Write replicable papers using the R package “Knitr”

3 minute read

Published:

In this section I will describe my take on literate programming: not the best, not the only one. Just the one I use.

The main idea is to outsource your R code into LaTeX automatically. For example, this way you do not have to copy and paste standard errors, or figures. It will be automatically generated in your PDF file after compiling the .rnw.

install.packages(c("knitr", "formatR", "pacman"), repos = "http://cran.rstudio.com")
  • In Sublime Text 3, modify user settings in LaTeXing. First line adds knitr capabilities, while the second line allows BibTeX compilation.
{"knitr": true, /* activates knitr function */
"quick_build": [ /* activates latexing capabilities? Not sure. */
    {
        "name": "Primary Quick Build: latexmk",
        "primary": true,
        "cmds": ["latexmk"]
    }
                ],
    "keep_focus": true, /* activates skim sync and pop up function */
    "keep_focus_delay": 0.1, /* reduces skim sync time */
    "pdf_viewer_osx": { /* makes sure skim is the default app. User should make sure that the NAME OF the app is correct too (it might change with some updates) */
        "skim": [
            "/Applications/Skim.app"
        ],
        "preview": [
            "/Applications/Preview.app"
        ]
    }

}
  • Writing R and LaTeX code together:

1) In your .rnw file, just generate an environment that includes a tag that’s shared with the R file, like so:

<<chunk:example, echo = FALSE, cache = FALSE>>=

@ 

In this example, the tag is chunk:example. Also, editors and (most) readers are not interested in your coding. That’s fine. The important thing is to make your code available to the audience that needs it. You can hide it by calling the echo = FALSE function in the preamble, i.e. within the <<>>= symbols. Also, the cache = FALSE makes sure that your computer does not store whatever you command it to do (graphs, data, etc.). I think it’s important to set it in FALSE so fresh plots, tables, and datasets are generated every time. This way you are absolutely sure that whatever is in your PDF reflects the latest compile.

2) In your R script, have another environment where you actually host the code, like so:

## ---- chunk:example
set.seed(602)
library(MASS) # to generate multi-var distribution
e <-  as.numeric(mvrnorm(n = 100, mu = 0, Sigma = 1))
plot(e)
##

Thus, every time you compile your .rnw file, R will execute the chunk:example chunk, and insert whatever is there (a table, a plot, a number, etc.).

Some people prefer having the R code within the .rnw file. My R code is always very long, which distracts me from writing. Thus, I prefer to have my R code separated from the .rnw file. It’s easily shareable, and more convenient.

  • Downloadable examples: rnw file and r file. Copy and paste these contents and make sure the file extension is .rnw and .r.