Styling R Code for Better Development


Photo by Rayson Tan on Unsplash

Summary

As a script grows in both length and complexity, it can become a maze unless there are well-marked paths to aid understanding. Consistent styling is crucial to being able to debug problematic code and can help one find the way. This post offers some tips.

Overview

As so eloquently stated in the tidyverse style guide, “Good coding style is like correct punctuation: you can manage without it, butitsuremakesthingseasiertoread.” This post will (1) explain the consensus around styling R code, (2) teach how to use the styler and grkstyle packages, (3) show how to create a keyboard shortcut, and (4) point to GitHub workflows for styling R code.

Background

Inconsistent styling can result in confusion and delayed comprehension. It may further limit a program’s reuse, maintainability, and open-source collaboration.[1] At least two packages are popular for styling R code. This post discusses the styler package. (The other is the formatR package by Yihui Xie.)

There are at least three different style guides for programming in R: Google’s R Style Guide, the Tidyverse Style Guide, and the Bioconductor Coding Style Guide.[1]

In 2019, three researchers released a paper that examined how R code was styled. Their analysis included 20 years of CRAN packages and over 94 million lines of code. They made the following consensus-based recommendations:

  • Use lower camel case (lowerCamel) or snake case (snake_case)
  • Use <- to assign; don’t use =
  • Add a space after commas
  • Use TRUE/FALSE; don’t use T/F
  • Put the opening curly brace on the same line, followed by a newline
  • Use double quotation marks for strings
  • Add spaces around infix operators (e.g., =, +, -, <-)
  • Don’t terminate lines with a semicolon
  • Don’t explicitly type integers (e.g., 1L)
  • Put the closing curly brace on a separate line
  • Don’t use tabs to indent [1]
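
As a rough illustration, the short snippet below follows the recommendations listed above. The function and variable names are hypothetical, chosen only for this example.

```r
# Hypothetical example that follows the consensus recommendations:
# snake_case names, `<-` for assignment, spaces after commas and around
# infix operators, TRUE/FALSE spelled out, double quotes, no semicolons,
# no explicitly typed integers, and space-based indentation.
summarise_scores <- function(scores, drop_missing = TRUE) {
  if (length(scores) == 0) {
    return("no scores")
  }
  average <- mean(scores, na.rm = drop_missing)
  paste("mean score:", round(average, 1))
}

summarise_scores(c(80, 90, NA))
```

Each line maps to at least one bullet, which makes the snippet a handy checklist when reviewing older code.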

styler

The styler package is designed to style R code.[2] Start by installing the package and attaching it with library().

# install and attach the package
install.packages("styler")
library(styler)

The package overview details its main functions: style_pkg(), style_file(), and style_text(). Styling commands are also added to the “Addins” menu in RStudio. For more information, see vignette(package = "styler").
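
As a minimal sketch of style_file(), the snippet below writes an unstyled line to a temporary file, styles it in place, and reads the result back. The scratch file and its contents are made up for illustration. Note that style_file() overwrites the file it is given, so it is safest to try it on a copy or on a file under version control.

```r
library(styler)

# Hypothetical scratch file, so no real script is touched
path <- tempfile(fileext = ".R")
writeLines("x=c(1,2,3)", path)

# style_file() rewrites the file in place, using the tidyverse style by default
style_file(path)

styled <- readLines(path)
styled
```

style_pkg() works the same way but walks the R files of a package, so the same caution about overwriting applies.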

Unstyled

#example - spaces missing
readr::read_csv('./my_table.csv',col_names=c('a','b'))

Styled

#example
style_text("readr::read_csv('./my_table.csv',col_names=c('a','b'))")
readr::read_csv("./my_table.csv", col_names = c("a", "b"))

The tidyverse style guide is a sensible default for writing R code.

# example tidyverse style
styler::style_text("call( 1)", style = tidyverse_style, scope = "spaces")
call(1)
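
The scope argument controls how invasive the styling is. The sketch below, using a made-up input string, contrasts scope = "spaces" with the default scope; it assumes the default tidyverse style.

```r
library(styler)

messy <- "a=c(1,2)"

# scope = "spaces" only adjusts spacing, so the `=` assignment is kept
spaces_only <- as.character(style_text(messy, scope = "spaces"))

# the default scope ("tokens") may also replace tokens, such as `=` with `<-`
full <- as.character(style_text(messy))

spaces_only
full
```

A narrower scope can be a gentle way to introduce styler to an existing code base, since it keeps the resulting diff small.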

Indentation

The tidyverse style indents by two spaces. Indentation is related to line length: a line with several levels of indentation leaves less room for the actual code. Most style guides and linters, like lintr, recommend a maximum line length of 80 characters.[1] Both the indentation and a line-length margin can be set in RStudio.

For indentation, go to RStudio > Preferences > Code > Editing.

Figure 1: Indentation set to 6 spaces. Note that the “insert spaces for tab” is checked as recommended by the study above.

For line-length settings, go to RStudio > Preferences > Code > Display.

Figure 2: Margin setting at 80. Indentation guides can also be a helpful option.
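
The editor margin is only a visual guide. As a minimal sketch in base R, the hypothetical helper below reports which lines of a file exceed a given length; a linter such as lintr performs this kind of check as part of a fuller pass.

```r
# Hypothetical helper: return the line numbers that exceed `limit` characters
find_long_lines <- function(path, limit = 80) {
  lines <- readLines(path, warn = FALSE)
  which(nchar(lines) > limit)
}

# Scratch file with one short line and one 100-character comment line
path <- tempfile(fileext = ".R")
writeLines(c("x <- 1", paste0("# ", strrep("a", 98))), path)

find_long_lines(path)
```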

In production, a two-space indentation can be problematic when writing deeply nested code. The small contrast can make it difficult for a programmer to see the code flow and to match brackets and parentheses. Indentation is a personal preference and should be set to maximize productivity. (My preference is to indent by six spaces while working and then to restyle the code to the tidyverse guide’s two spaces when it is pushed to GitHub.)

Within the styler package, indentation is set with the indent_by argument:

library(styler)
string_to_format <-
"dataset |> dplyr::group_by(some_variable) |> dplyr::summarise(
    mean(mean_some_variable=mean(some_variable)))"
styler::style_text(string_to_format, style = tidyverse_style, indent_by = 6)
dataset |>
      dplyr::group_by(some_variable) |>
      dplyr::summarise(
            mean(mean_some_variable = mean(some_variable))
      )
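
Going the other way, code written with a wide indentation can be restyled to two spaces before it is shared. This is a small sketch with a made-up three-line input; indent_by = 2 matches the tidyverse default and is spelled out only for clarity.

```r
library(styler)

# Code written with a six-space indentation
six_space <- c(
  "if (ready) {",
  "      run()",
  "}"
)

# Restyle to the two-space indentation of the tidyverse style guide
two_space <- as.character(style_text(six_space, indent_by = 2))
two_space
```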

grkstyle

Those who want further refinement can supplement styler with grkstyle. The README in its repository contains the following example:

Unstyled

do_something_very_complicated(something = "that", requires = many,
                              arguments = "some of which may be long")

Styled

do_something_very_complicated(
  something = "that",
  requires = many,
  arguments = "some of which may be long"
) 

To make this configuration the default for the styler addins, add a line like the following to your .Rprofile file:

#https://github.com/gadenbuie/grkstyle
options(styler.addins_style_transformer = "grkstyle::grk_style_transformer(indent_by = 6)")
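
If the same .Rprofile is shared across machines, the option can be set conditionally. The following is a sketch rather than anything from the grkstyle documentation: it wraps the same options() call in a base R check, so the option is only set where grkstyle is installed.

```r
# Sketch for .Rprofile: only point the styler addins at grkstyle when
# the package is actually installed on this machine
if (requireNamespace("grkstyle", quietly = TRUE)) {
  options(
    styler.addins_style_transformer =
      "grkstyle::grk_style_transformer(indent_by = 6)"
  )
}

# Inspect the current setting (NULL if the option has not been set)
addin_style <- getOption("styler.addins_style_transformer")
addin_style
```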

Keyboard Shortcut

RStudio publishes a helpful article on how to modify keyboard shortcuts. Go to Tools > Keyboard Shortcuts Help to see all of the shortcuts. (There are a lot!) You can create a shortcut for any addin; the addins are at the bottom of the list.

Go to Tools > Modify Keyboard Shortcuts. Scroll to the shortcut entitled “Style active file”. Click in the Shortcut column and enter a memorable key combination.

Figure 3: Choose memorable key combination. Here, I used “s” to remind me of the word “style”.

GitHub Workflow

You can also style your code upon a push to GitHub. A library of GitHub workflows for R is maintained at https://github.com/r-lib/actions/tree/v2, including an example workflow that styles code with styler. You can find other examples by searching GitHub for path:.github/workflows styler.
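
A workflow can also use styler as a check rather than a rewrite. The sketch below is not taken from r-lib/actions. It relies on the dry argument of style_dir(), where "fail" writes nothing back and raises an error if any file would change. The demo directory and file are hypothetical stand-ins for a repository checkout.

```r
library(styler)

# Hypothetical stand-in for a repository checkout
repo <- file.path(tempdir(), "styler-check-demo")
dir.create(repo, showWarnings = FALSE)
writeLines("x=c(1,2)", file.path(repo, "messy.R"))

# dry = "fail" leaves files untouched and errors if styling would change them
status <- tryCatch(
  {
    style_dir(repo, dry = "fail")
    "already styled"
  },
  error = function(e) "needs styling"
)
status
```

In a real workflow step, the tryCatch() wrapper would likely be dropped so that the error fails the job, with "." passed as the path.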

Conclusion

Styling your code consistently is key to programming effectively. While you don’t have to follow a style guide, style guides contain helpful standards and accepted norms that allow others to quickly understand your program and potentially collaborate. The styler and grkstyle packages are helpful in that they can be quickly set up to follow the tidyverse style. A keyboard shortcut can further speed your coding. Should you decide to set up a GitHub workflow, additional resources are available.

References

[1]
C.-Y. Yen, M. H.-W. Chang, and C. Chan, “A Computational Analysis of the Dynamics of R Style Based on 94 Million Lines of Code from All CRAN Packages in the Past 20 Years,” 2019.
[2]
K. Müller and L. Walthert, styler: Non-invasive pretty printing of R code. 2022 [Online]. Available: https://CRAN.R-project.org/package=styler
[3]
R Core Team, R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2021 [Online]. Available: https://www.R-project.org/
[4]
Y. Xie, C. Dervieux, and A. Presmanes Hill, blogdown: Create blogs and websites with R Markdown. 2022 [Online]. Available: https://CRAN.R-project.org/package=blogdown
[5]
G. Aden-Buie, grkstyle: A tidy R code style. 2022 [Online]. Available: https://github.com/gadenbuie/grkstyle
[6]
H. Wickham, tidyverse: Easily install and load the tidyverse. 2021 [Online]. Available: https://CRAN.R-project.org/package=tidyverse

Disclaimer

The views, analysis, and conclusions presented in this post are the author’s alone and do not represent those of any other person, organization, or government entity. While I have made every reasonable effort to ensure that the information in this article is correct, it will nonetheless contain errors, inaccuracies, and inconsistencies. It is a working paper subject to revision without notice as additional information becomes available. Any liability is disclaimed as to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause. The author received no financial support for the research, authorship, and/or publication of this article.

Reproducibility

─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.1.0 (2021-05-18)
 os       macOS Big Sur 10.16
 system   x86_64, darwin17.0
 ui       X11
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       America/Chicago
 date     2022-03-29
 pandoc   2.14.1 @ /usr/local/bin/ (via rmarkdown)

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version    date (UTC) lib source
 assertthat    0.2.1      2019-03-21 [1] CRAN (R 4.1.0)
 blogdown    * 1.8        2022-02-16 [1] CRAN (R 4.1.2)
 bookdown      0.25       2022-03-16 [1] CRAN (R 4.1.2)
 brio          1.1.3      2021-11-30 [1] CRAN (R 4.1.0)
 bslib         0.3.1.9000 2022-03-04 [1] Github (rstudio/bslib@888fbe0)
 cachem        1.0.6      2021-08-19 [1] CRAN (R 4.1.0)
 callr         3.7.0      2021-04-20 [1] CRAN (R 4.1.0)
 cli           3.2.0      2022-02-14 [1] CRAN (R 4.1.2)
 codetools     0.2-18     2020-11-04 [1] CRAN (R 4.1.0)
 colorspace    2.0-3      2022-02-21 [1] CRAN (R 4.1.2)
 crayon        1.5.1      2022-03-26 [1] CRAN (R 4.1.0)
 DBI           1.1.2      2021-12-20 [1] CRAN (R 4.1.0)
 desc          1.4.1      2022-03-06 [1] CRAN (R 4.1.2)
 devtools    * 2.4.3      2021-11-30 [1] CRAN (R 4.1.0)
 digest        0.6.29     2021-12-01 [1] CRAN (R 4.1.0)
 dplyr         1.0.8      2022-02-08 [1] CRAN (R 4.1.2)
 ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.1.0)
 evaluate      0.15       2022-02-18 [1] CRAN (R 4.1.2)
 fansi         1.0.3      2022-03-24 [1] CRAN (R 4.1.2)
 fastmap       1.1.0      2021-01-25 [1] CRAN (R 4.1.0)
 fs            1.5.2      2021-12-08 [1] CRAN (R 4.1.0)
 generics      0.1.2      2022-01-31 [1] CRAN (R 4.1.2)
 ggplot2     * 3.3.5      2021-06-25 [1] CRAN (R 4.1.0)
 ggthemes    * 4.2.4      2021-01-20 [1] CRAN (R 4.1.0)
 glue          1.6.2      2022-02-24 [1] CRAN (R 4.1.2)
 gtable        0.3.0      2019-03-25 [1] CRAN (R 4.1.0)
 htmltools     0.5.2      2021-08-25 [1] CRAN (R 4.1.0)
 jquerylib     0.1.4      2021-04-26 [1] CRAN (R 4.1.0)
 jsonlite      1.8.0      2022-02-22 [1] CRAN (R 4.1.2)
 knitr         1.38       2022-03-25 [1] CRAN (R 4.1.0)
 lifecycle     1.0.1      2021-09-24 [1] CRAN (R 4.1.0)
 magrittr      2.0.2      2022-01-26 [1] CRAN (R 4.1.2)
 memoise       2.0.1      2021-11-26 [1] CRAN (R 4.1.0)
 munsell       0.5.0.9000 2021-10-19 [1] Github (cwickham/munsell@e539541)
 pillar        1.7.0      2022-02-01 [1] CRAN (R 4.1.2)
 pkgbuild      1.3.1      2021-12-20 [1] CRAN (R 4.1.0)
 pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.1.0)
 pkgload       1.2.4      2021-11-30 [1] CRAN (R 4.1.0)
 prettycode    1.1.0      2019-12-16 [1] CRAN (R 4.1.0)
 prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.1.0)
 processx      3.5.3      2022-03-25 [1] CRAN (R 4.1.0)
 ps            1.6.0      2021-02-28 [1] CRAN (R 4.1.0)
 purrr         0.3.4      2020-04-17 [1] CRAN (R 4.1.0)
 R.cache       0.15.0     2021-04-30 [1] CRAN (R 4.1.0)
 R.methodsS3   1.8.1      2020-08-26 [1] CRAN (R 4.1.0)
 R.oo          1.24.0     2020-08-26 [1] CRAN (R 4.1.0)
 R.utils       2.11.0     2021-09-26 [1] CRAN (R 4.1.0)
 R6            2.5.1      2021-08-19 [1] CRAN (R 4.1.0)
 remotes       2.4.2      2021-11-30 [1] CRAN (R 4.1.0)
 rlang         1.0.2      2022-03-04 [1] CRAN (R 4.1.2)
 rmarkdown     2.13       2022-03-10 [1] CRAN (R 4.1.2)
 rprojroot     2.0.2      2020-11-15 [1] CRAN (R 4.1.0)
 rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.1.0)
 sass          0.4.1      2022-03-23 [1] CRAN (R 4.1.2)
 scales        1.1.1      2020-05-11 [1] CRAN (R 4.1.0)
 sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.1.0)
 stringi       1.7.6      2021-11-29 [1] CRAN (R 4.1.0)
 stringr       1.4.0      2019-02-10 [1] CRAN (R 4.1.0)
 styler      * 1.7.0      2022-03-13 [1] CRAN (R 4.1.2)
 testthat      3.1.2      2022-01-20 [1] CRAN (R 4.1.2)
 tibble        3.1.6      2021-11-07 [1] CRAN (R 4.1.0)
 tidyselect    1.1.2      2022-02-21 [1] CRAN (R 4.1.2)
 usethis     * 2.1.5      2021-12-09 [1] CRAN (R 4.1.0)
 utf8          1.2.2      2021-07-24 [1] CRAN (R 4.1.0)
 vctrs         0.3.8      2021-04-29 [1] CRAN (R 4.1.0)
 withr         2.5.0      2022-03-03 [1] CRAN (R 4.1.0)
 xfun          0.30       2022-03-02 [1] CRAN (R 4.1.2)
 yaml          2.3.5      2022-02-21 [1] CRAN (R 4.1.2)

 [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library

──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────