
Staying Organized in R


jumbled lego blocks
My code starts to look like this at line 40. Photo by Rick Mason on Unsplash.


Summary

When code starts to run into the dozens of lines, it can begin to look like a jumbled box of legos. The parts all go together, but you can easily forget the order and the ultimate goal of the project. This post offers an organizational strategy for writing code in RStudio.


Overview

When I first started writing code, I might get to line 40 and be really confused about what I was doing and how the code worked. Worse, when I went back to make changes, I had to relearn the code. The pros repeat it ad infinitum: comment your work. Here’s a way to take your commenting to the next level.

Background

A comment starts with #; R ignores everything from the # to the end of the line. To comment or uncomment a series of lines at once, select them, open the “Code” menu, and choose “Comment/Uncomment Lines.” The keyboard shortcut is Ctrl+Shift+C (Cmd+Shift+C on macOS). In YaRrr! The Pirate’s Guide to R, the author states:

I cannot stress enough how important it is to comment your code! Trust me, even if you don’t plan on sharing your code with anyone else, keep in mind that your future self will be reading it in the future.
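As a minimal sketch of the comment styles described above (the variable names `radius` and `area` are made up for illustration), a few lines of R might look like this:

```r
# A whole-line comment: R ignores everything after the # symbol
radius <- 2
area <- pi * radius^2  # an inline comment after code works too

# Commented-out code is skipped when the script runs:
# area <- 0

print(area)
```

Because the second assignment is commented out, `area` keeps the value computed from `radius`.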

Beginning in RStudio version 1.4, comments can be hierarchical, as in an R Markdown document, where the number of # characters sets the level in the outline. For a comment to count as a header, the line must end with at least four consecutive dashes, like this: ----. The headers then appear in RStudio’s document outline pane.
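To make the header levels concrete, here is a small sketch with one header at each of three levels. The section titles and the `scores` data are invented for illustration; only the # prefix and the trailing dashes matter to RStudio.

```r
# 1.0 Load data ----
scores <- c(88, 92, 79)   # made-up example values

## 1.1 Summarise ----
mean_score <- mean(scores)

### 1.1.1 Report ----
cat("Mean score:", mean_score, "\n")
```

In the outline pane, "1.1 Summarise" should appear nested under "1.0 Load data", and "1.1.1 Report" one level deeper.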

Additionally, an R package called bannerCommenter can add some panache to your scripts.[1] The package was written by Bill Venables, who with B.D. Ripley wrote one of the preeminent textbooks, “Modern Applied Statistics with S”.
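To show the idea behind a banner comment without requiring any package, here is a hand-rolled sketch in base R. The helper `make_banner()` is hypothetical and is not bannerCommenter's code; the package provides its own functions, such as `banner()`, with more options. The width of 65 characters is an assumption chosen for this sketch.

```r
# Hypothetical helper (NOT part of bannerCommenter): build a boxed,
# centred comment banner from a character vector of lines.
make_banner <- function(lines, width = 65) {
  inner <- width - 4                       # space between the "##" borders
  stopifnot(all(nchar(lines) <= inner))    # every line must fit in the box
  centre <- function(txt) {
    pad   <- inner - nchar(txt)
    left  <- pad %/% 2                     # spaces before the text
    right <- pad - left                    # remaining spaces after it
    paste0("##", strrep(" ", left), txt, strrep(" ", right), "##")
  }
  border <- strrep("#", width)
  c(border, vapply(lines, centre, character(1), USE.NAMES = FALSE), border)
}

header_banner <- make_banner(c("Author", "date", "url", "contact"))
cat(header_banner, sep = "\n")
```

The printed lines can be pasted at the top of a script. Every line begins with #, so R treats the whole banner as a comment.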

An Example

My workflow pipeline almost always follows the outline below:

#################################################################
##                            Author                           ##
##                            date                             ##
##                             url                             ##
##                           contact                           ##
#################################################################
# 1.0 Libraries ----
## 1.1 tidyverse ----
## 1.2 zoo ----
# 2.0 Get data ----
## 2.1 Source 1 ----
## 2.2 Source 2 ----
# 3.0 Combine the data ----
# 4.0 Create features ----
## 4.1 Column 1 ----
## 4.2 Column 2 ----
# 5.0 Make plots ----
## 5.1 Lineplot ----
### 5.1.1 Save ----
## 5.2 Scatterplot ----
### 5.2.1 ----
# 6.0 Save new dataset ----
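As a rough illustration of how the skeleton above fills in, here is a shortened version that runs with base R only. It is a sketch under stated assumptions: the built-in `mtcars` data stands in for the two sources, the derived columns `kpl` and `heavy` are invented for the example, and output goes to temporary files rather than a project folder.

```r
# 1.0 Libraries ----
## 1.1 stats (ships with R; stands in for tidyverse and zoo) ----
library(stats)

# 2.0 Get data ----
## 2.1 Source 1 ----
cars_a <- head(mtcars, 16)   # first half of a built-in dataset
## 2.2 Source 2 ----
cars_b <- tail(mtcars, 16)   # second half

# 3.0 Combine the data ----
cars <- rbind(cars_a, cars_b)

# 4.0 Create features ----
## 4.1 Column 1 ----
cars$kpl <- cars$mpg * 0.425144           # miles per gallon to km per litre
## 4.2 Column 2 ----
cars$heavy <- cars$wt > median(cars$wt)   # TRUE for heavier-than-median cars

# 5.0 Make plots ----
## 5.1 Scatterplot ----
plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)                            # open a PDF graphics device
plot(cars$wt, cars$kpl, xlab = "Weight (1000 lbs)", ylab = "km per litre")
### 5.1.1 Save ----
dev.off()                                 # closing the device writes the file

# 6.0 Save new dataset ----
out_file <- tempfile(fileext = ".csv")
write.csv(cars, out_file, row.names = FALSE)
```

Each numbered header marks one stage, so a reader can jump from "Get data" to "Make plots" without scrolling through the code in between.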

Workspace


Figure 1: Notice the down arrows beside the line numbers and the button for the outline pane.

The code above was copied and pasted into RStudio’s editor pane and the outline button activated. The outline feature gives a clear map as to the progress of the program.
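To show how header comments map onto an outline, the sketch below pulls the headers out of a script with a regular expression and indents them by level. This is my own approximation in base R, not how RStudio builds its outline; the `script` vector and the pattern are assumptions made for the example.

```r
# A tiny script held as a character vector, one element per line
script <- c(
  "# 1.0 Libraries ----",
  "library(stats)",
  "## 1.1 Helpers ----",
  "x <- 1  # ordinary comment, not a header",
  "# 2.0 Get data ----"
)

# A header starts with one or more # and ends with at least four dashes
is_header <- grepl("^#+ .*-{4,}\\s*$", script, perl = TRUE)
headers   <- script[is_header]

# The number of leading # characters gives the outline level
level <- nchar(sub("^(#+).*", "\\1", headers))

# Strip the leading # characters and the trailing dashes to get the title
title <- sub("^#+\\s*(.*?)\\s*-{4,}\\s*$", "\\1", headers, perl = TRUE)

# Indent each title by two spaces per level below the top
outline <- paste0(strrep("  ", level - 1), title)
cat(outline, sep = "\n")
```

The ordinary inline comment is skipped because its line does not begin with # and does not end in dashes.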

Conclusion

Commenting your code this way has four primary benefits. First, it makes the code more readable and understandable. Second, RStudio’s outline pane organizes the different parts and presents them in a hierarchical fashion. Third, the arrows can be used to collapse the code between comment lines. Lastly, bannerCommenter can be used in conjunction with the outline to bring attention to key features. And if you fail to do it, you’re going to end up with a box of jumbled legos.

Acknowledgements

This blog post was made possible thanks to R [2], blogdown [3], and the tidyverse [4].

References

[1] B. Venables, bannerCommenter: Make banner comments with a consistent format. 2021 [Online]. Available: https://CRAN.R-project.org/package=bannerCommenter
[2] R Core Team, R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2021 [Online]. Available: https://www.R-project.org/
[3] Y. Xie, C. Dervieux, and A. Presmanes Hill, blogdown: Create blogs and websites with R Markdown. 2021.
[4] H. Wickham, tidyverse: Easily install and load the tidyverse. 2021 [Online]. Available: https://CRAN.R-project.org/package=tidyverse

Disclaimer

The views, analysis and conclusions presented within this paper are the author’s alone and not those of any other person, organization or government entity. While I have made every reasonable effort to ensure that the information in this article is correct, it will nonetheless contain errors, inaccuracies and inconsistencies. It is a working paper subject to revision without notice as additional information becomes available. Any liability is disclaimed as to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause. The author(s) received no financial support for the research, authorship, and/or publication of this article.

Reproducibility

─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 4.1.0 (2021-05-18)
 os       macOS Catalina 10.15.7      
 system   x86_64, darwin17.0          
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       America/Chicago             
 date     2021-07-20                  

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version date       lib source                           
 assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.1.0)                   
 backports     1.2.1   2020-12-09 [1] CRAN (R 4.1.0)                   
 blogdown    * 1.3.2   2021-07-06 [1] Github (rstudio/blogdown@00a2090)
 bookdown      0.22    2021-04-22 [1] CRAN (R 4.1.0)                   
 broom         0.7.8   2021-06-24 [1] CRAN (R 4.1.0)                   
 bslib         0.2.5.1 2021-05-18 [1] CRAN (R 4.1.0)                   
 cachem        1.0.5   2021-05-15 [1] CRAN (R 4.1.0)                   
 callr         3.7.0   2021-04-20 [1] CRAN (R 4.1.0)                   
 cellranger    1.1.0   2016-07-27 [1] CRAN (R 4.1.0)                   
 cli           3.0.1   2021-07-17 [1] CRAN (R 4.1.0)                   
 codetools     0.2-18  2020-11-04 [1] CRAN (R 4.1.0)                   
 colorspace    2.0-2   2021-06-24 [1] CRAN (R 4.1.0)                   
 crayon        1.4.1   2021-02-08 [1] CRAN (R 4.1.0)                   
 DBI           1.1.1   2021-01-15 [1] CRAN (R 4.1.0)                   
 dbplyr        2.1.1   2021-04-06 [1] CRAN (R 4.1.0)                   
 desc          1.3.0   2021-03-05 [1] CRAN (R 4.1.0)                   
 devtools    * 2.4.2   2021-06-07 [1] CRAN (R 4.1.0)                   
 digest        0.6.27  2020-10-24 [1] CRAN (R 4.1.0)                   
 dplyr       * 1.0.7   2021-06-18 [1] CRAN (R 4.1.0)                   
 ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.0)                   
 evaluate      0.14    2019-05-28 [1] CRAN (R 4.1.0)                   
 fansi         0.5.0   2021-05-25 [1] CRAN (R 4.1.0)                   
 farver        2.1.0   2021-02-28 [1] CRAN (R 4.1.0)                   
 fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.0)                   
 forcats     * 0.5.1   2021-01-27 [1] CRAN (R 4.1.0)                   
 fs            1.5.0   2020-07-31 [1] CRAN (R 4.1.0)                   
 generics      0.1.0   2020-10-31 [1] CRAN (R 4.1.0)                   
 ggplot2     * 3.3.4   2021-06-16 [1] CRAN (R 4.1.0)                   
 ggthemes    * 4.2.4   2021-01-20 [1] CRAN (R 4.1.0)                   
 glue          1.4.2   2020-08-27 [1] CRAN (R 4.1.0)                   
 gtable        0.3.0   2019-03-25 [1] CRAN (R 4.1.0)                   
 haven         2.4.1   2021-04-23 [1] CRAN (R 4.1.0)                   
 highr         0.9     2021-04-16 [1] CRAN (R 4.1.0)                   
 hms           1.1.0   2021-05-17 [1] CRAN (R 4.1.0)                   
 htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 4.1.0)                   
 httr          1.4.2   2020-07-20 [1] CRAN (R 4.1.0)                   
 jquerylib     0.1.4   2021-04-26 [1] CRAN (R 4.1.0)                   
 jsonlite      1.7.2   2020-12-09 [1] CRAN (R 4.1.0)                   
 knitr         1.33    2021-04-24 [1] CRAN (R 4.1.0)                   
 labeling      0.4.2   2020-10-20 [1] CRAN (R 4.1.0)                   
 lifecycle     1.0.0   2021-02-15 [1] CRAN (R 4.1.0)                   
 lubridate     1.7.10  2021-02-26 [1] CRAN (R 4.1.0)                   
 magrittr    * 2.0.1   2020-11-17 [1] CRAN (R 4.1.0)                   
 memoise       2.0.0   2021-01-26 [1] CRAN (R 4.1.0)                   
 modelr        0.1.8   2020-05-19 [1] CRAN (R 4.1.0)                   
 munsell       0.5.0   2018-06-12 [1] CRAN (R 4.1.0)                   
 pillar        1.6.1   2021-05-16 [1] CRAN (R 4.1.0)                   
 pkgbuild      1.2.0   2020-12-15 [1] CRAN (R 4.1.0)                   
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.1.0)                   
 pkgload       1.2.1   2021-04-06 [1] CRAN (R 4.1.0)                   
 prettyunits   1.1.1   2020-01-24 [1] CRAN (R 4.1.0)                   
 processx      3.5.2   2021-04-30 [1] CRAN (R 4.1.0)                   
 prompt      * 1.0.1   2021-03-12 [1] CRAN (R 4.1.0)                   
 ps            1.6.0   2021-02-28 [1] CRAN (R 4.1.0)                   
 purrr       * 0.3.4   2020-04-17 [1] CRAN (R 4.1.0)                   
 R6            2.5.0   2020-10-28 [1] CRAN (R 4.1.0)                   
 Rcpp          1.0.7   2021-07-07 [1] CRAN (R 4.1.0)                   
 readr       * 1.4.0   2020-10-05 [1] CRAN (R 4.1.0)                   
 readxl        1.3.1   2019-03-13 [1] CRAN (R 4.1.0)                   
 remotes       2.4.0   2021-06-02 [1] CRAN (R 4.1.0)                   
 reprex        2.0.0   2021-04-02 [1] CRAN (R 4.1.0)                   
 rlang         0.4.11  2021-04-30 [1] CRAN (R 4.1.0)                   
 rmarkdown     2.9     2021-06-15 [1] CRAN (R 4.1.0)                   
 roxygen2    * 7.1.1   2020-06-27 [1] CRAN (R 4.1.0)                   
 rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.1.0)                   
 rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)                   
 rvest         1.0.0   2021-03-09 [1] CRAN (R 4.1.0)                   
 sass          0.4.0   2021-05-12 [1] CRAN (R 4.1.0)                   
 scales        1.1.1   2020-05-11 [1] CRAN (R 4.1.0)                   
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 4.1.0)                   
 stringi       1.7.3   2021-07-16 [1] CRAN (R 4.1.0)                   
 stringr     * 1.4.0   2019-02-10 [1] CRAN (R 4.1.0)                   
 testthat      3.0.3   2021-06-16 [1] CRAN (R 4.1.0)                   
 tibble      * 3.1.2   2021-05-16 [1] CRAN (R 4.1.0)                   
 tidyr       * 1.1.3   2021-03-03 [1] CRAN (R 4.1.0)                   
 tidyselect    1.1.1   2021-04-30 [1] CRAN (R 4.1.0)                   
 tidyverse   * 1.3.1   2021-04-15 [1] CRAN (R 4.1.0)                   
 usethis     * 2.0.1   2021-02-10 [1] CRAN (R 4.1.0)                   
 utf8          1.2.1   2021-03-12 [1] CRAN (R 4.1.0)                   
 vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.1.0)                   
 withr         2.4.2   2021-04-18 [1] CRAN (R 4.1.0)                   
 xfun          0.24    2021-06-15 [1] CRAN (R 4.1.0)                   
 xml2          1.3.2   2020-04-23 [1] CRAN (R 4.1.0)                   
 yaml          2.2.1   2020-02-01 [1] CRAN (R 4.1.0)                   

[1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library