
Tidymodel Tutorials As Scripts


The tidymodels packages


Summary

This post is a distillation of the tidymodels tutorials on machine learning. The original formatting was shortened and converted into scripts with comments. The original tutorials are here. This post is basically my notes from those tutorials.


Overview

It had been a while since I’d done any modeling, so it was quite by accident that I discovered RStudio was promoting a new package called tidymodels. Before I stumbled onto it, the caret package, by Max Kuhn, was my default for any modeling. So it was a great comfort to learn that he had released an updated, vegetable-themed package, parsnip.

There’s a set of five tutorials on the tidymodels site to get you started. They’re extremely helpful and worth working through before reading the rest of this post. To absorb the lessons, I distilled the narrative and code blocks into scripts that contain only code and comments. Using the new outline feature in RStudio 1.4 (I hadn’t updated RStudio in a while), I gave each lesson structure with section-heading comments.
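As a rough illustration of that commenting style, here is a minimal sketch of one lesson laid out as a script. It is not taken from the tutorials: the `mtcars` data, the formula, and the object names (`cars`, `lm_spec`, `lm_fit`, `new_cars`, `preds`) are placeholders chosen for illustration, and it assumes the parsnip package is installed. RStudio treats a comment line ending in four or more dashes as a section, and, as far as I can tell, version 1.4 nests sections in the outline according to the number of leading `#` characters.

```r
# Build a model ----

## Packages ----
library(parsnip)  # assumed installed; part of the tidymodels family

## Data ----
# mtcars ships with R; here it stands in for a tutorial data set
cars <- mtcars

## Specify the model ----
# Linear regression using the "lm" engine
lm_spec <- set_engine(linear_reg(), "lm")

## Fit ----
lm_fit <- fit(lm_spec, mpg ~ wt, data = cars)

## Predict ----
# Predict mpg for three hypothetical car weights (in 1000 lbs)
new_cars <- data.frame(wt = c(2.5, 3.0, 3.5))
preds <- predict(lm_fit, new_data = new_cars)
preds
```

With the outline pane open, each `----` heading should appear as an entry you can click to jump to that step.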

The scripts are included as gist embeds so the text can be copied and pasted into your own script. Make sure to hit the outline button in RStudio so you can see the outline headings.


Figure 1: Show document outline button in RStudio

Conclusion

Everybody learns in their own way. Tutorials are really helpful, and the tidymodels tutorials are great. After finishing them, I hope to spend some time with the book, Tidy Modeling with R, and then work through some more examples. No matter how many examples I try, I never feel comfortable modeling data. Converting the narrative format into R scripts should make it easier to copy, paste, and step through the code on your own. Enjoy.

Acknowledgements

This blog post was made possible thanks to:

References

[1] R Core Team, R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing, 2020 [Online]. Available: https://www.R-project.org/
[2] Y. Xie, C. Dervieux, and A. Presmanes Hill, blogdown: Create blogs and websites with R Markdown. 2021 [Online]. Available: https://CRAN.R-project.org/package=blogdown
[3] M. Kuhn and H. Wickham, tidymodels: Easily install and load the tidymodels packages. 2021 [Online]. Available: https://CRAN.R-project.org/package=tidymodels
[4] H. Wickham, tidyverse: Easily install and load the tidyverse. 2019 [Online]. Available: https://CRAN.R-project.org/package=tidyverse

Disclaimer

The views, analysis and conclusions presented within this paper are the author’s alone and not those of any other person, organization or government entity. While I have made every reasonable effort to ensure that the information in this article is correct, it will nonetheless contain errors, inaccuracies and inconsistencies. It is a working paper subject to revision without notice as additional information becomes available. Any liability is disclaimed as to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause. The author(s) received no financial support for the research, authorship, and/or publication of this article.

Reproducibility

─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value                       
 version  R version 3.6.3 (2020-02-29)
 os       macOS Catalina 10.15.7      
 system   x86_64, darwin15.6.0        
 ui       X11                         
 language (EN)                        
 collate  en_US.UTF-8                 
 ctype    en_US.UTF-8                 
 tz       America/Chicago             
 date     2021-05-30                  

─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version date       lib source        
 assertthat    0.2.1   2019-03-21 [1] CRAN (R 3.6.0)
 blogdown    * 1.3     2021-04-14 [1] CRAN (R 3.6.2)
 bookdown      0.21    2020-10-13 [1] CRAN (R 3.6.3)
 bslib         0.2.4   2021-01-25 [1] CRAN (R 3.6.2)
 cachem        1.0.4   2021-02-13 [1] CRAN (R 3.6.2)
 callr         3.5.1   2020-10-13 [1] CRAN (R 3.6.2)
 cli           2.5.0   2021-04-26 [1] CRAN (R 3.6.2)
 codetools     0.2-18  2020-11-04 [1] CRAN (R 3.6.2)
 colorspace    2.0-1   2021-05-04 [1] CRAN (R 3.6.2)
 crayon        1.4.1   2021-02-08 [1] CRAN (R 3.6.2)
 DBI           1.1.1   2021-01-15 [1] CRAN (R 3.6.2)
 desc          1.3.0   2021-03-05 [1] CRAN (R 3.6.3)
 devtools    * 2.3.2   2020-09-18 [1] CRAN (R 3.6.2)
 digest        0.6.27  2020-10-24 [1] CRAN (R 3.6.2)
 dplyr         1.0.5   2021-03-05 [1] CRAN (R 3.6.3)
 ellipsis      0.3.2   2021-04-29 [1] CRAN (R 3.6.2)
 evaluate      0.14    2019-05-28 [1] CRAN (R 3.6.0)
 fansi         0.4.2   2021-01-15 [1] CRAN (R 3.6.2)
 farver        2.1.0   2021-02-28 [1] CRAN (R 3.6.3)
 fastmap       1.1.0   2021-01-25 [1] CRAN (R 3.6.2)
 fs            1.5.0   2020-07-31 [1] CRAN (R 3.6.2)
 generics      0.1.0   2020-10-31 [1] CRAN (R 3.6.2)
 ggplot2     * 3.3.3   2020-12-30 [1] CRAN (R 3.6.2)
 ggthemes    * 4.2.4   2021-01-20 [1] CRAN (R 3.6.2)
 glue          1.4.2   2020-08-27 [1] CRAN (R 3.6.2)
 gtable        0.3.0   2019-03-25 [1] CRAN (R 3.6.0)
 highr         0.8     2019-03-20 [1] CRAN (R 3.6.0)
 htmltools     0.5.1.1 2021-01-22 [1] CRAN (R 3.6.2)
 jquerylib     0.1.3   2020-12-17 [1] CRAN (R 3.6.2)
 jsonlite      1.7.2   2020-12-09 [1] CRAN (R 3.6.2)
 knitr         1.32    2021-04-14 [1] CRAN (R 3.6.2)
 labeling      0.4.2   2020-10-20 [1] CRAN (R 3.6.2)
 lifecycle     1.0.0   2021-02-15 [1] CRAN (R 3.6.2)
 magrittr      2.0.1   2020-11-17 [1] CRAN (R 3.6.2)
 memoise       2.0.0   2021-01-26 [1] CRAN (R 3.6.2)
 munsell       0.5.0   2018-06-12 [1] CRAN (R 3.6.0)
 pillar        1.6.0   2021-04-13 [1] CRAN (R 3.6.2)
 pkgbuild      1.2.0   2020-12-15 [1] CRAN (R 3.6.2)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 3.6.0)
 pkgload       1.2.0   2021-02-23 [1] CRAN (R 3.6.3)
 prettyunits   1.1.1   2020-01-24 [1] CRAN (R 3.6.0)
 processx      3.4.5   2020-11-30 [1] CRAN (R 3.6.2)
 ps            1.6.0   2021-02-28 [1] CRAN (R 3.6.3)
 purrr         0.3.4   2020-04-17 [1] CRAN (R 3.6.2)
 R6            2.5.0   2020-10-28 [1] CRAN (R 3.6.2)
 remotes       2.3.0   2021-04-01 [1] CRAN (R 3.6.2)
 rlang         0.4.11  2021-04-30 [1] CRAN (R 3.6.2)
 rmarkdown     2.7     2021-02-19 [1] CRAN (R 3.6.3)
 rprojroot     2.0.2   2020-11-15 [1] CRAN (R 3.6.2)
 sass          0.3.1   2021-01-24 [1] CRAN (R 3.6.2)
 scales        1.1.1   2020-05-11 [1] CRAN (R 3.6.2)
 sessioninfo   1.1.1   2018-11-05 [1] CRAN (R 3.6.0)
 stringi       1.5.3   2020-09-09 [1] CRAN (R 3.6.2)
 stringr       1.4.0   2019-02-10 [1] CRAN (R 3.6.0)
 testthat      3.0.2   2021-02-14 [1] CRAN (R 3.6.2)
 tibble        3.1.1   2021-04-18 [1] CRAN (R 3.6.2)
 tidyselect    1.1.0   2020-05-11 [1] CRAN (R 3.6.2)
 usethis     * 2.0.1   2021-02-10 [1] CRAN (R 3.6.2)
 utf8          1.2.1   2021-03-12 [1] CRAN (R 3.6.2)
 vctrs         0.3.8   2021-04-29 [1] CRAN (R 3.6.2)
 withr         2.4.2   2021-04-18 [1] CRAN (R 3.6.2)
 xfun          0.22    2021-03-11 [1] CRAN (R 3.6.2)
 yaml          2.2.1   2020-02-01 [1] CRAN (R 3.6.0)

[1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library
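
A listing in this format can be produced with the sessioninfo package. The lines below are a minimal sketch, assuming sessioninfo is installed; whether the listing above was generated exactly this way is an assumption (`devtools::session_info()` prints the same kind of report).

```r
# Assumes the sessioninfo package is installed
info <- sessioninfo::session_info()

# Printing shows the "Session info" and "Packages" blocks
print(info)
```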