Summary
This post is a distillation of the tidymodels tutorials on machine learning. Formatting from the original was shortened and converted to a script with comments. Original tutorials are here. This post is basically my notes from those tutorials.
Overview
It had been a while since I’d done any modeling, so it was quite by accident that I discovered RStudio was promoting a new package called tidymodels. Prior to stumbling onto the new package, the caret package, by Max Kuhn, was my default for any modeling that I did. So it was a great comfort to learn that he had released an updated, vegetable-themed package, parsnip.
There’s a set of five tutorials from tidymodels to get you started. They’re extremely helpful and should be viewed before reading further in this post. In order to absorb the lessons, I essentially distilled the narrative and code blocks into scripts that contain only code and comments. With the new outline feature in RStudio 1.4 (I hadn’t updated RStudio in a while), the new commenting paradigm gave structure to each lesson.
The scripts have been included as gist embeds so the text can be copied and pasted into your own script. Make sure to hit the outline button in RStudio so you can see the outline headings.
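For readers who haven’t used the outline before, here is a minimal sketch of the commenting style, not one of the actual gists. In RStudio, a comment line that ends in four or more dashes becomes a section heading, and in 1.4 the number of leading `#` characters sets the nesting level in the outline. The `mtcars` data, the `mpg ~ wt` formula, and the object names `lm_fit` and `coefs` are my own assumptions for illustration and are not taken from the tutorials.

```r
# Build a model ----
# Assumes the tidymodels package is installed.
library(tidymodels)

## Data ----
# mtcars ships with base R; it stands in for the tutorial data here.
data(mtcars)

## Specify and fit ----
# parsnip separates the model type (linear_reg) from the engine ("lm").
lm_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ wt, data = mtcars)

## Inspect ----
# tidy() returns the coefficients as a tibble, one row per term.
coefs <- tidy(lm_fit)
coefs
```

With this pasted into a script, the outline pane should show "Build a model" at the top level with "Data", "Specify and fit", and "Inspect" nested beneath it.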
Conclusion
Everybody learns in their own way. Tutorials are really helpful and the tidymodels tutorials are great. After finishing those up, I hope to spend some time with the book, Tidy Modeling with R, and then do some more examples. No matter how many examples I try, I don’t ever feel comfortable modeling data. Converting the narrative format into R scripts should make it easier to copy, paste, and step through the code on your own. Enjoy.
Acknowledgements
This blog post was made possible thanks to:
Tidymodels: tidy machine learning in R by Rebecca Barter
References
Disclaimer
The views, analysis and conclusions presented within this paper are the author’s alone and not those of any other person, organization or government entity. While I have made every reasonable effort to ensure that the information in this article is correct, it may nonetheless contain errors, inaccuracies and inconsistencies. It is a working paper subject to revision without notice as additional information becomes available. Any liability is disclaimed as to any party for any loss, damage, or disruption caused by errors or omissions, whether such errors or omissions result from negligence, accident, or any other cause. The author(s) received no financial support for the research, authorship, and/or publication of this article.
Reproducibility
─ Session info ───────────────────────────────────────────────────────────────────────────────────────────────────────
setting value
version R version 3.6.3 (2020-02-29)
os macOS Catalina 10.15.7
system x86_64, darwin15.6.0
ui X11
language (EN)
collate en_US.UTF-8
ctype en_US.UTF-8
tz America/Chicago
date 2021-05-30
─ Packages ───────────────────────────────────────────────────────────────────────────────────────────────────────────
package * version date lib source
assertthat 0.2.1 2019-03-21 [1] CRAN (R 3.6.0)
blogdown * 1.3 2021-04-14 [1] CRAN (R 3.6.2)
bookdown 0.21 2020-10-13 [1] CRAN (R 3.6.3)
bslib 0.2.4 2021-01-25 [1] CRAN (R 3.6.2)
cachem 1.0.4 2021-02-13 [1] CRAN (R 3.6.2)
callr 3.5.1 2020-10-13 [1] CRAN (R 3.6.2)
cli 2.5.0 2021-04-26 [1] CRAN (R 3.6.2)
codetools 0.2-18 2020-11-04 [1] CRAN (R 3.6.2)
colorspace 2.0-1 2021-05-04 [1] CRAN (R 3.6.2)
crayon 1.4.1 2021-02-08 [1] CRAN (R 3.6.2)
DBI 1.1.1 2021-01-15 [1] CRAN (R 3.6.2)
desc 1.3.0 2021-03-05 [1] CRAN (R 3.6.3)
devtools * 2.3.2 2020-09-18 [1] CRAN (R 3.6.2)
digest 0.6.27 2020-10-24 [1] CRAN (R 3.6.2)
dplyr 1.0.5 2021-03-05 [1] CRAN (R 3.6.3)
ellipsis 0.3.2 2021-04-29 [1] CRAN (R 3.6.2)
evaluate 0.14 2019-05-28 [1] CRAN (R 3.6.0)
fansi 0.4.2 2021-01-15 [1] CRAN (R 3.6.2)
farver 2.1.0 2021-02-28 [1] CRAN (R 3.6.3)
fastmap 1.1.0 2021-01-25 [1] CRAN (R 3.6.2)
fs 1.5.0 2020-07-31 [1] CRAN (R 3.6.2)
generics 0.1.0 2020-10-31 [1] CRAN (R 3.6.2)
ggplot2 * 3.3.3 2020-12-30 [1] CRAN (R 3.6.2)
ggthemes * 4.2.4 2021-01-20 [1] CRAN (R 3.6.2)
glue 1.4.2 2020-08-27 [1] CRAN (R 3.6.2)
gtable 0.3.0 2019-03-25 [1] CRAN (R 3.6.0)
highr 0.8 2019-03-20 [1] CRAN (R 3.6.0)
htmltools 0.5.1.1 2021-01-22 [1] CRAN (R 3.6.2)
jquerylib 0.1.3 2020-12-17 [1] CRAN (R 3.6.2)
jsonlite 1.7.2 2020-12-09 [1] CRAN (R 3.6.2)
knitr 1.32 2021-04-14 [1] CRAN (R 3.6.2)
labeling 0.4.2 2020-10-20 [1] CRAN (R 3.6.2)
lifecycle 1.0.0 2021-02-15 [1] CRAN (R 3.6.2)
magrittr 2.0.1 2020-11-17 [1] CRAN (R 3.6.2)
memoise 2.0.0 2021-01-26 [1] CRAN (R 3.6.2)
munsell 0.5.0 2018-06-12 [1] CRAN (R 3.6.0)
pillar 1.6.0 2021-04-13 [1] CRAN (R 3.6.2)
pkgbuild 1.2.0 2020-12-15 [1] CRAN (R 3.6.2)
pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 3.6.0)
pkgload 1.2.0 2021-02-23 [1] CRAN (R 3.6.3)
prettyunits 1.1.1 2020-01-24 [1] CRAN (R 3.6.0)
processx 3.4.5 2020-11-30 [1] CRAN (R 3.6.2)
ps 1.6.0 2021-02-28 [1] CRAN (R 3.6.3)
purrr 0.3.4 2020-04-17 [1] CRAN (R 3.6.2)
R6 2.5.0 2020-10-28 [1] CRAN (R 3.6.2)
remotes 2.3.0 2021-04-01 [1] CRAN (R 3.6.2)
rlang 0.4.11 2021-04-30 [1] CRAN (R 3.6.2)
rmarkdown 2.7 2021-02-19 [1] CRAN (R 3.6.3)
rprojroot 2.0.2 2020-11-15 [1] CRAN (R 3.6.2)
sass 0.3.1 2021-01-24 [1] CRAN (R 3.6.2)
scales 1.1.1 2020-05-11 [1] CRAN (R 3.6.2)
sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 3.6.0)
stringi 1.5.3 2020-09-09 [1] CRAN (R 3.6.2)
stringr 1.4.0 2019-02-10 [1] CRAN (R 3.6.0)
testthat 3.0.2 2021-02-14 [1] CRAN (R 3.6.2)
tibble 3.1.1 2021-04-18 [1] CRAN (R 3.6.2)
tidyselect 1.1.0 2020-05-11 [1] CRAN (R 3.6.2)
usethis * 2.0.1 2021-02-10 [1] CRAN (R 3.6.2)
utf8 1.2.1 2021-03-12 [1] CRAN (R 3.6.2)
vctrs 0.3.8 2021-04-29 [1] CRAN (R 3.6.2)
withr 2.4.2 2021-04-18 [1] CRAN (R 3.6.2)
xfun 0.22 2021-03-11 [1] CRAN (R 3.6.2)
yaml 2.2.1 2020-02-01 [1] CRAN (R 3.6.0)
[1] /Library/Frameworks/R.framework/Versions/3.6/Resources/library