diff --git a/session-tidymodels/_quarto.yml b/session-tidymodels/_quarto.yml
index 8243c67..23333ce 100644
--- a/session-tidymodels/_quarto.yml
+++ b/session-tidymodels/_quarto.yml
@@ -10,11 +10,10 @@ book:
     - index.qmd
     - intro.qmd
     - case-study.qmd
-    - exercises.qmd
+    #- exercises.qmd
     
 bibliography: references.bib
 
-
 format:
   html:
     theme: spacebar
diff --git a/session-tidymodels/case-study.qmd b/session-tidymodels/case-study.qmd
index bd09984..e0d69b9 100644
--- a/session-tidymodels/case-study.qmd
+++ b/session-tidymodels/case-study.qmd
@@ -134,7 +134,6 @@ hist(data_diabetes$BMI, xlab = "", main = "BMI: all", 50)
 hist(data_other$BMI, xlab = "", main = "BMI: non-test", 50)
 hist(data_test$BMI, xlab = "", main = "BMI: test", 50)
 
-
 ```
 
 ## Feature engineering
@@ -236,7 +235,8 @@ model_tune %>%
   
 # best lambda value (min. RMSE)
 model_best <- model_tune %>%
-  select_best("rmse")
+  select_best(metric = "rmse")
+
 print(model_best)
 
 # finalize workflow with tuned model
diff --git a/session-tidymodels/docs/case-study.html b/session-tidymodels/docs/case-study.html
new file mode 100644
index 0000000..700e941
--- /dev/null
+++ b/session-tidymodels/docs/case-study.html
@@ -0,0 +1,785 @@
+<!DOCTYPE html>
+<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en"><head>
+
+<meta charset="utf-8">
+<meta name="generator" content="quarto-1.3.450">
+
+<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
+
+
+<title>Tidymodels - 2&nbsp; Demo: a predictive modelling case study</title>
+<style>
+code{white-space: pre-wrap;}
+span.smallcaps{font-variant: small-caps;}
+div.columns{display: flex; gap: min(4vw, 1.5em);}
+div.column{flex: auto; overflow-x: auto;}
+div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
+ul.task-list{list-style: none;}
+ul.task-list li input[type="checkbox"] {
+  width: 0.8em;
+  margin: 0 0.8em 0.2em -1em; /* quarto-specific, see https://github.com/quarto-dev/quarto-cli/issues/4556 */ 
+  vertical-align: middle;
+}
+/* CSS for syntax highlighting */
+pre > code.sourceCode { white-space: pre; position: relative; }
+pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
+pre > code.sourceCode > span:empty { height: 1.2em; }
+.sourceCode { overflow: visible; }
+code.sourceCode > span { color: inherit; text-decoration: inherit; }
+div.sourceCode { margin: 1em 0; }
+pre.sourceCode { margin: 0; }
+@media screen {
+div.sourceCode { overflow: auto; }
+}
+@media print {
+pre > code.sourceCode { white-space: pre-wrap; }
+pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
+}
+pre.numberSource code
+  { counter-reset: source-line 0; }
+pre.numberSource code > span
+  { position: relative; left: -4em; counter-increment: source-line; }
+pre.numberSource code > span > a:first-child::before
+  { content: counter(source-line);
+    position: relative; left: -1em; text-align: right; vertical-align: baseline;
+    border: none; display: inline-block;
+    -webkit-touch-callout: none; -webkit-user-select: none;
+    -khtml-user-select: none; -moz-user-select: none;
+    -ms-user-select: none; user-select: none;
+    padding: 0 4px; width: 4em;
+  }
+pre.numberSource { margin-left: 3em;  padding-left: 4px; }
+div.sourceCode
+  {   }
+@media screen {
+pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
+}
+</style>
+
+
+<script src="site_libs/quarto-nav/quarto-nav.js"></script>
+<script src="site_libs/quarto-nav/headroom.min.js"></script>
+<script src="site_libs/clipboard/clipboard.min.js"></script>
+<script src="site_libs/quarto-search/autocomplete.umd.js"></script>
+<script src="site_libs/quarto-search/fuse.min.js"></script>
+<script src="site_libs/quarto-search/quarto-search.js"></script>
+<meta name="quarto:offset" content="./">
+<link href="./intro.html" rel="prev">
+<script src="site_libs/quarto-html/quarto.js"></script>
+<script src="site_libs/quarto-html/popper.min.js"></script>
+<script src="site_libs/quarto-html/tippy.umd.min.js"></script>
+<script src="site_libs/quarto-html/anchor.min.js"></script>
+<link href="site_libs/quarto-html/tippy.css" rel="stylesheet">
+<link href="site_libs/quarto-html/quarto-syntax-highlighting.css" rel="stylesheet" id="quarto-text-highlighting-styles">
+<script src="site_libs/bootstrap/bootstrap.min.js"></script>
+<link href="site_libs/bootstrap/bootstrap-icons.css" rel="stylesheet">
+<link href="site_libs/bootstrap/bootstrap.min.css" rel="stylesheet" id="quarto-bootstrap" data-mode="light">
+<script id="quarto-search-options" type="application/json">{
+  "location": "sidebar",
+  "copy-button": false,
+  "collapse-after": 3,
+  "panel-placement": "start",
+  "type": "textbox",
+  "limit": 20,
+  "language": {
+    "search-no-results-text": "No results",
+    "search-matching-documents-text": "matching documents",
+    "search-copy-link-title": "Copy link to search",
+    "search-hide-matches-text": "Hide additional matches",
+    "search-more-match-text": "more match in this document",
+    "search-more-matches-text": "more matches in this document",
+    "search-clear-button-title": "Clear",
+    "search-detached-cancel-button-title": "Cancel",
+    "search-submit-button-title": "Submit",
+    "search-label": "Search"
+  }
+}</script>
+
+
+</head>
+
+<body class="nav-sidebar floating">
+
+<div id="quarto-search-results"></div>
+  <header id="quarto-header" class="headroom fixed-top">
+  <nav class="quarto-secondary-nav">
+    <div class="container-fluid d-flex">
+      <button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar,#quarto-sidebar-glass" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
+        <i class="bi bi-layout-text-sidebar-reverse"></i>
+      </button>
+      <nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"><li class="breadcrumb-item"><a href="./case-study.html"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">Demo: a predictive modelling case study</span></a></li></ol></nav>
+      <a class="flex-grow-1" role="button" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar,#quarto-sidebar-glass" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">      
+      </a>
+      <button type="button" class="btn quarto-search-button" aria-label="" onclick="window.quartoOpenSearch();">
+        <i class="bi bi-search"></i>
+      </button>
+    </div>
+  </nav>
+</header>
+<!-- content -->
+<div id="quarto-content" class="quarto-container page-columns page-rows-contents page-layout-article">
+<!-- sidebar -->
+  <nav id="quarto-sidebar" class="sidebar collapse collapse-horizontal sidebar-navigation floating overflow-auto">
+    <div class="pt-lg-2 mt-2 text-left sidebar-header">
+    <div class="sidebar-title mb-0 py-0">
+      <a href="./">Tidymodels</a> 
+    </div>
+      </div>
+        <div class="mt-2 flex-shrink-0 align-items-center">
+        <div class="sidebar-search">
+        <div id="quarto-search" class="" title="Search"></div>
+        </div>
+        </div>
+    <div class="sidebar-menu-container"> 
+    <ul class="list-unstyled mt-1">
+        <li class="sidebar-item">
+  <div class="sidebar-item-container"> 
+  <a href="./index.html" class="sidebar-item-text sidebar-link">
+ <span class="menu-text">Preface</span></a>
+  </div>
+</li>
+        <li class="sidebar-item">
+  <div class="sidebar-item-container"> 
+  <a href="./intro.html" class="sidebar-item-text sidebar-link">
+ <span class="menu-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Introduction to Tidymodels</span></span></a>
+  </div>
+</li>
+        <li class="sidebar-item">
+  <div class="sidebar-item-container"> 
+  <a href="./case-study.html" class="sidebar-item-text sidebar-link active">
+ <span class="menu-text"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">Demo: a predictive modelling case study</span></span></a>
+  </div>
+</li>
+    </ul>
+    </div>
+</nav>
+<div id="quarto-sidebar-glass" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar,#quarto-sidebar-glass"></div>
+<!-- margin-sidebar -->
+    <div id="quarto-margin-sidebar" class="sidebar margin-sidebar">
+        <nav id="TOC" role="doc-toc" class="toc-active">
+    <h2 id="toc-title">Table of contents</h2>
+   
+  <ul>
+  <li><a href="#data-import-eda" id="toc-data-import-eda" class="nav-link active" data-scroll-target="#data-import-eda"><span class="header-section-number">2.1</span> Data import &amp; EDA</a></li>
+  <li><a href="#data-splitting" id="toc-data-splitting" class="nav-link" data-scroll-target="#data-splitting"><span class="header-section-number">2.2</span> Data splitting</a></li>
+  <li><a href="#feature-engineering" id="toc-feature-engineering" class="nav-link" data-scroll-target="#feature-engineering"><span class="header-section-number">2.3</span> Feature engineering</a></li>
+  <li><a href="#lasso-regression" id="toc-lasso-regression" class="nav-link" data-scroll-target="#lasso-regression"><span class="header-section-number">2.4</span> Lasso regression</a></li>
+  </ul>
+</nav>
+    </div>
+<!-- main -->
+<main class="content" id="quarto-document-content">
+
+<header id="title-block-header" class="quarto-title-block default">
+<div class="quarto-title">
+<h1 class="title"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">Demo: a predictive modelling case study</span></h1>
+</div>
+
+
+
+<div class="quarto-title-meta">
+
+    
+  
+    
+  </div>
+  
+
+</header>
+
+<p>Let’s use the <code>tidymodels</code> framework to run a small case study in which we build a predictive model for BMI using our <code>diabetes</code> data set. We will use:</p>
+<ul>
+<li><p><code>rsample</code> for splitting data into test and non-test sets, as well as creating cross-validation folds</p></li>
+<li><p><code>recipes</code> for feature engineering, e.g.&nbsp;changing from imperial to metric measurements, removing irrelevant and highly correlated features</p></li>
+<li><p><code>parsnip</code> to specify the Lasso regression model</p></li>
+<li><p><code>tune</code> to search for the best lambda value (the L1 penalty)</p></li>
+<li><p><code>yardstick</code> to assess predictions</p></li>
+<li><p><code>workflows</code> to put all the steps together</p></li>
+</ul>
+<section id="data-import-eda" class="level2" data-number="2.1">
+<h2 data-number="2.1" class="anchored" data-anchor-id="data-import-eda"><span class="header-section-number">2.1</span> Data import &amp; EDA</h2>
+<div class="cell" data-fig-cap-location="margin">
+<div class="sourceCode cell-code" id="cb1"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co"># load libraries</span></span>
+<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(tidyverse)</span>
+<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(tidymodels)</span>
+<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ggcorrplot)</span>
+<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(reshape2)</span>
+<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(vip)</span>
+<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a><span class="co"># import raw data</span></span>
+<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>input_diabetes <span class="ot">&lt;-</span> <span class="fu">read_csv</span>(<span class="st">"data/data-diabetes.csv"</span>)</span>
+<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a><span class="co"># create BMI variable</span></span>
+<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>conv_factor <span class="ot">&lt;-</span> <span class="dv">703</span> <span class="co"># conversion factor to calculate BMI from pounds and inches: BMI = weight (lb) / height (in)^2 * 703</span></span>
+<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>data_diabetes <span class="ot">&lt;-</span> input_diabetes <span class="sc">%&gt;%</span></span>
+<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a>  <span class="fu">mutate</span>(<span class="at">BMI =</span> weight <span class="sc">/</span> height<span class="sc">^</span><span class="dv">2</span> <span class="sc">*</span> conv_factor, <span class="at">BMI =</span> <span class="fu">round</span>(BMI, <span class="dv">2</span>)) <span class="sc">%&gt;%</span></span>
+<span id="cb1-15"><a href="#cb1-15" aria-hidden="true" tabindex="-1"></a>  <span class="fu">relocate</span>(BMI, <span class="at">.after =</span> id)</span>
+<span id="cb1-16"><a href="#cb1-16" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-17"><a href="#cb1-17" aria-hidden="true" tabindex="-1"></a><span class="co"># preview data</span></span>
+<span id="cb1-18"><a href="#cb1-18" aria-hidden="true" tabindex="-1"></a><span class="fu">glimpse</span>(data_diabetes)</span>
+<span id="cb1-19"><a href="#cb1-19" aria-hidden="true" tabindex="-1"></a><span class="do">## Rows: 403</span></span>
+<span id="cb1-20"><a href="#cb1-20" aria-hidden="true" tabindex="-1"></a><span class="do">## Columns: 20</span></span>
+<span id="cb1-21"><a href="#cb1-21" aria-hidden="true" tabindex="-1"></a><span class="do">## $ id       &lt;dbl&gt; 1000, 1001, 1002, 1003, 1005, 1008, 1011, 1015, 1016, 1022, 1…</span></span>
+<span id="cb1-22"><a href="#cb1-22" aria-hidden="true" tabindex="-1"></a><span class="do">## $ BMI      &lt;dbl&gt; 22.13, 37.42, 48.37, 18.64, 27.82, 26.50, 28.20, 34.33, 24.51…</span></span>
+<span id="cb1-23"><a href="#cb1-23" aria-hidden="true" tabindex="-1"></a><span class="do">## $ chol     &lt;dbl&gt; 203, 165, 228, 78, 249, 248, 195, 227, 177, 263, 242, 215, 23…</span></span>
+<span id="cb1-24"><a href="#cb1-24" aria-hidden="true" tabindex="-1"></a><span class="do">## $ stab.glu &lt;dbl&gt; 82, 97, 92, 93, 90, 94, 92, 75, 87, 89, 82, 128, 75, 79, 76, …</span></span>
+<span id="cb1-25"><a href="#cb1-25" aria-hidden="true" tabindex="-1"></a><span class="do">## $ hdl      &lt;dbl&gt; 56, 24, 37, 12, 28, 69, 41, 44, 49, 40, 54, 34, 36, 46, 30, 4…</span></span>
+<span id="cb1-26"><a href="#cb1-26" aria-hidden="true" tabindex="-1"></a><span class="do">## $ ratio    &lt;dbl&gt; 3.6, 6.9, 6.2, 6.5, 8.9, 3.6, 4.8, 5.2, 3.6, 6.6, 4.5, 6.3, 6…</span></span>
+<span id="cb1-27"><a href="#cb1-27" aria-hidden="true" tabindex="-1"></a><span class="do">## $ glyhb    &lt;dbl&gt; 4.31, 4.44, 4.64, 4.63, 7.72, 4.81, 4.84, 3.94, 4.84, 5.78, 4…</span></span>
+<span id="cb1-28"><a href="#cb1-28" aria-hidden="true" tabindex="-1"></a><span class="do">## $ location &lt;chr&gt; "Buckingham", "Buckingham", "Buckingham", "Buckingham", "Buck…</span></span>
+<span id="cb1-29"><a href="#cb1-29" aria-hidden="true" tabindex="-1"></a><span class="do">## $ age      &lt;dbl&gt; 46, 29, 58, 67, 64, 34, 30, 37, 45, 55, 60, 38, 27, 40, 36, 3…</span></span>
+<span id="cb1-30"><a href="#cb1-30" aria-hidden="true" tabindex="-1"></a><span class="do">## $ gender   &lt;chr&gt; "female", "female", "female", "male", "male", "male", "male",…</span></span>
+<span id="cb1-31"><a href="#cb1-31" aria-hidden="true" tabindex="-1"></a><span class="do">## $ height   &lt;dbl&gt; 62, 64, 61, 67, 68, 71, 69, 59, 69, 63, 65, 58, 60, 59, 69, 6…</span></span>
+<span id="cb1-32"><a href="#cb1-32" aria-hidden="true" tabindex="-1"></a><span class="do">## $ weight   &lt;dbl&gt; 121, 218, 256, 119, 183, 190, 191, 170, 166, 202, 156, 195, 1…</span></span>
+<span id="cb1-33"><a href="#cb1-33" aria-hidden="true" tabindex="-1"></a><span class="do">## $ frame    &lt;chr&gt; "medium", "large", "large", "large", "medium", "large", "medi…</span></span>
+<span id="cb1-34"><a href="#cb1-34" aria-hidden="true" tabindex="-1"></a><span class="do">## $ bp.1s    &lt;dbl&gt; 118, 112, 190, 110, 138, 132, 161, NA, 160, 108, 130, 102, 13…</span></span>
+<span id="cb1-35"><a href="#cb1-35" aria-hidden="true" tabindex="-1"></a><span class="do">## $ bp.1d    &lt;dbl&gt; 59, 68, 92, 50, 80, 86, 112, NA, 80, 72, 90, 68, 80, NA, 66, …</span></span>
+<span id="cb1-36"><a href="#cb1-36" aria-hidden="true" tabindex="-1"></a><span class="do">## $ bp.2s    &lt;dbl&gt; NA, NA, 185, NA, NA, NA, 161, NA, 128, NA, 130, NA, NA, NA, N…</span></span>
+<span id="cb1-37"><a href="#cb1-37" aria-hidden="true" tabindex="-1"></a><span class="do">## $ bp.2d    &lt;dbl&gt; NA, NA, 92, NA, NA, NA, 112, NA, 86, NA, 90, NA, NA, NA, NA, …</span></span>
+<span id="cb1-38"><a href="#cb1-38" aria-hidden="true" tabindex="-1"></a><span class="do">## $ waist    &lt;dbl&gt; 29, 46, 49, 33, 44, 36, 46, 34, 34, 45, 39, 42, 35, 37, 36, 3…</span></span>
+<span id="cb1-39"><a href="#cb1-39" aria-hidden="true" tabindex="-1"></a><span class="do">## $ hip      &lt;dbl&gt; 38, 48, 57, 38, 41, 42, 49, 39, 40, 50, 45, 50, 41, 43, 40, 4…</span></span>
+<span id="cb1-40"><a href="#cb1-40" aria-hidden="true" tabindex="-1"></a><span class="do">## $ time.ppn &lt;dbl&gt; 720, 360, 180, 480, 300, 195, 720, 1020, 300, 240, 300, 90, 7…</span></span>
+<span id="cb1-41"><a href="#cb1-41" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-42"><a href="#cb1-42" aria-hidden="true" tabindex="-1"></a><span class="co"># run basic EDA</span></span>
+<span id="cb1-43"><a href="#cb1-43" aria-hidden="true" tabindex="-1"></a><span class="co"># note: we have seen descriptive statistics and plots during the EDA session,</span></span>
+<span id="cb1-44"><a href="#cb1-44" aria-hidden="true" tabindex="-1"></a><span class="co"># so here we only look at missing data and correlation</span></span>
+<span id="cb1-45"><a href="#cb1-45" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-46"><a href="#cb1-46" aria-hidden="true" tabindex="-1"></a><span class="co"># calculate number of missing data per variable</span></span>
+<span id="cb1-47"><a href="#cb1-47" aria-hidden="true" tabindex="-1"></a>data_na <span class="ot">&lt;-</span> data_diabetes <span class="sc">%&gt;%</span> </span>
+<span id="cb1-48"><a href="#cb1-48" aria-hidden="true" tabindex="-1"></a>  <span class="fu">summarise</span>(<span class="fu">across</span>(<span class="fu">everything</span>(), <span class="sc">~</span> <span class="fu">sum</span>(<span class="fu">is.na</span>(.)))) </span>
+<span id="cb1-49"><a href="#cb1-49" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-50"><a href="#cb1-50" aria-hidden="true" tabindex="-1"></a><span class="co"># make a table with counts sorted from highest to lowest</span></span>
+<span id="cb1-51"><a href="#cb1-51" aria-hidden="true" tabindex="-1"></a>data_na_long <span class="ot">&lt;-</span> data_na <span class="sc">%&gt;%</span></span>
+<span id="cb1-52"><a href="#cb1-52" aria-hidden="true" tabindex="-1"></a>  <span class="fu">pivot_longer</span>(<span class="sc">-</span>id, <span class="at">names_to =</span> <span class="st">"variable"</span>, <span class="at">values_to =</span> <span class="st">"count"</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb1-53"><a href="#cb1-53" aria-hidden="true" tabindex="-1"></a>  <span class="fu">arrange</span>(<span class="fu">desc</span>(count)) </span>
+<span id="cb1-54"><a href="#cb1-54" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-55"><a href="#cb1-55" aria-hidden="true" tabindex="-1"></a><span class="co"># make a column plot to visualize the counts</span></span>
+<span id="cb1-56"><a href="#cb1-56" aria-hidden="true" tabindex="-1"></a>data_na_long <span class="sc">%&gt;%</span></span>
+<span id="cb1-57"><a href="#cb1-57" aria-hidden="true" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(<span class="at">x =</span> variable, <span class="at">y =</span> count)) <span class="sc">+</span> </span>
+<span id="cb1-58"><a href="#cb1-58" aria-hidden="true" tabindex="-1"></a>  <span class="fu">geom_col</span>(<span class="at">fill =</span> <span class="st">"blue4"</span>) <span class="sc">+</span> </span>
+<span id="cb1-59"><a href="#cb1-59" aria-hidden="true" tabindex="-1"></a>  <span class="fu">xlab</span>(<span class="st">""</span>) <span class="sc">+</span> </span>
+<span id="cb1-60"><a href="#cb1-60" aria-hidden="true" tabindex="-1"></a>  <span class="fu">theme_bw</span>() <span class="sc">+</span></span>
+<span id="cb1-61"><a href="#cb1-61" aria-hidden="true" tabindex="-1"></a>  <span class="fu">theme</span>(<span class="at">axis.text.x =</span> <span class="fu">element_text</span>(<span class="at">angle =</span> <span class="dv">45</span>, <span class="at">vjust =</span> <span class="dv">1</span>, <span class="at">hjust=</span><span class="dv">1</span>))</span>
+<span id="cb1-62"><a href="#cb1-62" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-63"><a href="#cb1-63" aria-hidden="true" tabindex="-1"></a><span class="co"># calculate correlation between numeric variables</span></span>
+<span id="cb1-64"><a href="#cb1-64" aria-hidden="true" tabindex="-1"></a>data_cor <span class="ot">&lt;-</span> data_diabetes <span class="sc">%&gt;%</span> </span>
+<span id="cb1-65"><a href="#cb1-65" aria-hidden="true" tabindex="-1"></a>  dplyr<span class="sc">::</span><span class="fu">select</span>(<span class="sc">-</span>id) <span class="sc">%&gt;%</span> </span>
+<span id="cb1-66"><a href="#cb1-66" aria-hidden="true" tabindex="-1"></a>  dplyr<span class="sc">::</span><span class="fu">select</span>(<span class="fu">where</span>(is.numeric)) <span class="sc">%&gt;%</span></span>
+<span id="cb1-67"><a href="#cb1-67" aria-hidden="true" tabindex="-1"></a>  <span class="fu">cor</span>(<span class="at">use =</span> <span class="st">"pairwise.complete.obs"</span>)</span>
+<span id="cb1-68"><a href="#cb1-68" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-69"><a href="#cb1-69" aria-hidden="true" tabindex="-1"></a><span class="co"># visualize correlation via heatmap</span></span>
+<span id="cb1-70"><a href="#cb1-70" aria-hidden="true" tabindex="-1"></a><span class="fu">ggcorrplot</span>(data_cor, <span class="at">hc.order =</span> <span class="cn">TRUE</span>, <span class="at">lab =</span> <span class="cn">FALSE</span>)</span>
+<span id="cb1-71"><a href="#cb1-71" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb1-72"><a href="#cb1-72" aria-hidden="true" tabindex="-1"></a><span class="co"># based on the number of missing values, let's drop bp.2s and bp.2d</span></span>
+<span id="cb1-73"><a href="#cb1-73" aria-hidden="true" tabindex="-1"></a><span class="co"># and use complete-case analysis</span></span>
+<span id="cb1-74"><a href="#cb1-74" aria-hidden="true" tabindex="-1"></a>data_diabetes_narm <span class="ot">&lt;-</span> data_diabetes <span class="sc">%&gt;%</span></span>
+<span id="cb1-75"><a href="#cb1-75" aria-hidden="true" tabindex="-1"></a>  dplyr<span class="sc">::</span><span class="fu">select</span>(<span class="sc">-</span>bp<span class="fl">.2</span>s, <span class="sc">-</span>bp<span class="fl">.2</span>d) <span class="sc">%&gt;%</span></span>
+<span id="cb1-76"><a href="#cb1-76" aria-hidden="true" tabindex="-1"></a>  <span class="fu">na.omit</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="case-study_files/figure-html/load-data-1.png" class="img-fluid figure-img" width="672"></p>
+<figcaption class="figure-caption">Number of missing values per variable; bp.2d and bp.2s have more than 50% missing entries</figcaption>
+</figure>
+</div>
+</div>
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="case-study_files/figure-html/load-data-2.png" class="img-fluid figure-img" width="672"></p>
+<figcaption class="figure-caption">Heatmap of Pearson correlation coefficients between numeric variables</figcaption>
+</figure>
+</div>
+</div>
+</div>
+</section>
+<section id="data-splitting" class="level2" data-number="2.2">
+<h2 data-number="2.2" class="anchored" data-anchor-id="data-splitting"><span class="header-section-number">2.2</span> Data splitting</h2>
+<div class="cell" data-fig-cap-location="margin">
+<div class="sourceCode cell-code" id="cb2"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="co"># use tidymodels framework to fit Lasso regression model for predicting BMI</span></span>
+<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="co"># using repeated cross-validation to tune lambda value in L1 penalty term</span></span>
+<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a><span class="co"># select random seed value</span></span>
+<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>myseed <span class="ot">&lt;-</span> <span class="dv">123</span></span>
+<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a><span class="co"># split data into non-test (other, 80%) and test (20%)</span></span>
+<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(myseed)</span>
+<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>data_split <span class="ot">&lt;-</span> <span class="fu">initial_split</span>(data_diabetes_narm, <span class="at">strata =</span> BMI, <span class="at">prop =</span> <span class="fl">0.8</span>) <span class="co"># holds splitting info</span></span>
+<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a>data_other <span class="ot">&lt;-</span> data_split <span class="sc">%&gt;%</span> <span class="fu">training</span>() <span class="co"># creates non-test set (function is called training but it refers to non-test part)</span></span>
+<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a>data_test <span class="ot">&lt;-</span> data_split <span class="sc">%&gt;%</span> <span class="fu">testing</span>() <span class="co"># creates test set</span></span>
+<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-13"><a href="#cb2-13" aria-hidden="true" tabindex="-1"></a><span class="co"># prepare repeated cross-validation splits with 5 folds repeated 3 times</span></span>
+<span id="cb2-14"><a href="#cb2-14" aria-hidden="true" tabindex="-1"></a><span class="fu">set.seed</span>(myseed)</span>
+<span id="cb2-15"><a href="#cb2-15" aria-hidden="true" tabindex="-1"></a>data_folds <span class="ot">&lt;-</span> <span class="fu">vfold_cv</span>(data_other,</span>
+<span id="cb2-16"><a href="#cb2-16" aria-hidden="true" tabindex="-1"></a>                       <span class="at">v =</span> <span class="dv">5</span>, </span>
+<span id="cb2-17"><a href="#cb2-17" aria-hidden="true" tabindex="-1"></a>                       <span class="at">repeats =</span> <span class="dv">3</span>,</span>
+<span id="cb2-18"><a href="#cb2-18" aria-hidden="true" tabindex="-1"></a>                       <span class="at">strata =</span> BMI)</span>
+<span id="cb2-19"><a href="#cb2-19" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-20"><a href="#cb2-20" aria-hidden="true" tabindex="-1"></a><span class="co"># check the split</span></span>
+<span id="cb2-21"><a href="#cb2-21" aria-hidden="true" tabindex="-1"></a><span class="fu">dim</span>(data_diabetes)</span>
+<span id="cb2-22"><a href="#cb2-22" aria-hidden="true" tabindex="-1"></a><span class="do">## [1] 403  20</span></span>
+<span id="cb2-23"><a href="#cb2-23" aria-hidden="true" tabindex="-1"></a><span class="fu">dim</span>(data_other)</span>
+<span id="cb2-24"><a href="#cb2-24" aria-hidden="true" tabindex="-1"></a><span class="do">## [1] 291  18</span></span>
+<span id="cb2-25"><a href="#cb2-25" aria-hidden="true" tabindex="-1"></a><span class="fu">dim</span>(data_test)</span>
+<span id="cb2-26"><a href="#cb2-26" aria-hidden="true" tabindex="-1"></a><span class="do">## [1] 75 18</span></span>
+<span id="cb2-27"><a href="#cb2-27" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb2-28"><a href="#cb2-28" aria-hidden="true" tabindex="-1"></a><span class="co"># check BMI distributions in data splits</span></span>
+<span id="cb2-29"><a href="#cb2-29" aria-hidden="true" tabindex="-1"></a><span class="fu">par</span>(<span class="at">mfrow=</span><span class="fu">c</span>(<span class="dv">3</span>,<span class="dv">1</span>))</span>
+<span id="cb2-30"><a href="#cb2-30" aria-hidden="true" tabindex="-1"></a><span class="fu">hist</span>(data_diabetes<span class="sc">$</span>BMI, <span class="at">xlab =</span> <span class="st">""</span>, <span class="at">main =</span> <span class="st">"BMI: all"</span>, <span class="dv">50</span>)</span>
+<span id="cb2-31"><a href="#cb2-31" aria-hidden="true" tabindex="-1"></a><span class="fu">hist</span>(data_other<span class="sc">$</span>BMI, <span class="at">xlab =</span> <span class="st">""</span>, <span class="at">main =</span> <span class="st">"BMI: non-test"</span>, <span class="dv">50</span>)</span>
+<span id="cb2-32"><a href="#cb2-32" aria-hidden="true" tabindex="-1"></a><span class="fu">hist</span>(data_test<span class="sc">$</span>BMI, <span class="at">xlab =</span> <span class="st">""</span>, <span class="at">main =</span> <span class="st">"BMI: test"</span>, <span class="dv">50</span>)</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="case-study_files/figure-html/data-split-1.png" class="img-fluid figure-img" width="672"></p>
+<figcaption class="figure-caption">Distribution of BMI values in all data and in the non-test and test splits</figcaption>
+</figure>
+</div>
+</div>
+</div>
+</section>
+<section id="feature-engineering" class="level2" data-number="2.3">
+<h2 data-number="2.3" class="anchored" data-anchor-id="feature-engineering"><span class="header-section-number">2.3</span> Feature engineering</h2>
+<div class="cell">
+<div class="sourceCode cell-code" id="cb3"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="co"># create data recipe (feature engineering)</span></span>
+<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>inch2m <span class="ot">&lt;-</span> <span class="fl">2.54</span><span class="sc">/</span><span class="dv">100</span></span>
+<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a>pound2kg <span class="ot">&lt;-</span> <span class="fl">0.45</span></span>
+<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a>data_recipe <span class="ot">&lt;-</span> <span class="fu">recipe</span>(BMI <span class="sc">~</span> ., <span class="at">data =</span> data_other) <span class="sc">%&gt;%</span></span>
+<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>  <span class="fu">update_role</span>(id, <span class="at">new_role =</span> <span class="st">"sampleID"</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a>  <span class="fu">step_mutate</span>(<span class="at">height =</span> height <span class="sc">*</span> inch2m, <span class="at">height =</span> <span class="fu">round</span>(height, <span class="dv">2</span>)) <span class="sc">%&gt;%</span> <span class="co"># convert height to meters</span></span>
+<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a>  <span class="fu">step_mutate</span>(<span class="at">weight =</span> weight <span class="sc">*</span> pound2kg, <span class="at">weight =</span> <span class="fu">round</span>(weight, <span class="dv">2</span>)) <span class="sc">%&gt;%</span> <span class="co"># convert weight to kg</span></span>
+<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a>  <span class="fu">step_rename</span>(<span class="at">glu =</span> stab.glu) <span class="sc">%&gt;%</span> <span class="co"># rename stab.glu to glu</span></span>
+<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a>  <span class="fu">step_log</span>(glu) <span class="sc">%&gt;%</span>  <span class="co"># ln-transform glucose</span></span>
+<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a>  <span class="fu">step_zv</span>(<span class="fu">all_numeric</span>()) <span class="sc">%&gt;%</span> <span class="co"># removes zero-variance variables, i.e. those containing a single value (if found)</span></span>
+<span id="cb3-13"><a href="#cb3-13" aria-hidden="true" tabindex="-1"></a>  <span class="fu">step_corr</span>(<span class="fu">all_numeric</span>(), <span class="sc">-</span><span class="fu">all_outcomes</span>(), <span class="sc">-</span><span class="fu">has_role</span>(<span class="st">"sampleID"</span>), <span class="at">threshold =</span> <span class="fl">0.8</span>) <span class="sc">%&gt;%</span> <span class="co"># removes variables with large absolute correlations with other variables (if found)</span></span>
+<span id="cb3-14"><a href="#cb3-14" aria-hidden="true" tabindex="-1"></a>  <span class="fu">step_dummy</span>(location, gender, frame) <span class="sc">%&gt;%</span> <span class="co"># convert categorical variables to dummy variables</span></span>
+<span id="cb3-15"><a href="#cb3-15" aria-hidden="true" tabindex="-1"></a>  <span class="fu">step_normalize</span>(<span class="fu">all_numeric</span>(), <span class="sc">-</span><span class="fu">all_outcomes</span>(), <span class="sc">-</span><span class="fu">has_role</span>(<span class="st">"sampleID"</span>), <span class="at">skip =</span> <span class="cn">FALSE</span>) </span>
+<span id="cb3-16"><a href="#cb3-16" aria-hidden="true" tabindex="-1"></a>  </span>
+<span id="cb3-17"><a href="#cb3-17" aria-hidden="true" tabindex="-1"></a>  <span class="co"># you can implement more steps: see https://recipes.tidymodels.org/reference/index.html</span></span>
+<span id="cb3-18"><a href="#cb3-18" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb3-19"><a href="#cb3-19" aria-hidden="true" tabindex="-1"></a><span class="co"># print recipe</span></span>
+<span id="cb3-20"><a href="#cb3-20" aria-hidden="true" tabindex="-1"></a>data_recipe</span>
+<span id="cb3-21"><a href="#cb3-21" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb3-22"><a href="#cb3-22" aria-hidden="true" tabindex="-1"></a><span class="co"># check if recipe is doing what it is supposed to do</span></span>
+<span id="cb3-23"><a href="#cb3-23" aria-hidden="true" tabindex="-1"></a><span class="co"># i.e. bake the data</span></span>
+<span id="cb3-24"><a href="#cb3-24" aria-hidden="true" tabindex="-1"></a>data_other_prep <span class="ot">&lt;-</span> data_recipe <span class="sc">%&gt;%</span></span>
+<span id="cb3-25"><a href="#cb3-25" aria-hidden="true" tabindex="-1"></a>  <span class="fu">prep</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb3-26"><a href="#cb3-26" aria-hidden="true" tabindex="-1"></a>  <span class="fu">bake</span>(<span class="at">new_data =</span> <span class="cn">NULL</span>)</span>
+<span id="cb3-27"><a href="#cb3-27" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb3-28"><a href="#cb3-28" aria-hidden="true" tabindex="-1"></a><span class="do">## bake test data</span></span>
+<span id="cb3-29"><a href="#cb3-29" aria-hidden="true" tabindex="-1"></a>data_test_prep <span class="ot">&lt;-</span> data_recipe <span class="sc">%&gt;%</span></span>
+<span id="cb3-30"><a href="#cb3-30" aria-hidden="true" tabindex="-1"></a>  <span class="fu">prep</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb3-31"><a href="#cb3-31" aria-hidden="true" tabindex="-1"></a>  <span class="fu">bake</span>(<span class="at">new_data =</span> data_test)</span>
+<span id="cb3-32"><a href="#cb3-32" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb3-33"><a href="#cb3-33" aria-hidden="true" tabindex="-1"></a><span class="co"># preview baked data</span></span>
+<span id="cb3-34"><a href="#cb3-34" aria-hidden="true" tabindex="-1"></a><span class="fu">print</span>(<span class="fu">head</span>(data_other_prep))</span>
+<span id="cb3-35"><a href="#cb3-35" aria-hidden="true" tabindex="-1"></a><span class="do">## # A tibble: 6 × 17</span></span>
+<span id="cb3-36"><a href="#cb3-36" aria-hidden="true" tabindex="-1"></a><span class="do">##      id   chol    glu    hdl  ratio  glyhb    age  height  bp.1s  bp.1d    hip</span></span>
+<span id="cb3-37"><a href="#cb3-37" aria-hidden="true" tabindex="-1"></a><span class="do">##   &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;   &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;</span></span>
+<span id="cb3-38"><a href="#cb3-38" aria-hidden="true" tabindex="-1"></a><span class="do">## 1  1045 -0.319 -0.539 -0.830  0.486 -0.172 -0.645 -0.474  -1.16  -0.552 -1.66 </span></span>
+<span id="cb3-39"><a href="#cb3-39" aria-hidden="true" tabindex="-1"></a><span class="do">## 2  1271  0.449 -1.09  -0.324  0.320 -0.458 -1.39  -1.29   -1.59  -1.00  -0.914</span></span>
+<span id="cb3-40"><a href="#cb3-40" aria-hidden="true" tabindex="-1"></a><span class="do">## 3  1277 -0.657 -0.572  2.32  -1.45  -0.642 -0.337  1.56    0.286  2.15  -1.29 </span></span>
+<span id="cb3-41"><a href="#cb3-41" aria-hidden="true" tabindex="-1"></a><span class="do">## 4  1303 -0.590 -0.410 -0.436 -0.178 -0.518  0.341  0.545  -0.309  0.500 -1.47 </span></span>
+<span id="cb3-42"><a href="#cb3-42" aria-hidden="true" tabindex="-1"></a><span class="do">## 5  1309 -0.206 -0.347  0.687 -0.730 -0.860 -1.32   0.0357 -0.735 -0.402 -1.66 </span></span>
+<span id="cb3-43"><a href="#cb3-43" aria-hidden="true" tabindex="-1"></a><span class="do">## 6  1315 -0.793 -0.572  0.350 -0.841  0.225  0.649  1.26   -0.565 -1.45  -1.29 </span></span>
+<span id="cb3-44"><a href="#cb3-44" aria-hidden="true" tabindex="-1"></a><span class="do">## # ℹ 6 more variables: time.ppn &lt;dbl&gt;, BMI &lt;dbl&gt;, location_Louisa &lt;dbl&gt;,</span></span>
+<span id="cb3-45"><a href="#cb3-45" aria-hidden="true" tabindex="-1"></a><span class="do">## #   gender_male &lt;dbl&gt;, frame_medium &lt;dbl&gt;, frame_small &lt;dbl&gt;</span></span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+</div>
+</section>
+<section id="lasso-regression" class="level2" data-number="2.4">
+<h2 data-number="2.4" class="anchored" data-anchor-id="lasso-regression"><span class="header-section-number">2.4</span> Lasso regression</h2>
+<div class="cell" data-fig-cap-location="margin">
+<div class="sourceCode cell-code" id="cb4"><pre class="sourceCode r code-with-copy"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="co"># define model</span></span>
+<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>model <span class="ot">&lt;-</span> <span class="fu">linear_reg</span>(<span class="at">penalty =</span> <span class="fu">tune</span>(), <span class="at">mixture =</span> <span class="dv">1</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>  <span class="fu">set_engine</span>(<span class="st">"glmnet"</span>) <span class="sc">%&gt;%</span></span>
+<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>  <span class="fu">set_mode</span>(<span class="st">"regression"</span>)</span>
+<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a><span class="co"># create workflow with data recipe and model </span></span>
+<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>wf <span class="ot">&lt;-</span> <span class="fu">workflow</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>  <span class="fu">add_model</span>(model) <span class="sc">%&gt;%</span></span>
+<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>  <span class="fu">add_recipe</span>(data_recipe)</span>
+<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a><span class="co"># define parameter range for tuning</span></span>
+<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a>grid_lambda <span class="ot">&lt;-</span> <span class="fu">grid_regular</span>(<span class="fu">penalty</span>(), <span class="at">levels =</span> <span class="dv">25</span>)</span>
+<span id="cb4-13"><a href="#cb4-13" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-14"><a href="#cb4-14" aria-hidden="true" tabindex="-1"></a><span class="co"># tune lambda</span></span>
+<span id="cb4-15"><a href="#cb4-15" aria-hidden="true" tabindex="-1"></a>model_tune <span class="ot">&lt;-</span> wf <span class="sc">%&gt;%</span></span>
+<span id="cb4-16"><a href="#cb4-16" aria-hidden="true" tabindex="-1"></a>  <span class="fu">tune_grid</span>(<span class="at">resamples =</span> data_folds, </span>
+<span id="cb4-17"><a href="#cb4-17" aria-hidden="true" tabindex="-1"></a>            <span class="at">grid =</span> grid_lambda)</span>
+<span id="cb4-18"><a href="#cb4-18" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-19"><a href="#cb4-19" aria-hidden="true" tabindex="-1"></a><span class="co"># show metrics averaged across folds</span></span>
+<span id="cb4-20"><a href="#cb4-20" aria-hidden="true" tabindex="-1"></a>model_tune  <span class="sc">%&gt;%</span></span>
+<span id="cb4-21"><a href="#cb4-21" aria-hidden="true" tabindex="-1"></a>  <span class="fu">collect_metrics</span>(<span class="at">summarize =</span> <span class="cn">TRUE</span>)</span>
+<span id="cb4-22"><a href="#cb4-22" aria-hidden="true" tabindex="-1"></a><span class="do">## # A tibble: 50 × 7</span></span>
+<span id="cb4-23"><a href="#cb4-23" aria-hidden="true" tabindex="-1"></a><span class="do">##     penalty .metric .estimator  mean     n std_err .config              </span></span>
+<span id="cb4-24"><a href="#cb4-24" aria-hidden="true" tabindex="-1"></a><span class="do">##       &lt;dbl&gt; &lt;chr&gt;   &lt;chr&gt;      &lt;dbl&gt; &lt;int&gt;   &lt;dbl&gt; &lt;chr&gt;                </span></span>
+<span id="cb4-25"><a href="#cb4-25" aria-hidden="true" tabindex="-1"></a><span class="do">##  1 1   e-10 rmse    standard   2.48     15  0.142  Preprocessor1_Model01</span></span>
+<span id="cb4-26"><a href="#cb4-26" aria-hidden="true" tabindex="-1"></a><span class="do">##  2 1   e-10 rsq     standard   0.851    15  0.0143 Preprocessor1_Model01</span></span>
+<span id="cb4-27"><a href="#cb4-27" aria-hidden="true" tabindex="-1"></a><span class="do">##  3 2.61e-10 rmse    standard   2.48     15  0.142  Preprocessor1_Model02</span></span>
+<span id="cb4-28"><a href="#cb4-28" aria-hidden="true" tabindex="-1"></a><span class="do">##  4 2.61e-10 rsq     standard   0.851    15  0.0143 Preprocessor1_Model02</span></span>
+<span id="cb4-29"><a href="#cb4-29" aria-hidden="true" tabindex="-1"></a><span class="do">##  5 6.81e-10 rmse    standard   2.48     15  0.142  Preprocessor1_Model03</span></span>
+<span id="cb4-30"><a href="#cb4-30" aria-hidden="true" tabindex="-1"></a><span class="do">##  6 6.81e-10 rsq     standard   0.851    15  0.0143 Preprocessor1_Model03</span></span>
+<span id="cb4-31"><a href="#cb4-31" aria-hidden="true" tabindex="-1"></a><span class="do">##  7 1.78e- 9 rmse    standard   2.48     15  0.142  Preprocessor1_Model04</span></span>
+<span id="cb4-32"><a href="#cb4-32" aria-hidden="true" tabindex="-1"></a><span class="do">##  8 1.78e- 9 rsq     standard   0.851    15  0.0143 Preprocessor1_Model04</span></span>
+<span id="cb4-33"><a href="#cb4-33" aria-hidden="true" tabindex="-1"></a><span class="do">##  9 4.64e- 9 rmse    standard   2.48     15  0.142  Preprocessor1_Model05</span></span>
+<span id="cb4-34"><a href="#cb4-34" aria-hidden="true" tabindex="-1"></a><span class="do">## 10 4.64e- 9 rsq     standard   0.851    15  0.0143 Preprocessor1_Model05</span></span>
+<span id="cb4-35"><a href="#cb4-35" aria-hidden="true" tabindex="-1"></a><span class="do">## # ℹ 40 more rows</span></span>
+<span id="cb4-36"><a href="#cb4-36" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-37"><a href="#cb4-37" aria-hidden="true" tabindex="-1"></a><span class="co"># plot k-fold results across the lambda range</span></span>
+<span id="cb4-38"><a href="#cb4-38" aria-hidden="true" tabindex="-1"></a>model_tune <span class="sc">%&gt;%</span></span>
+<span id="cb4-39"><a href="#cb4-39" aria-hidden="true" tabindex="-1"></a>  <span class="fu">collect_metrics</span>() <span class="sc">%&gt;%</span> </span>
+<span id="cb4-40"><a href="#cb4-40" aria-hidden="true" tabindex="-1"></a>  dplyr<span class="sc">::</span><span class="fu">filter</span>(.metric <span class="sc">==</span> <span class="st">"rmse"</span>) <span class="sc">%&gt;%</span> </span>
+<span id="cb4-41"><a href="#cb4-41" aria-hidden="true" tabindex="-1"></a>  <span class="fu">ggplot</span>(<span class="fu">aes</span>(penalty, mean, <span class="at">color =</span> .metric)) <span class="sc">+</span></span>
+<span id="cb4-42"><a href="#cb4-42" aria-hidden="true" tabindex="-1"></a>  <span class="fu">geom_errorbar</span>(<span class="fu">aes</span>( <span class="at">ymin =</span> mean <span class="sc">-</span> std_err, <span class="at">ymax =</span> mean <span class="sc">+</span> std_err), <span class="at">alpha =</span> <span class="fl">0.5</span>) <span class="sc">+</span></span>
+<span id="cb4-43"><a href="#cb4-43" aria-hidden="true" tabindex="-1"></a>  <span class="fu">scale_x_log10</span>() <span class="sc">+</span> </span>
+<span id="cb4-44"><a href="#cb4-44" aria-hidden="true" tabindex="-1"></a>  <span class="fu">geom_line</span>(<span class="at">linewidth =</span> <span class="fl">1.5</span>) <span class="sc">+</span></span>
+<span id="cb4-45"><a href="#cb4-45" aria-hidden="true" tabindex="-1"></a>  <span class="fu">theme_bw</span>() <span class="sc">+</span></span>
+<span id="cb4-46"><a href="#cb4-46" aria-hidden="true" tabindex="-1"></a>  <span class="fu">theme</span>(<span class="at">legend.position =</span> <span class="st">"none"</span>) <span class="sc">+</span></span>
+<span id="cb4-47"><a href="#cb4-47" aria-hidden="true" tabindex="-1"></a>  <span class="fu">scale_color_brewer</span>(<span class="at">palette =</span> <span class="st">"Set1"</span>)</span>
+<span id="cb4-48"><a href="#cb4-48" aria-hidden="true" tabindex="-1"></a>  </span>
+<span id="cb4-49"><a href="#cb4-49" aria-hidden="true" tabindex="-1"></a><span class="co"># best lambda value (min. RMSE)</span></span>
+<span id="cb4-50"><a href="#cb4-50" aria-hidden="true" tabindex="-1"></a>model_best <span class="ot">&lt;-</span> model_tune <span class="sc">%&gt;%</span></span>
+<span id="cb4-51"><a href="#cb4-51" aria-hidden="true" tabindex="-1"></a>  <span class="fu">select_best</span>(<span class="at">metric =</span> <span class="st">"rmse"</span>)</span>
+<span id="cb4-52"><a href="#cb4-52" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-53"><a href="#cb4-53" aria-hidden="true" tabindex="-1"></a><span class="fu">print</span>(model_best)</span>
+<span id="cb4-54"><a href="#cb4-54" aria-hidden="true" tabindex="-1"></a><span class="do">## # A tibble: 1 × 2</span></span>
+<span id="cb4-55"><a href="#cb4-55" aria-hidden="true" tabindex="-1"></a><span class="do">##   penalty .config              </span></span>
+<span id="cb4-56"><a href="#cb4-56" aria-hidden="true" tabindex="-1"></a><span class="do">##     &lt;dbl&gt; &lt;chr&gt;                </span></span>
+<span id="cb4-57"><a href="#cb4-57" aria-hidden="true" tabindex="-1"></a><span class="do">## 1  0.0562 Preprocessor1_Model22</span></span>
+<span id="cb4-58"><a href="#cb4-58" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-59"><a href="#cb4-59" aria-hidden="true" tabindex="-1"></a><span class="co"># finalize workflow with tuned model</span></span>
+<span id="cb4-60"><a href="#cb4-60" aria-hidden="true" tabindex="-1"></a>wf_final <span class="ot">&lt;-</span> wf <span class="sc">%&gt;%</span></span>
+<span id="cb4-61"><a href="#cb4-61" aria-hidden="true" tabindex="-1"></a>  <span class="fu">finalize_workflow</span>(model_best)</span>
+<span id="cb4-62"><a href="#cb4-62" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-63"><a href="#cb4-63" aria-hidden="true" tabindex="-1"></a><span class="co"># last fit: train on non-test data, evaluate on test data</span></span>
+<span id="cb4-64"><a href="#cb4-64" aria-hidden="true" tabindex="-1"></a>fit_final <span class="ot">&lt;-</span> wf_final <span class="sc">%&gt;%</span></span>
+<span id="cb4-65"><a href="#cb4-65" aria-hidden="true" tabindex="-1"></a>  <span class="fu">last_fit</span>(<span class="at">split =</span> data_split)</span>
+<span id="cb4-66"><a href="#cb4-66" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-67"><a href="#cb4-67" aria-hidden="true" tabindex="-1"></a><span class="co"># final predictions</span></span>
+<span id="cb4-68"><a href="#cb4-68" aria-hidden="true" tabindex="-1"></a>y_test_pred <span class="ot">&lt;-</span> fit_final <span class="sc">%&gt;%</span> <span class="fu">collect_predictions</span>() <span class="co"># predicted BMI</span></span>
+<span id="cb4-69"><a href="#cb4-69" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-70"><a href="#cb4-70" aria-hidden="true" tabindex="-1"></a><span class="co"># final predictions: performance on test (unseen data)</span></span>
+<span id="cb4-71"><a href="#cb4-71" aria-hidden="true" tabindex="-1"></a>fit_final <span class="sc">%&gt;%</span> <span class="fu">collect_metrics</span>() </span>
+<span id="cb4-72"><a href="#cb4-72" aria-hidden="true" tabindex="-1"></a><span class="do">## # A tibble: 2 × 4</span></span>
+<span id="cb4-73"><a href="#cb4-73" aria-hidden="true" tabindex="-1"></a><span class="do">##   .metric .estimator .estimate .config             </span></span>
+<span id="cb4-74"><a href="#cb4-74" aria-hidden="true" tabindex="-1"></a><span class="do">##   &lt;chr&gt;   &lt;chr&gt;          &lt;dbl&gt; &lt;chr&gt;               </span></span>
+<span id="cb4-75"><a href="#cb4-75" aria-hidden="true" tabindex="-1"></a><span class="do">## 1 rmse    standard       2.79  Preprocessor1_Model1</span></span>
+<span id="cb4-76"><a href="#cb4-76" aria-hidden="true" tabindex="-1"></a><span class="do">## 2 rsq     standard       0.857 Preprocessor1_Model1</span></span>
+<span id="cb4-77"><a href="#cb4-77" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-78"><a href="#cb4-78" aria-hidden="true" tabindex="-1"></a><span class="co"># plot predictions vs. actual for test data</span></span>
+<span id="cb4-79"><a href="#cb4-79" aria-hidden="true" tabindex="-1"></a><span class="fu">plot</span>(data_test<span class="sc">$</span>BMI, y_test_pred<span class="sc">$</span>.pred, <span class="at">xlab=</span><span class="st">"BMI (actual)"</span>, <span class="at">ylab =</span> <span class="st">"BMI (predicted)"</span>, <span class="at">las =</span> <span class="dv">1</span>, <span class="at">pch =</span> <span class="dv">19</span>)</span>
+<span id="cb4-80"><a href="#cb4-80" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-81"><a href="#cb4-81" aria-hidden="true" tabindex="-1"></a><span class="co"># correlation between predicted and actual BMI values for test data</span></span>
+<span id="cb4-82"><a href="#cb4-82" aria-hidden="true" tabindex="-1"></a><span class="fu">cor</span>(data_test<span class="sc">$</span>BMI, y_test_pred<span class="sc">$</span>.pred)</span>
+<span id="cb4-83"><a href="#cb4-83" aria-hidden="true" tabindex="-1"></a><span class="do">## [1] 0.9256857</span></span>
+<span id="cb4-84"><a href="#cb4-84" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-85"><a href="#cb4-85" aria-hidden="true" tabindex="-1"></a><span class="co"># re-fit model on all non-test data</span></span>
+<span id="cb4-86"><a href="#cb4-86" aria-hidden="true" tabindex="-1"></a>model_final <span class="ot">&lt;-</span> wf_final <span class="sc">%&gt;%</span></span>
+<span id="cb4-87"><a href="#cb4-87" aria-hidden="true" tabindex="-1"></a>  <span class="fu">fit</span>(data_other) </span>
+<span id="cb4-88"><a href="#cb4-88" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-89"><a href="#cb4-89" aria-hidden="true" tabindex="-1"></a><span class="co"># show final model</span></span>
+<span id="cb4-90"><a href="#cb4-90" aria-hidden="true" tabindex="-1"></a><span class="fu">tidy</span>(model_final)</span>
+<span id="cb4-91"><a href="#cb4-91" aria-hidden="true" tabindex="-1"></a><span class="do">## # A tibble: 16 × 3</span></span>
+<span id="cb4-92"><a href="#cb4-92" aria-hidden="true" tabindex="-1"></a><span class="do">##    term            estimate penalty</span></span>
+<span id="cb4-93"><a href="#cb4-93" aria-hidden="true" tabindex="-1"></a><span class="do">##    &lt;chr&gt;              &lt;dbl&gt;   &lt;dbl&gt;</span></span>
+<span id="cb4-94"><a href="#cb4-94" aria-hidden="true" tabindex="-1"></a><span class="do">##  1 (Intercept)      28.7     0.0562</span></span>
+<span id="cb4-95"><a href="#cb4-95" aria-hidden="true" tabindex="-1"></a><span class="do">##  2 chol              0       0.0562</span></span>
+<span id="cb4-96"><a href="#cb4-96" aria-hidden="true" tabindex="-1"></a><span class="do">##  3 glu               0.0229  0.0562</span></span>
+<span id="cb4-97"><a href="#cb4-97" aria-hidden="true" tabindex="-1"></a><span class="do">##  4 hdl               0       0.0562</span></span>
+<span id="cb4-98"><a href="#cb4-98" aria-hidden="true" tabindex="-1"></a><span class="do">##  5 ratio             0.335   0.0562</span></span>
+<span id="cb4-99"><a href="#cb4-99" aria-hidden="true" tabindex="-1"></a><span class="do">##  6 glyhb            -0.0512  0.0562</span></span>
+<span id="cb4-100"><a href="#cb4-100" aria-hidden="true" tabindex="-1"></a><span class="do">##  7 age              -0.257   0.0562</span></span>
+<span id="cb4-101"><a href="#cb4-101" aria-hidden="true" tabindex="-1"></a><span class="do">##  8 height           -1.86    0.0562</span></span>
+<span id="cb4-102"><a href="#cb4-102" aria-hidden="true" tabindex="-1"></a><span class="do">##  9 bp.1s            -0.294   0.0562</span></span>
+<span id="cb4-103"><a href="#cb4-103" aria-hidden="true" tabindex="-1"></a><span class="do">## 10 bp.1d             0.203   0.0562</span></span>
+<span id="cb4-104"><a href="#cb4-104" aria-hidden="true" tabindex="-1"></a><span class="do">## 11 hip               5.60    0.0562</span></span>
+<span id="cb4-105"><a href="#cb4-105" aria-hidden="true" tabindex="-1"></a><span class="do">## 12 time.ppn          0       0.0562</span></span>
+<span id="cb4-106"><a href="#cb4-106" aria-hidden="true" tabindex="-1"></a><span class="do">## 13 location_Louisa  -0.160   0.0562</span></span>
+<span id="cb4-107"><a href="#cb4-107" aria-hidden="true" tabindex="-1"></a><span class="do">## 14 gender_male       1.07    0.0562</span></span>
+<span id="cb4-108"><a href="#cb4-108" aria-hidden="true" tabindex="-1"></a><span class="do">## 15 frame_medium     -0.320   0.0562</span></span>
+<span id="cb4-109"><a href="#cb4-109" aria-hidden="true" tabindex="-1"></a><span class="do">## 16 frame_small      -0.530   0.0562</span></span>
+<span id="cb4-110"><a href="#cb4-110" aria-hidden="true" tabindex="-1"></a></span>
+<span id="cb4-111"><a href="#cb4-111" aria-hidden="true" tabindex="-1"></a><span class="co"># plot variables ordered by importance (highest abs(coeff))</span></span>
+<span id="cb4-112"><a href="#cb4-112" aria-hidden="true" tabindex="-1"></a>model_final <span class="sc">%&gt;%</span></span>
+<span id="cb4-113"><a href="#cb4-113" aria-hidden="true" tabindex="-1"></a>  <span class="fu">extract_fit_parsnip</span>() <span class="sc">%&gt;%</span></span>
+<span id="cb4-114"><a href="#cb4-114" aria-hidden="true" tabindex="-1"></a>  <span class="fu">vip</span>(<span class="at">geom =</span> <span class="st">"point"</span>) <span class="sc">+</span> </span>
+<span id="cb4-115"><a href="#cb4-115" aria-hidden="true" tabindex="-1"></a>  <span class="fu">theme_bw</span>()</span></code><button title="Copy to Clipboard" class="code-copy-button"><i class="bi"></i></button></pre></div>
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="case-study_files/figure-html/unnamed-chunk-4-1.png" class="img-fluid figure-img" width="576"></p>
+<figcaption class="figure-caption">Mean RMSE plus/minus standard error across repeated cross-validation folds as a function of lambda values</figcaption>
+</figure>
+</div>
+</div>
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="case-study_files/figure-html/unnamed-chunk-4-2.png" class="img-fluid figure-img" width="576"></p>
+<figcaption class="figure-caption">Predicted versus actual BMI values for the test (unseen) data, using the final model</figcaption>
+</figure>
+</div>
+</div>
+<div class="cell-output-display">
+<div class="quarto-figure quarto-figure-center">
+<figure class="figure">
+<p><img src="case-study_files/figure-html/unnamed-chunk-4-3.png" class="img-fluid figure-img" width="576"></p>
+<figcaption class="figure-caption">Top features by importance, here measured as the absolute value of the Lasso regression coefficients from the final tuned model</figcaption>
+</figure>
+</div>
+</div>
+</div>
+
+
+</section>
+
+</main> <!-- /main -->
+<script id="quarto-html-after-body" type="application/javascript">
+window.document.addEventListener("DOMContentLoaded", function (event) {
+  const toggleBodyColorMode = (bsSheetEl) => {
+    const mode = bsSheetEl.getAttribute("data-mode");
+    const bodyEl = window.document.querySelector("body");
+    if (mode === "dark") {
+      bodyEl.classList.add("quarto-dark");
+      bodyEl.classList.remove("quarto-light");
+    } else {
+      bodyEl.classList.add("quarto-light");
+      bodyEl.classList.remove("quarto-dark");
+    }
+  }
+  const toggleBodyColorPrimary = () => {
+    const bsSheetEl = window.document.querySelector("link#quarto-bootstrap");
+    if (bsSheetEl) {
+      toggleBodyColorMode(bsSheetEl);
+    }
+  }
+  toggleBodyColorPrimary();  
+  const icon = "";
+  const anchorJS = new window.AnchorJS();
+  anchorJS.options = {
+    placement: 'right',
+    icon: icon
+  };
+  anchorJS.add('.anchored');
+  const isCodeAnnotation = (el) => {
+    for (const clz of el.classList) {
+      if (clz.startsWith('code-annotation-')) {                     
+        return true;
+      }
+    }
+    return false;
+  }
+  const clipboard = new window.ClipboardJS('.code-copy-button', {
+    text: function(trigger) {
+      const codeEl = trigger.previousElementSibling.cloneNode(true);
+      for (const childEl of codeEl.children) {
+        if (isCodeAnnotation(childEl)) {
+          childEl.remove();
+        }
+      }
+      return codeEl.innerText;
+    }
+  });
+  clipboard.on('success', function(e) {
+    // button target
+    const button = e.trigger;
+    // don't keep focus
+    button.blur();
+    // flash "checked"
+    button.classList.add('code-copy-button-checked');
+    var currentTitle = button.getAttribute("title");
+    button.setAttribute("title", "Copied!");
+    let tooltip;
+    if (window.bootstrap) {
+      button.setAttribute("data-bs-toggle", "tooltip");
+      button.setAttribute("data-bs-placement", "left");
+      button.setAttribute("data-bs-title", "Copied!");
+      tooltip = new bootstrap.Tooltip(button, 
+        { trigger: "manual", 
+          customClass: "code-copy-button-tooltip",
+          offset: [0, -8]});
+      tooltip.show();    
+    }
+    setTimeout(function() {
+      if (tooltip) {
+        tooltip.hide();
+        button.removeAttribute("data-bs-title");
+        button.removeAttribute("data-bs-toggle");
+        button.removeAttribute("data-bs-placement");
+      }
+      button.setAttribute("title", currentTitle);
+      button.classList.remove('code-copy-button-checked');
+    }, 1000);
+    // clear code selection
+    e.clearSelection();
+  });
+  function tippyHover(el, contentFn) {
+    const config = {
+      allowHTML: true,
+      content: contentFn,
+      maxWidth: 500,
+      delay: 100,
+      arrow: false,
+      appendTo: function(el) {
+          return el.parentElement;
+      },
+      interactive: true,
+      interactiveBorder: 10,
+      theme: 'quarto',
+      placement: 'bottom-start'
+    };
+    window.tippy(el, config); 
+  }
+  const noterefs = window.document.querySelectorAll('a[role="doc-noteref"]');
+  for (var i=0; i<noterefs.length; i++) {
+    const ref = noterefs[i];
+    tippyHover(ref, function() {
+      // use id or data attribute instead here
+      let href = ref.getAttribute('data-footnote-href') || ref.getAttribute('href');
+      try { href = new URL(href).hash; } catch {}
+      const id = href.replace(/^#\/?/, "");
+      const note = window.document.getElementById(id);
+      return note.innerHTML;
+    });
+  }
+      let selectedAnnoteEl;
+      const selectorForAnnotation = ( cell, annotation) => {
+        let cellAttr = 'data-code-cell="' + cell + '"';
+        let lineAttr = 'data-code-annotation="' +  annotation + '"';
+        const selector = 'span[' + cellAttr + '][' + lineAttr + ']';
+        return selector;
+      }
+      const selectCodeLines = (annoteEl) => {
+        const doc = window.document;
+        const targetCell = annoteEl.getAttribute("data-target-cell");
+        const targetAnnotation = annoteEl.getAttribute("data-target-annotation");
+        const annoteSpan = window.document.querySelector(selectorForAnnotation(targetCell, targetAnnotation));
+        const lines = annoteSpan.getAttribute("data-code-lines").split(",");
+        const lineIds = lines.map((line) => {
+          return targetCell + "-" + line;
+        })
+        let top = null;
+        let height = null;
+        let parent = null;
+        if (lineIds.length > 0) {
+            //compute the position of the single el (top and bottom and make a div)
+            const el = window.document.getElementById(lineIds[0]);
+            top = el.offsetTop;
+            height = el.offsetHeight;
+            parent = el.parentElement.parentElement;
+          if (lineIds.length > 1) {
+            const lastEl = window.document.getElementById(lineIds[lineIds.length - 1]);
+            const bottom = lastEl.offsetTop + lastEl.offsetHeight;
+            height = bottom - top;
+          }
+          if (top !== null && height !== null && parent !== null) {
+            // cook up a div (if necessary) and position it 
+            let div = window.document.getElementById("code-annotation-line-highlight");
+            if (div === null) {
+              div = window.document.createElement("div");
+              div.setAttribute("id", "code-annotation-line-highlight");
+              div.style.position = 'absolute';
+              parent.appendChild(div);
+            }
+            div.style.top = top - 2 + "px";
+            div.style.height = height + 4 + "px";
+            let gutterDiv = window.document.getElementById("code-annotation-line-highlight-gutter");
+            if (gutterDiv === null) {
+              gutterDiv = window.document.createElement("div");
+              gutterDiv.setAttribute("id", "code-annotation-line-highlight-gutter");
+              gutterDiv.style.position = 'absolute';
+              const codeCell = window.document.getElementById(targetCell);
+              const gutter = codeCell.querySelector('.code-annotation-gutter');
+              gutter.appendChild(gutterDiv);
+            }
+            gutterDiv.style.top = top - 2 + "px";
+            gutterDiv.style.height = height + 4 + "px";
+          }
+          selectedAnnoteEl = annoteEl;
+        }
+      };
+      const unselectCodeLines = () => {
+        const elementsIds = ["code-annotation-line-highlight", "code-annotation-line-highlight-gutter"];
+        elementsIds.forEach((elId) => {
+          const div = window.document.getElementById(elId);
+          if (div) {
+            div.remove();
+          }
+        });
+        selectedAnnoteEl = undefined;
+      };
+      // Attach click handler to the DT
+      const annoteDls = window.document.querySelectorAll('dt[data-target-cell]');
+      for (const annoteDlNode of annoteDls) {
+        annoteDlNode.addEventListener('click', (event) => {
+          const clickedEl = event.target;
+          if (clickedEl !== selectedAnnoteEl) {
+            unselectCodeLines();
+            const activeEl = window.document.querySelector('dt[data-target-cell].code-annotation-active');
+            if (activeEl) {
+              activeEl.classList.remove('code-annotation-active');
+            }
+            selectCodeLines(clickedEl);
+            clickedEl.classList.add('code-annotation-active');
+          } else {
+            // Unselect the line
+            unselectCodeLines();
+            clickedEl.classList.remove('code-annotation-active');
+          }
+        });
+      }
+  const findCites = (el) => {
+    const parentEl = el.parentElement;
+    if (parentEl) {
+      const cites = parentEl.dataset.cites;
+      if (cites) {
+        return {
+          el,
+          cites: cites.split(' ')
+        };
+      } else {
+        return findCites(el.parentElement)
+      }
+    } else {
+      return undefined;
+    }
+  };
+  var bibliorefs = window.document.querySelectorAll('a[role="doc-biblioref"]');
+  for (var i=0; i<bibliorefs.length; i++) {
+    const ref = bibliorefs[i];
+    const citeInfo = findCites(ref);
+    if (citeInfo) {
+      tippyHover(citeInfo.el, function() {
+        var popup = window.document.createElement('div');
+        citeInfo.cites.forEach(function(cite) {
+          var citeDiv = window.document.createElement('div');
+          citeDiv.classList.add('hanging-indent');
+          citeDiv.classList.add('csl-entry');
+          var biblioDiv = window.document.getElementById('ref-' + cite);
+          if (biblioDiv) {
+            citeDiv.innerHTML = biblioDiv.innerHTML;
+          }
+          popup.appendChild(citeDiv);
+        });
+        return popup.innerHTML;
+      });
+    }
+  }
+});
+</script>
+<nav class="page-navigation">
+  <div class="nav-page nav-page-previous">
+      <a href="./intro.html" class="pagination-link">
+        <i class="bi bi-arrow-left-short"></i> <span class="nav-page-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Introduction to Tidymodels</span></span>
+      </a>          
+  </div>
+  <div class="nav-page nav-page-next">
+  </div>
+</nav>
+</div> <!-- /content -->
+
+
+
+</body></html>
\ No newline at end of file
diff --git a/session-tidymodels/case-study_files/figure-html/data-split-1.png b/session-tidymodels/docs/case-study_files/figure-html/data-split-1.png
similarity index 100%
rename from session-tidymodels/case-study_files/figure-html/data-split-1.png
rename to session-tidymodels/docs/case-study_files/figure-html/data-split-1.png
diff --git a/session-tidymodels/case-study_files/figure-html/load-data-1.png b/session-tidymodels/docs/case-study_files/figure-html/load-data-1.png
similarity index 100%
rename from session-tidymodels/case-study_files/figure-html/load-data-1.png
rename to session-tidymodels/docs/case-study_files/figure-html/load-data-1.png
diff --git a/session-tidymodels/case-study_files/figure-html/load-data-2.png b/session-tidymodels/docs/case-study_files/figure-html/load-data-2.png
similarity index 100%
rename from session-tidymodels/case-study_files/figure-html/load-data-2.png
rename to session-tidymodels/docs/case-study_files/figure-html/load-data-2.png
diff --git a/session-tidymodels/docs/case-study_files/figure-html/unnamed-chunk-4-1.png b/session-tidymodels/docs/case-study_files/figure-html/unnamed-chunk-4-1.png
new file mode 100644
index 0000000..3188bd1
Binary files /dev/null and b/session-tidymodels/docs/case-study_files/figure-html/unnamed-chunk-4-1.png differ
diff --git a/session-tidymodels/docs/case-study_files/figure-html/unnamed-chunk-4-2.png b/session-tidymodels/docs/case-study_files/figure-html/unnamed-chunk-4-2.png
new file mode 100644
index 0000000..715bffe
Binary files /dev/null and b/session-tidymodels/docs/case-study_files/figure-html/unnamed-chunk-4-2.png differ
diff --git a/session-tidymodels/docs/case-study_files/figure-html/unnamed-chunk-4-3.png b/session-tidymodels/docs/case-study_files/figure-html/unnamed-chunk-4-3.png
new file mode 100644
index 0000000..4371bca
Binary files /dev/null and b/session-tidymodels/docs/case-study_files/figure-html/unnamed-chunk-4-3.png differ
diff --git a/session-tidymodels/docs/index.html b/session-tidymodels/docs/index.html
index 6ff62b1..49330c9 100644
--- a/session-tidymodels/docs/index.html
+++ b/session-tidymodels/docs/index.html
@@ -108,7 +108,7 @@
         <li class="sidebar-item">
   <div class="sidebar-item-container"> 
   <a href="./intro.html" class="sidebar-item-text sidebar-link">
- <span class="menu-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Tidymodels</span></span></a>
+ <span class="menu-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Introduction to Tidymodels</span></span></a>
   </div>
 </li>
         <li class="sidebar-item">
@@ -116,12 +116,6 @@
   <a href="./case-study.html" class="sidebar-item-text sidebar-link">
  <span class="menu-text"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">Demo: a predictive modelling case study</span></span></a>
   </div>
-</li>
-        <li class="sidebar-item">
-  <div class="sidebar-item-container"> 
-  <a href="./exercises.html" class="sidebar-item-text sidebar-link">
- <span class="menu-text">Exercises</span></a>
-  </div>
 </li>
     </ul>
     </div>
@@ -168,6 +162,7 @@ <h1 class="unnumbered">Preface</h1>
 <p><strong>Aims</strong></p>
 <ul>
 <li>to introduce <code>tidymodels</code> framework for predictive modelling studies</li>
+<li>to show how to put together the common steps for building a predictive model</li>
 </ul>
 <p><strong>Learning outcomes</strong></p>
 <ul>
@@ -418,7 +413,7 @@ <h1 class="unnumbered">Preface</h1>
   </div>
   <div class="nav-page nav-page-next">
       <a href="./intro.html" class="pagination-link">
-        <span class="nav-page-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Tidymodels</span></span> <i class="bi bi-arrow-right-short"></i>
+        <span class="nav-page-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Introduction to Tidymodels</span></span> <i class="bi bi-arrow-right-short"></i>
       </a>
   </div>
 </nav>
diff --git a/session-tidymodels/docs/intro.html b/session-tidymodels/docs/intro.html
index 8006d5c..e9c916c 100644
--- a/session-tidymodels/docs/intro.html
+++ b/session-tidymodels/docs/intro.html
@@ -7,7 +7,7 @@
 <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
 
 
-<title>Tidymodels - 1&nbsp; Tidymodels</title>
+<title>Tidymodels - 1&nbsp; Introduction to Tidymodels</title>
 <style>
 code{white-space: pre-wrap;}
 span.smallcaps{font-variant: small-caps;}
@@ -93,7 +93,7 @@
       <button type="button" class="quarto-btn-toggle btn" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar,#quarto-sidebar-glass" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">
         <i class="bi bi-layout-text-sidebar-reverse"></i>
       </button>
-      <nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"><li class="breadcrumb-item"><a href="./intro.html"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Tidymodels</span></a></li></ol></nav>
+      <nav class="quarto-page-breadcrumbs" aria-label="breadcrumb"><ol class="breadcrumb"><li class="breadcrumb-item"><a href="./intro.html"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Introduction to Tidymodels</span></a></li></ol></nav>
       <a class="flex-grow-1" role="button" data-bs-toggle="collapse" data-bs-target="#quarto-sidebar,#quarto-sidebar-glass" aria-controls="quarto-sidebar" aria-expanded="false" aria-label="Toggle sidebar navigation" onclick="if (window.quartoToggleHeadroom) { window.quartoToggleHeadroom(); }">      
       </a>
       <button type="button" class="btn quarto-search-button" aria-label="" onclick="window.quartoOpenSearch();">
@@ -127,7 +127,7 @@
         <li class="sidebar-item">
   <div class="sidebar-item-container"> 
   <a href="./intro.html" class="sidebar-item-text sidebar-link active">
- <span class="menu-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Tidymodels</span></span></a>
+ <span class="menu-text"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Introduction to Tidymodels</span></span></a>
   </div>
 </li>
         <li class="sidebar-item">
@@ -135,12 +135,6 @@
   <a href="./case-study.html" class="sidebar-item-text sidebar-link">
  <span class="menu-text"><span class="chapter-number">2</span>&nbsp; <span class="chapter-title">Demo: a predictive modelling case study</span></span></a>
   </div>
-</li>
-        <li class="sidebar-item">
-  <div class="sidebar-item-container"> 
-  <a href="./exercises.html" class="sidebar-item-text sidebar-link">
- <span class="menu-text">Exercises</span></a>
-  </div>
 </li>
     </ul>
     </div>
@@ -161,7 +155,7 @@ <h2 id="toc-title">Table of contents</h2>
 
 <header id="title-block-header" class="quarto-title-block default">
 <div class="quarto-title">
-<h1 class="title"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Tidymodels</span></h1>
+<h1 class="title"><span class="chapter-number">1</span>&nbsp; <span class="chapter-title">Introduction to Tidymodels</span></h1>
 </div>
 
 
@@ -218,7 +212,7 @@ <h1 class="title"><span class="chapter-number">1</span>&nbsp; <span class="chapt
 </table>
 <section id="references" class="level2 unnumbered">
 <h2 class="unnumbered anchored" data-anchor-id="references">References</h2>
-<div id="refs" role="list">
+<div id="refs" role="list" style="display: none">
 
 </div>
 <!-- filtering non-informative features (variance threshold, univariate etc.) -->
diff --git a/session-tidymodels/docs/search.json b/session-tidymodels/docs/search.json
new file mode 100644
index 0000000..4d0eace
--- /dev/null
+++ b/session-tidymodels/docs/search.json
@@ -0,0 +1,44 @@
+[
+  {
+    "objectID": "index.html",
+    "href": "index.html",
+    "title": "Tidymodels",
+    "section": "",
+    "text": "Preface\nAims\n\nto introduce tidymodels framework for predictive modelling studies\nto show how to put together the common steps for building a predictive model\n\nLearning outcomes\n\nto be able to use tidymodels framework for a complete workflow of supervised learning\n\nDo you see a mistake or a typo? We would be grateful if you let us know via edu.ml-biostats@nbis.se\nThis repository contains teaching and learning materials prepared for and used during “Introduction to biostatistics and Machine Learning” course, organized by NBIS, National Bioinformatics Infrastructure Sweden. The course is open for PhD students, postdoctoral researcher and other employees within Swedish universities. The materials are geared towards life scientists wanting to be able to understand and use the basic statistical and machine learning methods. More about the course https://nbisweden.github.io/workshop-mlbiostatistics/"
+  },
+  {
+    "objectID": "intro.html#references",
+    "href": "intro.html#references",
+    "title": "1  Introduction to Tidymodels",
+    "section": "References",
+    "text": "References"
+  },
+  {
+    "objectID": "case-study.html#data-import-eda",
+    "href": "case-study.html#data-import-eda",
+    "title": "2  Demo: a predictive modelling case study",
+    "section": "2.1 Data import & EDA",
+    "text": "2.1 Data import & EDA\n\n# load libraries\nlibrary(tidyverse)\nlibrary(tidymodels)\nlibrary(ggcorrplot)\nlibrary(reshape2)\nlibrary(vip)\n\n# import raw data\ninput_diabetes &lt;- read_csv(\"data/data-diabetes.csv\")\n\n# create BMI variable\nconv_factor &lt;- 703 # conversion factor to calculate BMI from inches and pounds BMI = weight (lb) / [height (in)]2 x 703\ndata_diabetes &lt;- input_diabetes %&gt;%\n  mutate(BMI = weight / height^2 * 703, BMI = round(BMI, 2)) %&gt;%\n  relocate(BMI, .after = id)\n\n# preview data\nglimpse(data_diabetes)\n## Rows: 403\n## Columns: 20\n## $ id       &lt;dbl&gt; 1000, 1001, 1002, 1003, 1005, 1008, 1011, 1015, 1016, 1022, 1…\n## $ BMI      &lt;dbl&gt; 22.13, 37.42, 48.37, 18.64, 27.82, 26.50, 28.20, 34.33, 24.51…\n## $ chol     &lt;dbl&gt; 203, 165, 228, 78, 249, 248, 195, 227, 177, 263, 242, 215, 23…\n## $ stab.glu &lt;dbl&gt; 82, 97, 92, 93, 90, 94, 92, 75, 87, 89, 82, 128, 75, 79, 76, …\n## $ hdl      &lt;dbl&gt; 56, 24, 37, 12, 28, 69, 41, 44, 49, 40, 54, 34, 36, 46, 30, 4…\n## $ ratio    &lt;dbl&gt; 3.6, 6.9, 6.2, 6.5, 8.9, 3.6, 4.8, 5.2, 3.6, 6.6, 4.5, 6.3, 6…\n## $ glyhb    &lt;dbl&gt; 4.31, 4.44, 4.64, 4.63, 7.72, 4.81, 4.84, 3.94, 4.84, 5.78, 4…\n## $ location &lt;chr&gt; \"Buckingham\", \"Buckingham\", \"Buckingham\", \"Buckingham\", \"Buck…\n## $ age      &lt;dbl&gt; 46, 29, 58, 67, 64, 34, 30, 37, 45, 55, 60, 38, 27, 40, 36, 3…\n## $ gender   &lt;chr&gt; \"female\", \"female\", \"female\", \"male\", \"male\", \"male\", \"male\",…\n## $ height   &lt;dbl&gt; 62, 64, 61, 67, 68, 71, 69, 59, 69, 63, 65, 58, 60, 59, 69, 6…\n## $ weight   &lt;dbl&gt; 121, 218, 256, 119, 183, 190, 191, 170, 166, 202, 156, 195, 1…\n## $ frame    &lt;chr&gt; \"medium\", \"large\", \"large\", \"large\", \"medium\", \"large\", \"medi…\n## $ bp.1s    &lt;dbl&gt; 118, 112, 190, 110, 138, 132, 161, NA, 160, 108, 130, 102, 13…\n## $ bp.1d    &lt;dbl&gt; 59, 68, 92, 50, 80, 86, 112, NA, 80, 72, 90, 68, 80, NA, 66, …\n## $ bp.2s    
&lt;dbl&gt; NA, NA, 185, NA, NA, NA, 161, NA, 128, NA, 130, NA, NA, NA, N…\n## $ bp.2d    &lt;dbl&gt; NA, NA, 92, NA, NA, NA, 112, NA, 86, NA, 90, NA, NA, NA, NA, …\n## $ waist    &lt;dbl&gt; 29, 46, 49, 33, 44, 36, 46, 34, 34, 45, 39, 42, 35, 37, 36, 3…\n## $ hip      &lt;dbl&gt; 38, 48, 57, 38, 41, 42, 49, 39, 40, 50, 45, 50, 41, 43, 40, 4…\n## $ time.ppn &lt;dbl&gt; 720, 360, 180, 480, 300, 195, 720, 1020, 300, 240, 300, 90, 7…\n\n# run basic EDA\n# note: we have seen descriptive statistics and plots during EDA session \n# note: so here we only look at missing data and correlation\n\n# calculate number of missing data per variable\ndata_na &lt;- data_diabetes %&gt;% \n  summarise(across(everything(), ~ sum(is.na(.)))) \n\n# make a table with counts sorted from highest to lowest\ndata_na_long &lt;- data_na %&gt;%\n  pivot_longer(-id, names_to = \"variable\", values_to = \"count\") %&gt;%\n  arrange(desc(count)) \n\n# make a column plot to visualize the counts\ndata_na_long %&gt;%\n  ggplot(aes(x = variable, y = count)) + \n  geom_col(fill = \"blue4\") + \n  xlab(\"\") + \n  theme_bw() +\n  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust=1))\n\n# calculate correlation between numeric variables\ndata_cor &lt;- data_diabetes %&gt;% \n  dplyr::select(-id) %&gt;% \n  dplyr::select(where(is.numeric)) %&gt;%\n  cor(use = \"pairwise.complete.obs\")\n\n# visualize correlation via heatmap\nggcorrplot(data_cor, hc.order = TRUE, lab = FALSE)\n\n# based on the number of missing data, let's delete bp.2s, bp.2d\n# and use complete-cases analysis \ndata_diabetes_narm &lt;- data_diabetes %&gt;%\n  dplyr::select(-bp.2s, -bp.2d) %&gt;%\n  na.omit()\n\n\n\n\nNumber of missing data per variable, shows that bp.2d and bp.2s have more than 50% missing entries\n\n\n\n\n\n\n\nHeatmap visualizing Pearson correlation coefficient between numerical variables"
+  },
+  {
+    "objectID": "case-study.html#data-splitting",
+    "href": "case-study.html#data-splitting",
+    "title": "2  Demo: a predictive modelling case study",
+    "section": "2.2 Data splitting",
+    "text": "2.2 Data splitting\n\n# use tidymodels framework to fit Lasso regression model for predicting BMI\n# using repeated cross-validation to tune lambda value in L1 penalty term\n\n# select random seed value\nmyseed &lt;- 123\n\n# split data into non-test (other) and test (80% s)\nset.seed(myseed)\ndata_split &lt;- initial_split(data_diabetes_narm, strata = BMI, prop = 0.8) # holds splitting info\ndata_other &lt;- data_split %&gt;% training() # creates non-test set (function is called training but it refers to non-test part)\ndata_test &lt;- data_split %&gt;% testing() # creates test set\n\n# prepare repeated cross-validation splits with 5 folds repeated 3 times\nset.seed(myseed)\ndata_folds &lt;- vfold_cv(data_other,\n                       v = 5, \n                       repeats = 3,\n                       strata = BMI)\n\n# check the split\ndim(data_diabetes)\n## [1] 403  20\ndim(data_other)\n## [1] 291  18\ndim(data_test)\n## [1] 75 18\n\n# check BMI distributions in data splits\npar(mfrow=c(3,1))\nhist(data_diabetes$BMI, xlab = \"\", main = \"BMI: all\", 50)\nhist(data_other$BMI, xlab = \"\", main = \"BMI: non-test\", 50)\nhist(data_test$BMI, xlab = \"\", main = \"BMI: test\", 50)\n\n\n\n\nDistribution of BMI values given all data and spits into non-test and test"
+  },
+  {
+    "objectID": "case-study.html#feature-engineering",
+    "href": "case-study.html#feature-engineering",
+    "title": "2  Demo: a predictive modelling case study",
+    "section": "2.3 Feature engineering",
+    "text": "2.3 Feature engineering\n\n# create data recipe (feature engineering)\n\ninch2m &lt;- 2.54/100\npound2kg &lt;- 0.45\n\ndata_recipe &lt;- recipe(BMI ~ ., data = data_other) %&gt;%\n  update_role(id, new_role = \"sampleID\") %&gt;%\n  step_mutate(height = height * inch2m, height = round(height, 2)) %&gt;% # convert height to meters\n  step_mutate(weight = weight * pound2kg, weight = round(weight, 2)) %&gt;% # convert weight to kg\n  step_rename(glu = stab.glu) %&gt;% # rename stab.glu to glu\n  step_log(glu) %&gt;%  #ln transform glucose\n  step_zv(all_numeric()) %&gt;% # removes variables that are highly sparse and unbalanced (if found)\n  step_corr(all_numeric(), -all_outcomes(), -has_role(\"sampleID\"), threshold = 0.8) %&gt;% # removes variables with large absolute correlations with other variables (if found)\n  step_dummy(location, gender, frame) %&gt;% # convert categorical variables to dummy variables\n  step_normalize(all_numeric(), -all_outcomes(), -has_role(\"sampleID\"), skip = FALSE) \n  \n  # you can implement more steps: see https://recipes.tidymodels.org/reference/index.html\n\n# print recipe\ndata_recipe\n\n# check if recipe is doing what it is supposed to do\n# i.e. 
bake the data\ndata_other_prep &lt;- data_recipe %&gt;%\n  prep() %&gt;%\n  bake(new_data = NULL)\n\n## bake test data\ndata_test_prep &lt;- data_recipe %&gt;%\n  prep() %&gt;%\n  bake(new_data = data_test)\n\n# preview baked data\nprint(head(data_other_prep))\n## # A tibble: 6 × 17\n##      id   chol    glu    hdl  ratio  glyhb    age  height  bp.1s  bp.1d    hip\n##   &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;   &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;  &lt;dbl&gt;\n## 1  1045 -0.319 -0.539 -0.830  0.486 -0.172 -0.645 -0.474  -1.16  -0.552 -1.66 \n## 2  1271  0.449 -1.09  -0.324  0.320 -0.458 -1.39  -1.29   -1.59  -1.00  -0.914\n## 3  1277 -0.657 -0.572  2.32  -1.45  -0.642 -0.337  1.56    0.286  2.15  -1.29 \n## 4  1303 -0.590 -0.410 -0.436 -0.178 -0.518  0.341  0.545  -0.309  0.500 -1.47 \n## 5  1309 -0.206 -0.347  0.687 -0.730 -0.860 -1.32   0.0357 -0.735 -0.402 -1.66 \n## 6  1315 -0.793 -0.572  0.350 -0.841  0.225  0.649  1.26   -0.565 -1.45  -1.29 \n## # ℹ 6 more variables: time.ppn &lt;dbl&gt;, BMI &lt;dbl&gt;, location_Louisa &lt;dbl&gt;,\n## #   gender_male &lt;dbl&gt;, frame_medium &lt;dbl&gt;, frame_small &lt;dbl&gt;"
+  },
+  {
+    "objectID": "case-study.html#lasso-regression",
+    "href": "case-study.html#lasso-regression",
+    "title": "2  Demo: a predictive modelling case study",
+    "section": "2.4 Lasso regression",
+    "text": "2.4 Lasso regression\n\n# define model\nmodel &lt;- linear_reg(penalty = tune(), mixture = 1) %&gt;%\n  set_engine(\"glmnet\") %&gt;%\n  set_mode(\"regression\")\n\n# create workflow with data recipe and model \nwf &lt;- workflow() %&gt;%\n  add_model(model) %&gt;%\n  add_recipe(data_recipe)\n\n# define parameters range for tuning\ngrid_lambda &lt;- grid_regular(penalty(), levels = 25)\n\n# tune lambda\nmodel_tune &lt;- wf %&gt;%\n  tune_grid(resamples = data_folds, \n            grid = grid_lambda)\n\n# show metrics average across folds\nmodel_tune  %&gt;%\n  collect_metrics(summarize = TRUE)\n## # A tibble: 50 × 7\n##     penalty .metric .estimator  mean     n std_err .config              \n##       &lt;dbl&gt; &lt;chr&gt;   &lt;chr&gt;      &lt;dbl&gt; &lt;int&gt;   &lt;dbl&gt; &lt;chr&gt;                \n##  1 1   e-10 rmse    standard   2.48     15  0.142  Preprocessor1_Model01\n##  2 1   e-10 rsq     standard   0.851    15  0.0143 Preprocessor1_Model01\n##  3 2.61e-10 rmse    standard   2.48     15  0.142  Preprocessor1_Model02\n##  4 2.61e-10 rsq     standard   0.851    15  0.0143 Preprocessor1_Model02\n##  5 6.81e-10 rmse    standard   2.48     15  0.142  Preprocessor1_Model03\n##  6 6.81e-10 rsq     standard   0.851    15  0.0143 Preprocessor1_Model03\n##  7 1.78e- 9 rmse    standard   2.48     15  0.142  Preprocessor1_Model04\n##  8 1.78e- 9 rsq     standard   0.851    15  0.0143 Preprocessor1_Model04\n##  9 4.64e- 9 rmse    standard   2.48     15  0.142  Preprocessor1_Model05\n## 10 4.64e- 9 rsq     standard   0.851    15  0.0143 Preprocessor1_Model05\n## # ℹ 40 more rows\n\n# plot k-folds results across lambda range\nmodel_tune %&gt;%\n  collect_metrics() %&gt;% \n  dplyr::filter(.metric == \"rmse\") %&gt;% \n  ggplot(aes(penalty, mean, color = .metric)) +\n  geom_errorbar(aes( ymin = mean - std_err, ymax = mean + std_err), alpha = 0.5) +\n  scale_x_log10() + \n  geom_line(linewidth = 1.5) +\n  theme_bw() +\n  theme(legend.position = 
\"none\") +\n  scale_color_brewer(palette = \"Set1\")\n  \n# best lambda value (min. RMSE)\nmodel_best &lt;- model_tune %&gt;%\n  select_best(metric = \"rmse\")\n\nprint(model_best)\n## # A tibble: 1 × 2\n##   penalty .config              \n##     &lt;dbl&gt; &lt;chr&gt;                \n## 1  0.0562 Preprocessor1_Model22\n\n# finalize workflow with tuned model\nwf_final &lt;- wf %&gt;%\n  finalize_workflow(model_best)\n\n# last fit \nfit_final &lt;- wf_final %&gt;%\n  last_fit(split = data_split)\n\n# final predictions\ny_test_pred &lt;- fit_final %&gt;% collect_predictions() # predicted BMI\n\n# final predictions: performance on test (unseen data)\nfit_final %&gt;% collect_metrics() \n## # A tibble: 2 × 4\n##   .metric .estimator .estimate .config             \n##   &lt;chr&gt;   &lt;chr&gt;          &lt;dbl&gt; &lt;chr&gt;               \n## 1 rmse    standard       2.79  Preprocessor1_Model1\n## 2 rsq     standard       0.857 Preprocessor1_Model1\n\n# plot predictions vs. actual for test data\nplot(data_test$BMI, y_test_pred$.pred, xlab=\"BMI (actual)\", ylab = \"BMI (predicted)\", las = 1, pch = 19)\n\n# correlation between predicted and actual BMI values for test data\ncor(data_test$BMI, y_test_pred$.pred)\n## [1] 0.9256857\n\n# re-fit model on all non-test data\nmodel_final &lt;- wf_final %&gt;%\n  fit(data_other) \n\n# show final model\ntidy(model_final)\n## # A tibble: 16 × 3\n##    term            estimate penalty\n##    &lt;chr&gt;              &lt;dbl&gt;   &lt;dbl&gt;\n##  1 (Intercept)      28.7     0.0562\n##  2 chol              0       0.0562\n##  3 glu               0.0229  0.0562\n##  4 hdl               0       0.0562\n##  5 ratio             0.335   0.0562\n##  6 glyhb            -0.0512  0.0562\n##  7 age              -0.257   0.0562\n##  8 height           -1.86    0.0562\n##  9 bp.1s            -0.294   0.0562\n## 10 bp.1d             0.203   0.0562\n## 11 hip               5.60    0.0562\n## 12 time.ppn          0       0.0562\n## 13 
location_Louisa  -0.160   0.0562\n## 14 gender_male       1.07    0.0562\n## 15 frame_medium     -0.320   0.0562\n## 16 frame_small      -0.530   0.0562\n\n# plot variables ordered by importance (highest abs(coeff))\nmodel_final %&gt;%\n  extract_fit_parsnip() %&gt;%\n  vip(geom = \"point\") + \n  theme_bw()\n\n\n\n\nMean RMSE plus/minus standard error across repeated cross-validation folds as a function of lambda values\n\n\n\n\n\n\n\nPredicted BMI values against actual BMI values using final model for predicting test (unseen) data\n\n\n\n\n\n\n\nTop feature of importance, here measured as the features with highest absolute value of the Lasso regression coefficients from the final tuned model"
+  }
+]
\ No newline at end of file
diff --git a/session-tidymodels/index.qmd b/session-tidymodels/index.qmd
index c9410f9..9b2c4cc 100644
--- a/session-tidymodels/index.qmd
+++ b/session-tidymodels/index.qmd
@@ -3,6 +3,7 @@
 **Aims**
 
 - to introduce `tidymodels` framework for predictive modelling studies
+- to show how to put together the common steps for building a predictive model
 
 **Learning outcomes**
 
diff --git a/session-tidymodels/intro.qmd b/session-tidymodels/intro.qmd
index 99b1431..17f0618 100644
--- a/session-tidymodels/intro.qmd
+++ b/session-tidymodels/intro.qmd
@@ -4,7 +4,7 @@ editor_options:
   chunk_output_type: console
 ---
 
-# Tidymodels
+# Introduction to Tidymodels
 
 We have seen that there are many common steps when using supervised learning for prediction, such as data splitting and parameters tuning. Over the years, some initiatives were taken to create a common framework for the machine learning tasks in R. A while back Max Kuhn was the main developer behind a popular `caret` package that among others enabled feature engineering and control of training parameters like cross-validation. In 2020 `tidymodels`framework was introduced as a collection of R packages for modeling and machine learning using tidyverse principles, under a guidance of Max Kuhn and Hadley Wickham, author of `tidyverse` package.
 
diff --git a/session-tidymodels/session-feature-selection.Rproj b/session-tidymodels/tidymodels.Rproj
similarity index 100%
rename from session-tidymodels/session-feature-selection.Rproj
rename to session-tidymodels/tidymodels.Rproj