Update documentation
ethanweed committed Apr 26, 2024
1 parent e5d1fc9 commit f03ae59
Showing 5 changed files with 229 additions and 183 deletions.
40 changes: 20 additions & 20 deletions 05.04-regression.html
@@ -3021,19 +3021,21 @@ <h3><span class="section-number">16.9.1. </span>Three kinds of residuals<a class

<span class="n">mod2</span> <span class="o">=</span> <span class="n">pg</span><span class="o">.</span><span class="n">linear_regression</span><span class="p">(</span><span class="n">predictors</span><span class="p">,</span> <span class="n">outcome</span><span class="p">,</span> <span class="n">as_dataframe</span> <span class="o">=</span> <span class="kc">False</span><span class="p">)</span>

<span class="c1">#df_slplot = pd.DataFrame(</span>
<span class="c1"># {&#39;fitted&#39;: mod2[&#39;pred&#39;],</span>
<span class="c1"># &#39;sqrt_abs_stand_res&#39;: np.sqrt(np.abs(mod2[&#39;residuals&#39;]))</span>
<span class="c1"># })</span>

<span class="n">SS_resid</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">sum</span><span class="p">(</span><span class="n">mod2</span><span class="p">[</span><span class="s2">&quot;residuals&quot;</span><span class="p">]</span><span class="o">**</span><span class="mi">2</span><span class="p">)</span>
<span class="n">n</span> <span class="o">=</span> <span class="nb">len</span><span class="p">(</span><span class="n">mod2</span><span class="p">[</span><span class="s1">&#39;residuals&#39;</span><span class="p">])</span>
<span class="n">p</span> <span class="o">=</span> <span class="mi">2</span> <span class="c1"># bcs 2 predictors = &#39;dan_sleep&#39; and &#39;baby_sleep&#39; </span>
<span class="n">sigmahat</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="nb">abs</span><span class="p">(</span> <span class="n">SS_resid</span><span class="o">/</span><span class="p">(</span><span class="n">n</span><span class="o">-</span><span class="n">p</span><span class="o">-</span><span class="mi">1</span><span class="p">)</span> <span class="p">))</span>
<span class="n">stand_res</span> <span class="o">=</span> <span class="n">mod2</span><span class="p">[</span><span class="s1">&#39;residuals&#39;</span><span class="p">]</span> <span class="o">/</span> <span class="n">sigmahat</span>
<span class="n">df_slplot</span> <span class="o">=</span> <span class="n">pd</span><span class="o">.</span><span class="n">DataFrame</span><span class="p">(</span>
<span class="p">{</span><span class="s1">&#39;fitted&#39;</span><span class="p">:</span> <span class="n">mod2</span><span class="p">[</span><span class="s1">&#39;pred&#39;</span><span class="p">],</span>
<span class="s1">&#39;sqrt_abs_stand_res&#39;</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">mod2</span><span class="p">[</span><span class="s1">&#39;residuals&#39;</span><span class="p">]))</span>
<span class="s1">&#39;sqrt_abs_stand_res&#39;</span><span class="p">:</span> <span class="n">np</span><span class="o">.</span><span class="n">sqrt</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">abs</span><span class="p">(</span><span class="n">stand_res</span><span class="p">))</span>
<span class="p">})</span>

<span class="c1">#SS_resid = np.sum(mod2[&quot;residuals&quot;]**2)</span>
<span class="c1">#p = 2 # bcs 2 predictors = &#39;dan_sleep&#39; and &#39;baby_sleep&#39; </span>
<span class="c1">#sigmahat = np.sqrt(abs(( SS_resid/(n-p-1) )))</span>
<span class="c1">#stand_res = mod2[&#39;residuals&#39;] / sigmahat</span>
<span class="c1">#df_slplot = pd.DataFrame(</span>
<span class="c1"># {&#39;fitted&#39;: mod2[&#39;pred&#39;],</span>
<span class="c1"># &#39;sqrt_abs_stand_res&#39;: np.sqrt(np.abs(stand_res))</span>
<span class="c1"># })</span>

<span class="n">fig</span> <span class="o">=</span> <span class="n">plt</span><span class="o">.</span><span class="n">figure</span><span class="p">()</span>

@@ -3047,21 +3049,19 @@ <h3><span class="section-number">16.9.1. </span>Three kinds of residuals<a class
<span class="n">sns</span><span class="o">.</span><span class="n">despine</span><span class="p">()</span>



<span class="c1"># Plot figure in book, with caption</span>

<span class="n">glue</span><span class="p">(</span><span class="s2">&quot;sl-plot-fig&quot;</span><span class="p">,</span> <span class="n">fig</span><span class="p">,</span> <span class="n">display</span><span class="o">=</span><span class="kc">False</span><span class="p">)</span>
<span class="n">plt</span><span class="o">.</span><span class="n">close</span><span class="p">(</span><span class="n">fig</span><span class="p">)</span>
<span class="c1">#glue(&quot;sl-plot-fig&quot;, fig, display=False)</span>
<span class="c1">#plt.close(fig)</span>
</pre></div>
</div>
</div>
</details>
<div class="cell_output docutils container">
<img alt="_images/e75a129aca481a4438eb01b826064a6e616131da31107f00143325d380a62677.png" src="_images/e75a129aca481a4438eb01b826064a6e616131da31107f00143325d380a62677.png" />
</div>
</div>
<figure class="align-default" id="fig-sl-plot" style="width: 600px">
<img alt="_images/7f7ccd94eaa177f7634f35b0e877a1def5aadaffa4c9cb88071efd639610c042.png" src="_images/7f7ccd94eaa177f7634f35b0e877a1def5aadaffa4c9cb88071efd639610c042.png" />
<figcaption>
<p><span class="caption-number">Fig. 16.12 </span><span class="caption-text">Plot of the fitted values (model predictions) against the square root of the absolute standardised residuals. This plot is used to diagnose violations of homogeneity of variance. If the variance is really constant, then the line through the middle should be roughly horizontal.</span><a class="headerlink" href="#fig-sl-plot" title="Permalink to this image">#</a></p>
</figcaption>
</figure>
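<p>The code for this figure divides each residual by the estimated residual standard deviation alone. A common definition of the standardised residual also adjusts for each observation's leverage, and the two versions can be compared side by side. What follows is only a minimal sketch under stated assumptions: the function name <code class="docutils literal notranslate"><span class="pre">standardised_residuals</span></code> and the simulated data are hypothetical and are not part of the book's code.</p>

```python
import numpy as np


def standardised_residuals(X, y):
    """Return (raw, scaled, standardised) residuals for an OLS fit of y on X.

    X is an (n, p) array of predictors WITHOUT an intercept column.
    This helper is a hypothetical illustration, not the book's own code.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape

    # Design matrix with an intercept column in front of the predictors.
    design = np.column_stack([np.ones(n), X])

    # Ordinary least squares fit and raw residuals.
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    raw = y - design @ beta

    # Residual standard deviation, with n - p - 1 degrees of freedom.
    sigmahat = np.sqrt(np.sum(raw**2) / (n - p - 1))

    # Leverage values: the diagonal of the hat matrix, via a thin QR.
    Q, _ = np.linalg.qr(design)
    leverage = np.sum(Q**2, axis=1)

    scaled = raw / sigmahat                                   # divide by sigmahat only
    standardised = raw / (sigmahat * np.sqrt(1 - leverage))   # leverage-adjusted
    return raw, scaled, standardised


# Simulated data (an assumption made purely for illustration).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 2))
y_demo = 1.0 + X_demo @ np.array([2.0, -0.5]) + rng.normal(size=100)

raw, scaled, standardised = standardised_residuals(X_demo, y_demo)
```

<p>Because every leverage value lies between 0 and 1, the leverage-adjusted residuals are never smaller in magnitude than the ones scaled by the standard deviation alone. For well-behaved data with no high-leverage points the two are usually close, so a plot like the one above should look similar either way.</p>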
<p>A slightly more formal approach is to run a hypothesis test such as the Breusch–Pagan test <span id="id13">[<a class="reference internal" href="bibliography.html#id85" title="Trevor S Breusch and Adrian R Pagan. A simple test for heteroscedasticity and random coefficient variation. Econometrica: Journal of the econometric society, pages 1287–1294, 1979.">Breusch and Pagan, 1979</a>]</span>. Unfortunately, to run the test, we’ll have to leave the cozy world of <code class="docutils literal notranslate"><span class="pre">pingouin</span></code> and use <code class="docutils literal notranslate"><span class="pre">statsmodels</span></code> instead, but luckily the code is not complicated. Basically, we just run our regression in <code class="docutils literal notranslate"><span class="pre">statsmodels</span></code> instead of <code class="docutils literal notranslate"><span class="pre">pingouin</span></code>, and then run the Breusch–Pagan test on the output (see below). I won’t go into how it works in detail, other than to say that after we fit our regression model, we then fit <em>another</em> regression model in which we try to predict the squared residuals from the first model (in the <code class="docutils literal notranslate"><span class="pre">statsmodels</span></code> version, using the original predictors). We then use something called the “Lagrange multiplier statistic” (which is compared against a <span class="math notranslate nohighlight">\(\chi^2\)</span> distribution) to check the significance of our new model. If the new model is not significant, we have no evidence that the residual variance changes with the predictors, which is consistent with the assumption of homogeneity of variance.</p>
<div class="cell docutils container">
<div class="cell_input docutils container">
@@ -3192,7 +3192,7 @@ <h3><span class="section-number">16.9.1. </span>Three kinds of residuals<a class
<th>Date:</th> <td>Fri, 26 Apr 2024</td> <th> Prob (F-statistic):</th> <td>2.15e-36</td>
</tr>
<tr>
<th>Time:</th> <td>15:41:47</td> <th> Log-Likelihood: </th> <td> -287.48</td>
<th>Time:</th> <td>20:17:20</td> <th> Log-Likelihood: </th> <td> -287.48</td>
</tr>
<tr>
<th>No. Observations:</th> <td> 100</td> <th> AIC: </th> <td> 581.0</td>
@@ -3260,7 +3260,7 @@ <h3><span class="section-number">16.9.1. </span>Three kinds of residuals<a class
<th>Date:</th> <td>Fri, 26 Apr 2024</td> <th> Prob (F-statistic):</th> <td>2.78e-35</td>
</tr>
<tr>
<th>Time:</th> <td>15:41:47</td> <th> Log-Likelihood: </th> <td> -287.48</td>
<th>Time:</th> <td>20:17:20</td> <th> Log-Likelihood: </th> <td> -287.48</td>
</tr>
<tr>
<th>No. Observations:</th> <td> 100</td> <th> AIC: </th> <td> 581.0</td>
@@ -3727,7 +3727,7 @@ <h3><span class="section-number">16.10.1. </span>Backward elimination<a class="h
<th>Date:</th> <td>Fri, 26 Apr 2024</td> <th> Prob (F-statistic):</th> <td>3.42e-35</td>
</tr>
<tr>
<th>Time:</th> <td>15:41:47</td> <th> Log-Likelihood: </th> <td> -287.43</td>
<th>Time:</th> <td>20:17:20</td> <th> Log-Likelihood: </th> <td> -287.43</td>
</tr>
<tr>
<th>No. Observations:</th> <td> 100</td> <th> AIC: </th> <td> 582.9</td>