Merge branch 'master' of github.com:nxskok/lecture-notes

nxskok · Jan 16, 2024 · 82d4add
2 parents ded601b + ced6b64
commit 82d4add
Showing 255 changed files with 4,404 additions and 10,470 deletions.
diff --git a/_freeze/ancova/execute-results/html.json b/_freeze/ancova/execute-results/html.json
diff --git a/_freeze/ancova/execute-results/tex.json b/_freeze/ancova/execute-results/tex.json
diff --git a/_freeze/ancova/figure-beamer/ancova-plot-1.pdf b/_freeze/ancova/figure-beamer/ancova-plot-1.pdf
diff --git a/_freeze/ancova/figure-beamer/unnamed-chunk-4-1.pdf b/_freeze/ancova/figure-beamer/unnamed-chunk-4-1.pdf
diff --git a/_freeze/ancova/figure-beamer/unnamed-chunk-6-1.pdf b/_freeze/ancova/figure-beamer/unnamed-chunk-6-1.pdf
diff --git a/_freeze/ancova/figure-revealjs/unnamed-chunk-4-1.png b/_freeze/ancova/figure-revealjs/unnamed-chunk-4-1.png
diff --git a/_freeze/ancova/figure-revealjs/unnamed-chunk-6-1.png b/_freeze/ancova/figure-revealjs/unnamed-chunk-6-1.png
diff --git a/_freeze/asphalt/execute-results/html.json b/_freeze/asphalt/execute-results/html.json
diff --git a/_freeze/asphalt/execute-results/tex.json b/_freeze/asphalt/execute-results/tex.json
diff --git a/_freeze/asphalt/figure-beamer/asphalt-14-1.pdf b/_freeze/asphalt/figure-beamer/asphalt-14-1.pdf
diff --git a/_freeze/asphalt/figure-beamer/asphalt-15-1.pdf b/_freeze/asphalt/figure-beamer/asphalt-15-1.pdf
diff --git a/_freeze/asphalt/figure-beamer/asphalt-17-1.pdf b/_freeze/asphalt/figure-beamer/asphalt-17-1.pdf
diff --git a/_freeze/asphalt/figure-beamer/asphalt-39-1.pdf b/_freeze/asphalt/figure-beamer/asphalt-39-1.pdf
diff --git a/_freeze/asphalt/figure-beamer/asphalt-41-1.pdf b/_freeze/asphalt/figure-beamer/asphalt-41-1.pdf
diff --git a/_freeze/asphalt/figure-beamer/asphalt-5-1.pdf b/_freeze/asphalt/figure-beamer/asphalt-5-1.pdf
diff --git a/_freeze/asphalt/figure-beamer/asphalt-9-1.pdf b/_freeze/asphalt/figure-beamer/asphalt-9-1.pdf
diff --git a/_freeze/asphalt/figure-beamer/unnamed-chunk-1-1.pdf b/_freeze/asphalt/figure-beamer/unnamed-chunk-1-1.pdf
diff --git a/_freeze/asphalt/figure-beamer/unnamed-chunk-2-1.pdf b/_freeze/asphalt/figure-beamer/unnamed-chunk-2-1.pdf
diff --git a/_freeze/asphalt/figure-revealjs/unnamed-chunk-1-1.png b/_freeze/asphalt/figure-revealjs/unnamed-chunk-1-1.png
diff --git a/_freeze/asphalt/figure-revealjs/unnamed-chunk-2-1.png b/_freeze/asphalt/figure-revealjs/unnamed-chunk-2-1.png
diff --git a/_freeze/bootstrap_R/execute-results/html.json b/_freeze/bootstrap_R/execute-results/html.json
diff --git a/_freeze/bootstrap_R/execute-results/tex.json b/_freeze/bootstrap_R/execute-results/tex.json
diff --git a/_freeze/bootstrap_R/figure-beamer/bootstrap-R-10-1.pdf b/_freeze/bootstrap_R/figure-beamer/bootstrap-R-10-1.pdf
diff --git a/_freeze/bootstrap_R/figure-beamer/bootstrap-R-12-1.pdf b/_freeze/bootstrap_R/figure-beamer/bootstrap-R-12-1.pdf
diff --git a/_freeze/bootstrap_R/figure-beamer/bootstrap-R-15-1.pdf b/_freeze/bootstrap_R/figure-beamer/bootstrap-R-15-1.pdf
diff --git a/_freeze/bootstrap_R/figure-beamer/bootstrap-R-17-1.pdf b/_freeze/bootstrap_R/figure-beamer/bootstrap-R-17-1.pdf
diff --git a/_freeze/bootstrap_R/figure-beamer/bootstrap-R-19-1.pdf b/_freeze/bootstrap_R/figure-beamer/bootstrap-R-19-1.pdf
diff --git a/_freeze/choosing/execute-results/tex.json b/_freeze/choosing/execute-results/tex.json
diff --git a/_freeze/functions/execute-results/html.json b/_freeze/functions/execute-results/html.json
diff --git a/_freeze/functions/execute-results/tex.json b/_freeze/functions/execute-results/tex.json
diff --git a/_freeze/inference_3/execute-results/html.json b/_freeze/inference_3/execute-results/html.json
diff --git a/_freeze/inference_3/execute-results/tex.json b/_freeze/inference_3/execute-results/tex.json
diff --git a/_freeze/inference_3/figure-beamer/inference-3-R-30-1.pdf b/_freeze/inference_3/figure-beamer/inference-3-R-30-1.pdf
diff --git a/_freeze/inference_3/figure-beamer/inference-3-R-8-1.pdf b/_freeze/inference_3/figure-beamer/inference-3-R-8-1.pdf
diff --git a/_freeze/inference_5a/execute-results/html.json b/_freeze/inference_5a/execute-results/html.json
@@ -1,7 +1,7 @@
 {
-  "hash": "156b7a8df976c1e3705c3cbcf5629678",
+  "hash": "8f9406f10bd58457eddca218b9249988",
   "result": {
-    "markdown": "---\ntitle: \"Mood's Median Test\"\n---\n\n\n\n## Packages\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidyverse)\nlibrary(smmr)\n```\n:::\n\n\n\n## Two-sample test: What to do if normality fails\n\n- If normality fails (for one or both of the groups), what do we do then?\n- Again, can compare medians: use the thought process of the sign test,\nwhich does not depend on normality and is not damaged by outliers.\n- A suitable test called Mood’s median test.\n- Before we get to that, a diversion.\n\n## The chi-squared test for independence\n\nSuppose we want to know whether people are in favour of having\ndaylight savings time all year round. We ask 20 males and 20 females\nwhether they each agree with having DST all year round (“yes”) or\nnot (“no”). Some randomly chosen data:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_url <- \"http://ritsokiguess.site/datafiles/dst.txt\"\ndst <- read_delim(my_url,\" \")\ndst %>% slice_sample(n = 10)\n```\n\n::: {.cell-output-display}\n`````{=html}\n<div data-pagedtable=\"false\">\n  <script data-pagedtable-source type=\"application/json\">\n{\"columns\":[{\"label\":[\"gender\"],\"name\":[1],\"type\":[\"chr\"],\"align\":[\"left\"]},{\"label\":[\"agree\"],\"name\":[2],\"type\":[\"chr\"],\"align\":[\"left\"]}],\"data\":[{\"1\":\"female\",\"2\":\"yes\"},{\"1\":\"female\",\"2\":\"yes\"},{\"1\":\"female\",\"2\":\"no\"},{\"1\":\"female\",\"2\":\"yes\"},{\"1\":\"male\",\"2\":\"yes\"},{\"1\":\"female\",\"2\":\"yes\"},{\"1\":\"female\",\"2\":\"no\"},{\"1\":\"male\",\"2\":\"yes\"},{\"1\":\"male\",\"2\":\"yes\"},{\"1\":\"male\",\"2\":\"yes\"}],\"options\":{\"columns\":{\"min\":{},\"max\":[10]},\"rows\":{\"min\":[10],\"max\":[10]},\"pages\":{}}}\n  </script>\n</div>\n`````\n:::\n:::\n\n\n## ... 
continued\n\nCount up individuals in each category combination, and arrange in\ncontingency table:\n\n::: {.cell}\n\n```{.r .cell-code}\ntab <- with(dst, table(gender, agree))\ntab\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n        agree\ngender   no yes\n  female 11   9\n  male    3  17\n```\n:::\n:::\n\n\n- Most of the males say “yes”, but the females are about evenly split.\n- Looks like males more likely to say “yes”, ie. an association between\ngender and agreement.\n- Test an $H_0$ of “no association” (“independence”) vs. alternative that\nthere is really some association. \n- Done with `chisq.test`.\n\n## ...And finally\n\n\n::: {.cell}\n\n```{.r .cell-code}\nchisq.test(tab, correct=FALSE)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\n\tPearson's Chi-squared test\n\ndata:  tab\nX-squared = 7.033, df = 1, p-value = 0.008002\n```\n:::\n:::\n\n\n- Reject null hypothesis of no association (P-value 0.008)\n- therefore there is a difference in rates of agreement between (all)\nmales and females (or that gender and agreement are associated).\n- This calculation gives same answers as you would get by hand. (Omitting `correct = FALSE` uses “Yates correction”.\n\n## Mood’s median test\n- Earlier: compare medians of two groups.\n- Sign test: count number of values above and below something\n(there, hypothesized median).\n- Mood’s median test:\n  - Find \"grand median\" of all the data, regardless of group\n  - Count data values in each group above/below grand\nmedian.\n  - Make contingency table of group vs. above/below.\n  - Test for association.\n- If group medians equal, each group should have about half its\nobservations above/below grand median. 
If not, one group will be\nmostly above grand median and other below.\n\n## Mood’s median test for reading data\n\n\n::: {.cell}\n\n:::\n\n\n\n- Find overall median score: \n\n::: {.cell}\n\n```{.r .cell-code}\nkids %>% summarize(med=median(score)) %>% pull(med) -> m\nm\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 47\n```\n:::\n:::\n\n\n- Make table of above/below vs. group:\n\n::: {.cell}\n\n```{.r .cell-code}\ntab <- with(kids, table(group, score > m))\ntab\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n     \ngroup FALSE TRUE\n    c    15    8\n    t     7   14\n```\n:::\n:::\n\n\n\n- Treatment group scores mostly above median, control group scores\nmostly below, as expected.\n\n## The test\n- Do chi-squared test:\n\n::: {.cell}\n\n```{.r .cell-code}\nchisq.test(tab,correct=F)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\n\tPearson's Chi-squared test\n\ndata:  tab\nX-squared = 4.4638, df = 1, p-value = 0.03462\n```\n:::\n:::\n\n\n\n- This test actually two-sided (tests for any association). \n- Here want to test that new reading method *better* (one-sided).\n- Most of treatment children above overall median, so\ndo 1-sided test by halving P-value to get 0.017. 
\n- This way too, children do better at learning to read using the new\nmethod.\n\n## Or by smmr\n- `median_test` does the whole thing:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmedian_test(kids,score,group)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n$grand_median\n[1] 47\n\n$table\n     above\ngroup above below\n    c     8    15\n    t    14     7\n\n$test\n       what      value\n1 statistic 4.46376812\n2        df 1.00000000\n3   P-value 0.03462105\n```\n:::\n:::\n\n\n- P-value again two-sided.\n\n## Comments\n- P-value 0.013 for (1-sided) t-test, 0.017 for (1-sided) Mood median\ntest.\n- Like the sign test, Mood’s median test doesn’t use the data very\nefficiently (only, is each value above or below grand median).\n- Thus, if we can justify doing *t*-test, we should do it. This is the case\nhere.\n- The *t*-test will usually give smaller P-value because it uses the data\nmore efficiently.\n- The time to use Mood’s median test is if we are definitely unhappy\nwith the normality assumption (and thus the t-test P-value is not to\nbe trusted).\n\n",
+    "markdown": "---\ntitle: \"Mood's Median Test\"\neditor: \n  markdown: \n    wrap: 72\n---\n\n\n## Packages\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlibrary(tidyverse)\nlibrary(smmr)\n```\n:::\n\n\n## Two-sample test: What to do if normality fails\n\n-   If normality fails (for one or both of the groups), what do we do\n    then?\n-   Again, can compare medians: use the thought process of the sign\n    test, which does not depend on normality and is not damaged by\n    outliers.\n-   A suitable test called Mood's median test.\n-   Before we get to that, a diversion.\n\n## The chi-squared test for independence\n\nSuppose we want to know whether people are in favour of having daylight\nsavings time all year round. We ask 20 males and 20 females whether they\neach agree with having DST all year round (\"yes\") or not (\"no\"). Some\nrandomly chosen data:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmy_url <- \"http://ritsokiguess.site/datafiles/dst.txt\"\ndst <- read_delim(my_url,\" \")\ndst %>% slice_sample(n = 10)\n```\n\n::: {.cell-output-display}\n`````{=html}\n<div data-pagedtable=\"false\">\n  <script data-pagedtable-source type=\"application/json\">\n{\"columns\":[{\"label\":[\"gender\"],\"name\":[1],\"type\":[\"chr\"],\"align\":[\"left\"]},{\"label\":[\"agree\"],\"name\":[2],\"type\":[\"chr\"],\"align\":[\"left\"]}],\"data\":[{\"1\":\"male\",\"2\":\"yes\"},{\"1\":\"male\",\"2\":\"yes\"},{\"1\":\"female\",\"2\":\"yes\"},{\"1\":\"male\",\"2\":\"yes\"},{\"1\":\"male\",\"2\":\"yes\"},{\"1\":\"female\",\"2\":\"yes\"},{\"1\":\"female\",\"2\":\"no\"},{\"1\":\"male\",\"2\":\"yes\"},{\"1\":\"male\",\"2\":\"yes\"},{\"1\":\"male\",\"2\":\"yes\"}],\"options\":{\"columns\":{\"min\":{},\"max\":[10]},\"rows\":{\"min\":[10],\"max\":[10]},\"pages\":{}}}\n  </script>\n</div>\n`````\n:::\n:::\n\n\n## ... 
continued\n\nCount up individuals in each category combination, and arrange in\ncontingency table:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntab <- with(dst, table(gender, agree))\ntab\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n        agree\ngender   no yes\n  female 11   9\n  male    3  17\n```\n:::\n:::\n\n\n-   Most of the males say \"yes\", but the females are about evenly split.\n-   Looks like males more likely to say \"yes\", ie. an association\n    between gender and agreement.\n-   Test an $H_0$ of \"no association\" (\"independence\") vs. alternative\n    that there is really some association.\n-   Done with `chisq.test`.\n\n## ...And finally\n\n\n::: {.cell}\n\n```{.r .cell-code}\nchisq.test(tab, correct=FALSE)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\n\tPearson's Chi-squared test\n\ndata:  tab\nX-squared = 7.033, df = 1, p-value = 0.008002\n```\n:::\n:::\n\n\n-   Reject null hypothesis of no association (P-value 0.008)\n-   therefore there is a difference in rates of agreement between (all)\n    males and females (or that gender and agreement are associated).\n-   This calculation gives same answers as you would get by hand.\n    (Omitting `correct = FALSE` uses \"Yates correction\".\n\n## Mood's median test\n\n-   Earlier: compare medians of two groups.\n-   Sign test: count number of values above and below something (there,\n    hypothesized median).\n-   Mood's median test:\n    -   Find \"grand median\" of all the data, regardless of group\n    -   Count data values in each group above/below grand median.\n    -   Make contingency table of group vs. above/below.\n    -   Test for association.\n-   If group medians equal, each group should have about half its\n    observations above/below grand median. 
If not, one group will be\n    mostly above grand median and other below.\n\n## Mood's median test for reading data\n\n\n::: {.cell}\n\n:::\n\n\n-   Find overall median score:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nkids %>% summarize(med=median(score)) %>% pull(med) -> m\nm\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n[1] 47\n```\n:::\n:::\n\n\n-   Make table of above/below vs. group:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntab <- with(kids, table(group, score > m))\ntab\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n     \ngroup FALSE TRUE\n    c    15    8\n    t     7   14\n```\n:::\n:::\n\n\n-   Treatment group scores mostly above median, control group scores\n    mostly below, as expected.\n\n## The test\n\n-   Do chi-squared test:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nchisq.test(tab, correct=F)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n\n\tPearson's Chi-squared test\n\ndata:  tab\nX-squared = 4.4638, df = 1, p-value = 0.03462\n```\n:::\n:::\n\n\n-   This test actually two-sided (tests for any association).\n-   Here want to test that new reading method *better* (one-sided).\n-   Most of treatment children above overall median, so do 1-sided test\n    by halving P-value to get 0.017.\n-   This way too, children do better at learning to read using the new\n    method.\n\n## Or by smmr\n\n-   `median_test` does the whole thing:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nmedian_test(kids,score,group)\n```\n\n::: {.cell-output .cell-output-stdout}\n```\n$grand_median\n[1] 47\n\n$table\n     above\ngroup above below\n    c     8    15\n    t    14     7\n\n$test\n       what      value\n1 statistic 4.46376812\n2        df 1.00000000\n3   P-value 0.03462105\n```\n:::\n:::\n\n\n-   P-value again two-sided.\n\n## Comments\n\n-   P-value 0.013 for (1-sided) t-test, 0.017 for (1-sided) Mood median\n    test.\n-   Like the sign test, Mood's median test doesn't use the data very\n    efficiently (only, is each value above or below grand 
median).\n-   Thus, if we can justify doing *t*-test, we should do it. This is the\n    case here.\n-   The *t*-test will usually give smaller P-value because it uses the\n    data more efficiently.\n-   The time to use Mood's median test is if we are definitely unhappy\n    with the normality assumption (and thus the t-test P-value is not to\n    be trusted).\n",
     "supporting": [
       "inference_5a_files"
     ],