diff --git a/DESCRIPTION b/DESCRIPTION
index bc405a3..f6d71b7 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,7 +1,7 @@
 Package: rstatix
 Type: Package
 Title: Pipe-Friendly Framework for Basic Statistical Tests
-Version: 0.7.1.999
+Version: 0.7.2
 Authors@R: c(
     person("Alboukadel", "Kassambara", role = c("aut", "cre"), email = "alboukadel.kassambara@gmail.com"))
 Description: Provides a simple and intuitive pipe-friendly framework, coherent with the 'tidyverse' design philosophy,
@@ -43,7 +43,7 @@ Suggests:
     spelling
 URL: https://rpkgs.datanovia.com/rstatix/
 BugReports: https://github.com/kassambara/rstatix/issues
-RoxygenNote: 7.1.0
+RoxygenNote: 7.2.3
 Collate:
     'utilities.R'
     'add_significance.R'
diff --git a/NEWS.md b/NEWS.md
index 75fd124..3734879 100644
--- a/NEWS.md
+++ b/NEWS.md
@@ -1,12 +1,9 @@
-# rstatix 0.7.1.999
+# rstatix 0.7.2
 
-## New features
-
-## Major changes
-
 ## Minor changes
 
 - Required `tidyselect` versions is `>= 1.2.0`
+
 ## Bug fixes
 
 - `emmeans_test()`: restoring grouping variable class (`factor`) in the final results `emmeans_test()` (#169)
diff --git a/_pkgdown.yml b/_pkgdown.yml
index 0b73e5f..fe87fe9 100644
--- a/_pkgdown.yml
+++ b/_pkgdown.yml
@@ -77,7 +77,7 @@ reference:
   contents:
   - adjust_pvalue
   - add_significance
-  - p_value
+  - p_round
 - title: Extract Information From Statistical Tests
   contents:
   - get_test_label
diff --git a/cran-comments.md b/cran-comments.md
index 68d5861..93a731d 100644
--- a/cran-comments.md
+++ b/cran-comments.md
@@ -8,4 +8,4 @@ There were no ERRORs, WARNINGs or NOTEs.
 
 ## Update
 
-This is an updated version 0.7.1 (see NEWS.md).
+This is an updated version 0.7.2 (see NEWS.md).
diff --git a/docs/404.html b/docs/404.html
index 4170c41..4a549be 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -1,68 +1,27 @@
Alboukadel Kassambara. Author, maintainer.

Kassambara A (2023). rstatix: Pipe-Friendly Framework for Basic Statistical Tests. R package version 0.7.2, https://rpkgs.datanovia.com/rstatix/.

@Manual{,
  title = {rstatix: Pipe-Friendly Framework for Basic Statistical Tests},
  author = {Alboukadel Kassambara},
  year = {2023},
  note = {R package version 0.7.2},
  url = {https://rpkgs.datanovia.com/rstatix/},
}
Provides a simple and intuitive pipe-friendly framework, coherent with the ‘tidyverse’ design philosophy, for performing basic statistical tests, including t-test, Wilcoxon test, ANOVA, Kruskal-Wallis and correlation analyses.
The output of each test is automatically transformed into a tidy data frame to facilitate visualization.
Additional functions are available for reshaping, reordering, manipulating and visualizing correlation matrices. Functions are also included to facilitate the analysis of factorial experiments, including purely ‘within-Ss’ designs (repeated measures), purely ‘between-Ss’ designs, and mixed ‘within-and-between-Ss’ designs.
It’s also possible to compute several effect size metrics, including “eta squared” for ANOVA, “Cohen’s d” for t-test and “Cramer’s V” for the association between categorical variables. The package contains helper functions for identifying univariate and multivariate outliers, assessing normality and homogeneity of variances.
get_summary_stats()
: Compute summary statistics for one or multiple numeric variables. Can handle grouped data.shapiro_test()
and mshapiro_test()
: Univariate and multivariate Shapiro-Wilk normality test.
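For instance, a minimal sketch of calling these normality helpers on the built-in iris data (assuming rstatix and dplyr are loaded; output columns may differ slightly between versions):

# Univariate Shapiro-Wilk test, overall and by group
iris %>% shapiro_test(Sepal.Length)
iris %>% group_by(Species) %>% shapiro_test(Sepal.Length)
# Multivariate Shapiro-Wilk test on two numeric columns
iris %>% select(Sepal.Length, Sepal.Width) %>% mshapiro_test()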
t_test()
: perform one-sample, two-sample and pairwise t-testssign_test()
: perform sign test to determine whether there is a median difference between paired or matched observations.anova_test()
: an easy-to-use wrapper around car::Anova() to perform different types of ANOVA tests, including independent measures ANOVA, repeated measures ANOVA and mixed ANOVA.
get_anova_test_table()
: extract ANOVA table from anova_test()
results. Can apply sphericity correction automatically in the case of within-subject (repeated measures) designs.welch_anova_test()
: Welch one-Way ANOVA test. A pipe-friendly wrapper around the base function stats::oneway.test(). This is an alternative to the standard one-way ANOVA in the situation where the homogeneity of variance assumption is violated.
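As an illustration, a hedged sketch of the Welch test on ToothGrowth (dose converted to a factor; assuming the package is loaded):

df <- ToothGrowth
df$dose <- as.factor(df$dose)
# Welch one-way ANOVA, robust to unequal variances across dose groups
df %>% welch_anova_test(len ~ dose)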
kruskal_test()
: perform Kruskal-Wallis rank sum test
get_pvalue_position()
: autocompute p-value positions for plotting significance using ggplot2.factorial_design()
: build factorial design for easily computing ANOVA using the car::Anova() function. This might be very useful for repeated measures ANOVA, which is hard to set up with the car package.
anova_summary()
: Create beautiful summary tables of ANOVA test results obtained from either car::Anova() or stats::aov(). The results include ANOVA table, generalized effect size and some assumption checks, such as Mauchly’s test for sphericity in the case of repeated measures ANOVA.
tukey_hsd()
: performs Tukey post-hoc tests. Can handle different input formats: aov, lm, formula.
emmeans_test(): pipe-friendly wrapper around the emmeans function to perform pairwise comparisons of estimated marginal means. Useful for post-hoc analyses following up ANOVA/ANCOVA tests.
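A brief sketch of these post-hoc calls on ToothGrowth (dose as factor; emmeans_test() additionally requires the emmeans package to be installed):

df <- ToothGrowth
df$dose <- as.factor(df$dose)
# Tukey HSD directly from a formula
df %>% tukey_hsd(len ~ dose)
# Pairwise comparisons of estimated marginal means
df %>% emmeans_test(len ~ dose, p.adjust.method = "bonferroni")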
prop_test()
, pairwise_prop_test() and row_wise_prop_test(). Performs one-sample and two-samples z-test of proportions. Wrappers around the R base function prop.test() but have the advantage of performing pairwise and row-wise z-test of two proportions, the post-hoc tests following a significant chi-square test of homogeneity for 2xc and rx2 contingency tables.
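For example, a hedged sketch using a made-up 3x2 table of counts (the group and outcome labels below are purely illustrative):

xtab <- as.table(rbind(c(490, 10), c(400, 100), c(210, 190)))
dimnames(xtab) <- list(group = c("grp1", "grp2", "grp3"), outcome = c("yes", "no"))
# Overall test of equal proportions across the three groups
prop_test(xtab)
# Post-hoc pairwise z-tests of two proportions (p-values adjusted)
pairwise_prop_test(xtab)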
fisher_test()
, pairwise_fisher_test() and row_wise_fisher_test(): Fisher’s exact test for count data. Wrappers around the R base function fisher.test() but have the advantage of performing pairwise and row-wise fisher tests, the post-hoc tests following a significant chi-square test of homogeneity for 2xc and rx2 contingency tables.
chisq_test()
, pairwise_chisq_gof_test()
, pairwise_chisq_test_against_p()
: Performs chi-squared tests, including goodness-of-fit, homogeneity and independence tests.prop_trend_test()
: Performs chi-squared test for trend in proportion. This test is also known as Cochran-Armitage trend test.levene_test()
: Pipe-friendly framework to easily compute Levene’s test for homogeneity of variance across groups. Handles grouped data.box_m()
: Box’s M-test for homogeneity of covariance matrices
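A minimal sketch of the homogeneity-of-variance helpers (ToothGrowth for Levene’s test and iris for Box’s M; default arguments assumed):

# Levene's test for homogeneity of variance across supp groups
ToothGrowth %>% levene_test(len ~ supp)
# Box's M-test on the iris measurements, grouped by Species
box_m(iris[, 1:4], iris$Species)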
cohens_d()
: Compute Cohen’s d measure of effect size for t-tests.
cramer_v()
: Compute Cramer’s V, which measures the strength of the association between categorical variables.
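For instance, a hedged sketch (the 2x2 table below is made up for illustration only):

# Cohen's d for the difference in len between supp groups
ToothGrowth %>% cohens_d(len ~ supp)
# Cramer's V for a small illustrative contingency table
cramer_v(as.table(rbind(c(20, 10), c(5, 25))))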
Computing correlation:
cor_mark_significant()
: add significance levels to a correlation matrix.adjust_pvalue()
: add an adjusted p-values column to a data frame containing statistical test p-valuesp_round(), p_format(), p_mark_significant()
: rounding and formatting p-values
Extract information from statistical test results. Useful for labelling plots with test outputs.
create_test_label()
: Create labels from user specified test results.These functions are internally used in the rstatix
and in the ggpubr
R package to make it easy to program with tidyverse packages using non standard evaluation.
df_get_var_names()
: Returns user specified variable names. Supports standard and non standard evaluation.doo()
: alternative to dplyr::do for doing anything. Technically it uses nest() + mutate() + map()
to apply arbitrary computation to a grouped data frame.if(!require(devtools)) install.packages("devtools") -devtools::install_github("kassambara/rstatix")
install.packages("rstatix")
+if(!require(devtools)) install.packages("devtools")
+devtools::install_github("kassambara/rstatix")
+install.packages("rstatix")
+# Summary statistics of some selected variables
+#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
+iris %>%
+ get_summary_stats(Sepal.Length, Sepal.Width, type = "common")
+#> # A tibble: 2 x 10
+#> variable n min max median iqr mean sd se ci
+#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
+#> 1 Sepal.Length 150 4.3 7.9 5.8 1.3 5.84 0.828 0.068 0.134
+#> 2 Sepal.Width 150 2 4.4 3 0.5 3.06 0.436 0.036 0.07
+
+# Whole data frame
+#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
+iris %>% get_summary_stats(type = "common")
+#> # A tibble: 4 x 10
+#> variable n min max median iqr mean sd se ci
+#> <fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
+#> 1 Sepal.Length 150 4.3 7.9 5.8 1.3 5.84 0.828 0.068 0.134
+#> 2 Sepal.Width 150 2 4.4 3 0.5 3.06 0.436 0.036 0.07
+#> 3 Petal.Length 150 1 6.9 4.35 3.5 3.76 1.76 0.144 0.285
+#> 4 Petal.Width 150 0.1 2.5 1.3 1.5 1.20 0.762 0.062 0.123
+
+
+# Grouped data
+#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
+iris %>%
+ group_by(Species) %>%
+ get_summary_stats(Sepal.Length, type = "mean_sd")
+#> # A tibble: 3 x 5
+#> Species variable n mean sd
+#> <fct> <fct> <dbl> <dbl> <dbl>
+#> 1 setosa Sepal.Length 50 5.01 0.352
+#> 2 versicolor Sepal.Length 50 5.94 0.516
+#> 3 virginica Sepal.Length 50 6.59 0.636
To compare the means of two groups, you can use either the function t_test()
(parametric) or wilcox_test()
(non-parametric). In the following example the t-test will be illustrated.
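The non-parametric counterpart is called the same way; a hedged sketch (assuming df holds the ToothGrowth data used in the examples below):

# Wilcoxon rank-sum test comparing len by supp
df %>% wilcox_test(len ~ supp)
# One-sample Wilcoxon signed-rank test against mu = 0
df %>% wilcox_test(len ~ 1, mu = 0)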
The one-sample test is used to compare the mean of one sample to a known standard (or theoretical / hypothetical) mean (mu
).
+df %>% t_test(len ~ 1, mu = 0)
+#> # A tibble: 1 x 7
+#> .y. group1 group2 n statistic df p
+#> * <chr> <chr> <chr> <int> <dbl> <dbl> <dbl>
+#> 1 len 1 null model 60 19.1 59 6.94e-27
+# One-sample test of each dose level
+df %>%
+ group_by(dose) %>%
+ t_test(len ~ 1, mu = 0)
+#> # A tibble: 3 x 8
+#> dose .y. group1 group2 n statistic df p
+#> * <fct> <chr> <chr> <chr> <int> <dbl> <dbl> <dbl>
+#> 1 0.5 len 1 null model 20 10.5 19 2.24e- 9
+#> 2 1 len 1 null model 20 20.0 19 3.22e-14
+#> 3 2 len 1 null model 20 30.9 19 1.03e-17
+# T-test
+stat.test <- df %>%
+ t_test(len ~ supp, paired = FALSE)
+stat.test
+#> # A tibble: 1 x 8
+#> .y. group1 group2 n1 n2 statistic df p
+#> * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl>
+#> 1 len OJ VC 30 30 1.92 55.3 0.0606
+
+# Create a box plot
+p <- ggboxplot(
+ df, x = "supp", y = "len",
+ color = "supp", palette = "jco", ylim = c(0,40)
+ )
+# Add the p-value manually
+p + stat_pvalue_manual(stat.test, label = "p", y.position = 35)
+p +stat_pvalue_manual(stat.test, label = "T-test, p = {p}",
+ y.position = 36)
+# Statistical test
+stat.test <- df %>%
+ group_by(dose) %>%
+ t_test(len ~ supp) %>%
+ adjust_pvalue() %>%
+ add_significance("p.adj")
+stat.test
+#> # A tibble: 3 x 11
+#> dose .y. group1 group2 n1 n2 statistic df p p.adj
+#> <fct> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl>
+#> 1 0.5 len OJ VC 10 10 3.17 15.0 0.00636 0.0127
+#> 2 1 len OJ VC 10 10 4.03 15.4 0.00104 0.00312
+#> 3 2 len OJ VC 10 10 -0.0461 14.0 0.964 0.964
+#> # … with 1 more variable: p.adj.signif <chr>
+
+# Visualization
+ggboxplot(
+ df, x = "supp", y = "len",
+ color = "supp", palette = "jco", facet.by = "dose",
+ ylim = c(0, 40)
+ ) +
+ stat_pvalue_manual(stat.test, label = "p.adj", y.position = 35)
+# T-test
+stat.test <- df %>%
+ t_test(len ~ supp, paired = TRUE)
+stat.test
+#> # A tibble: 1 x 8
+#> .y. group1 group2 n1 n2 statistic df p
+#> * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl>
+#> 1 len OJ VC 30 30 3.30 29 0.00255
+
+# Box plot
+p <- ggpaired(
+ df, x = "supp", y = "len", color = "supp", palette = "jco",
+ line.color = "gray", line.size = 0.4, ylim = c(0, 40)
+ )
+p + stat_pvalue_manual(stat.test, label = "p", y.position = 36)
+# Pairwise t-test
+pairwise.test <- df %>% t_test(len ~ dose)
+pairwise.test
+#> # A tibble: 3 x 10
+#> .y. group1 group2 n1 n2 statistic df p p.adj p.adj.signif
+#> * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
+#> 1 len 0.5 1 20 20 -6.48 38.0 1.27e- 7 2.54e- 7 ****
+#> 2 len 0.5 2 20 20 -11.8 36.9 4.40e-14 1.32e-13 ****
+#> 3 len 1 2 20 20 -4.90 37.1 1.91e- 5 1.91e- 5 ****
+# Box plot
+ggboxplot(df, x = "dose", y = "len")+
+ stat_pvalue_manual(
+ pairwise.test, label = "p.adj",
+ y.position = c(29, 35, 39)
+ )
+# Comparison against reference group
+#::::::::::::::::::::::::::::::::::::::::
+# T-test: each level is compared to the ref group
+stat.test <- df %>% t_test(len ~ dose, ref.group = "0.5")
+stat.test
+#> # A tibble: 2 x 10
+#> .y. group1 group2 n1 n2 statistic df p p.adj p.adj.signif
+#> * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
+#> 1 len 0.5 1 20 20 -6.48 38.0 1.27e- 7 1.27e- 7 ****
+#> 2 len 0.5 2 20 20 -11.8 36.9 4.40e-14 8.80e-14 ****
+# Box plot
+ggboxplot(df, x = "dose", y = "len", ylim = c(0, 40)) +
+ stat_pvalue_manual(
+ stat.test, label = "p.adj.signif",
+ y.position = c(29, 35)
+ )
+# Remove bracket
+ggboxplot(df, x = "dose", y = "len", ylim = c(0, 40)) +
+ stat_pvalue_manual(
+ stat.test, label = "p.adj.signif",
+ y.position = c(29, 35),
+ remove.bracket = TRUE
+ )
+# T-test
+stat.test <- df %>% t_test(len ~ dose, ref.group = "all")
+stat.test
+#> # A tibble: 3 x 10
+#> .y. group1 group2 n1 n2 statistic df p p.adj p.adj.signif
+#> * <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
+#> 1 len all 0.5 60 20 5.82 56.4 2.90e-7 8.70e-7 ****
+#> 2 len all 1 60 20 -0.660 57.5 5.12e-1 5.12e-1 ns
+#> 3 len all 2 60 20 -5.61 66.5 4.25e-7 8.70e-7 ****
+# Box plot with horizontal mean line
+ggboxplot(df, x = "dose", y = "len") +
+ stat_pvalue_manual(
+ stat.test, label = "p.adj.signif",
+ y.position = 35,
+ remove.bracket = TRUE
+ ) +
+ geom_hline(yintercept = mean(df$len), linetype = 2)
+# One-way ANOVA test
+#:::::::::::::::::::::::::::::::::::::::::
+df %>% anova_test(len ~ dose)
+#> ANOVA Table (type II tests)
+#>
+#> Effect DFn DFd F p p<.05 ges
+#> 1 dose 2 57 67.416 9.53e-16 * 0.703
+
+# Two-way ANOVA test
+#:::::::::::::::::::::::::::::::::::::::::
+df %>% anova_test(len ~ supp*dose)
+#> ANOVA Table (type II tests)
+#>
+#> Effect DFn DFd F p p<.05 ges
+#> 1 supp 1 54 15.572 2.31e-04 * 0.224
+#> 2 dose 2 54 92.000 4.05e-18 * 0.773
+#> 3 supp:dose 2 54 4.107 2.20e-02 * 0.132
+
+# Two-way repeated measures ANOVA
+#:::::::::::::::::::::::::::::::::::::::::
+df$id <- rep(1:10, 6) # Add individuals id
+# Use formula
+# df %>% anova_test(len ~ supp*dose + Error(id/(supp*dose)))
+# or use character vector
+df %>% anova_test(dv = len, wid = id, within = c(supp, dose))
+#> ANOVA Table (type III tests)
+#>
+#> $ANOVA
+#> Effect DFn DFd F p p<.05 ges
+#> 1 supp 1 9 34.866 2.28e-04 * 0.224
+#> 2 dose 2 18 106.470 1.06e-10 * 0.773
+#> 3 supp:dose 2 18 2.534 1.07e-01 0.132
+#>
+#> $`Mauchly's Test for Sphericity`
+#> Effect W p p<.05
+#> 1 dose 0.807 0.425
+#> 2 supp:dose 0.934 0.761
+#>
+#> $`Sphericity Corrections`
+#> Effect GGe DF[GG] p[GG] p[GG]<.05 HFe DF[HF] p[HF]
+#> 1 dose 0.838 1.68, 15.09 2.79e-09 * 1.008 2.02, 18.15 1.06e-10
+#> 2 supp:dose 0.938 1.88, 16.88 1.12e-01 1.176 2.35, 21.17 1.07e-01
+#> p[HF]<.05
+#> 1 *
+#> 2
+
+# Use model as arguments
+#:::::::::::::::::::::::::::::::::::::::::
+.my.model <- lm(yield ~ block + N*P*K, npk)
+anova_test(.my.model)
+#> ANOVA Table (type II tests)
+#>
+#> Effect DFn DFd F p p<.05 ges
+#> 1 block 4 12 4.959 0.014 * 0.623
+#> 2 N 1 12 12.259 0.004 * 0.505
+#> 3 P 1 12 0.544 0.475 0.043
+#> 4 K 1 12 6.166 0.029 * 0.339
+#> 5 N:P 1 12 1.378 0.263 0.103
+#> 6 N:K 1 12 2.146 0.169 0.152
+#> 7 P:K 1 12 0.031 0.863 0.003
+#> 8 N:P:K 0 12 NA NA <NA> NA
+# Data preparation
+mydata <- mtcars %>%
+ select(mpg, disp, hp, drat, wt, qsec)
+head(mydata, 3)
+#> mpg disp hp drat wt qsec
+#> Mazda RX4 21.0 160 110 3.90 2.620 16.46
+#> Mazda RX4 Wag 21.0 160 110 3.90 2.875 17.02
+#> Datsun 710 22.8 108 93 3.85 2.320 18.61
+
+# Correlation test between two variables
+mydata %>% cor_test(wt, mpg, method = "pearson")
+#> # A tibble: 1 x 8
+#> var1 var2 cor statistic p conf.low conf.high method
+#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
+#> 1 wt mpg -0.87 -9.56 1.29e-10 -0.934 -0.744 Pearson
+
+# Correlation of one variable against all
+mydata %>% cor_test(mpg, method = "pearson")
+#> # A tibble: 5 x 8
+#> var1 var2 cor statistic p conf.low conf.high method
+#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
+#> 1 mpg disp -0.85 -8.75 9.38e-10 -0.923 -0.708 Pearson
+#> 2 mpg hp -0.78 -6.74 1.79e- 7 -0.885 -0.586 Pearson
+#> 3 mpg drat 0.68 5.10 1.78e- 5 0.436 0.832 Pearson
+#> 4 mpg wt -0.87 -9.56 1.29e-10 -0.934 -0.744 Pearson
+#> 5 mpg qsec 0.42 2.53 1.71e- 2 0.0820 0.670 Pearson
+
+# Pairwise correlation test between all variables
+mydata %>% cor_test(method = "pearson")
+#> # A tibble: 36 x 8
+#> var1 var2 cor statistic p conf.low conf.high method
+#> <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <chr>
+#> 1 mpg mpg 1 Inf 0. 1 1 Pearson
+#> 2 mpg disp -0.85 -8.75 9.38e-10 -0.923 -0.708 Pearson
+#> 3 mpg hp -0.78 -6.74 1.79e- 7 -0.885 -0.586 Pearson
+#> 4 mpg drat 0.68 5.10 1.78e- 5 0.436 0.832 Pearson
+#> 5 mpg wt -0.87 -9.56 1.29e-10 -0.934 -0.744 Pearson
+#> 6 mpg qsec 0.42 2.53 1.71e- 2 0.0820 0.670 Pearson
+#> 7 disp mpg -0.85 -8.75 9.38e-10 -0.923 -0.708 Pearson
+#> 8 disp disp 1 Inf 0. 1 1 Pearson
+#> 9 disp hp 0.79 7.08 7.14e- 8 0.611 0.893 Pearson
+#> 10 disp drat -0.71 -5.53 5.28e- 6 -0.849 -0.481 Pearson
+#> # … with 26 more rows
+# Compute correlation matrix
+#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
+cor.mat <- mydata %>% cor_mat()
+cor.mat
+#> # A tibble: 6 x 7
+#> rowname mpg disp hp drat wt qsec
+#> * <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
+#> 1 mpg 1 -0.85 -0.78 0.68 -0.87 0.42
+#> 2 disp -0.85 1 0.79 -0.71 0.89 -0.43
+#> 3 hp -0.78 0.79 1 -0.45 0.66 -0.71
+#> 4 drat 0.68 -0.71 -0.45 1 -0.71 0.091
+#> 5 wt -0.87 0.89 0.66 -0.71 1 -0.17
+#> 6 qsec 0.42 -0.43 -0.71 0.091 -0.17 1
+
+# Show the significance levels
+#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
+cor.mat %>% cor_get_pval()
+#> # A tibble: 6 x 7
+#> rowname mpg disp hp drat wt qsec
+#> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
+#> 1 mpg 0. 9.38e-10 0.000000179 0.0000178 1.29e- 10 0.0171
+#> 2 disp 9.38e-10 0. 0.0000000714 0.00000528 1.22e- 11 0.0131
+#> 3 hp 1.79e- 7 7.14e- 8 0 0.00999 4.15e- 5 0.00000577
+#> 4 drat 1.78e- 5 5.28e- 6 0.00999 0 4.78e- 6 0.62
+#> 5 wt 1.29e-10 1.22e-11 0.0000415 0.00000478 2.27e-236 0.339
+#> 6 qsec 1.71e- 2 1.31e- 2 0.00000577 0.62 3.39e- 1 0
+
+# Replacing correlation coefficients by symbols
+#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
+cor.mat %>%
+ cor_as_symbols() %>%
+ pull_lower_triangle()
+#> rowname mpg disp hp drat wt qsec
+#> 1 mpg
+#> 2 disp *
+#> 3 hp * *
+#> 4 drat + + .
+#> 5 wt * * + +
+#> 6 qsec . . +
+
+# Mark significant correlations
+#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
+cor.mat %>%
+ cor_mark_significant()
+#> rowname mpg disp hp drat wt qsec
+#> 1 mpg
+#> 2 disp -0.85****
+#> 3 hp -0.78**** 0.79****
+#> 4 drat 0.68**** -0.71**** -0.45**
+#> 5 wt -0.87**** 0.89**** 0.66**** -0.71****
+#> 6 qsec 0.42* -0.43* -0.71**** 0.091 -0.17
+
+
+# Draw correlogram using R base plot
+#::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
+cor.mat %>%
+ cor_reorder() %>%
+ pull_lower_triangle() %>%
+ cor_plot()
Developed by Alboukadel Kassambara.
emmeans_test()
: restoring grouping variable class (factor
) in the final results emmeans_test()
(#169)emmeans_test()
: “Use of .data in tidyselect expressions was deprecated in tidyselect 1.2.0.”cor_plot()
now accepts additional arguments to pass to corrplot() (#66)car::Anova()
.get_comparisons()
now drops unused levels before creating possible comparisons (#67)get_summary_stats()
keeps the order of columns specified by the user (#46).two_sample_test()
now counts group sizes (n1
and n2
) by the number of non-NA
values #104
shapiro_test()
function. Shapiro_test() throws an error if the input data contains column names “value” or “variable”. This is fixed now (#52).cor_test()
function, where there was a tidy evaluation conflict when the input data contains “x” and “y” as column names (#68).dunn_test()
documentation is updated to describe the discrepancy between the default behavior of the rstatix::dunn_test()
compared to other packages (dunn.test
and jamovi
). The default of the rstatix::dunn_test() function is to perform a two-sided Dunn test like well-known commercial software, such as SPSS and GraphPad. This is not the case for some other R packages (dunn.test and jamovi), where the default is to perform a one-sided test (#50).
get_summary_stats()
handles the user defined probabilities for grouped data (#78)get_n()
to extract sample count (n) from statistical test results. - get_description
to extract stat test description or name - remove_ns()
to remove non-significant rows.add_x_position()
to better support different situations (#73).dunn_test()
include estimate1
and estimate2
when the argument detailed = TRUE
is specified. The estimate1
and estimate2
values represent the mean rank values of the two groups being compared, respectively (#59).cor_spread()
doc updated, error is explicitly shown if the input data doesn’t contain the columns “var1”, “var2” and “cor” (#95)emmeans_test()
and levene_test()
to adapt to broom release 0.7.4 (#89)anova_test()
is updated to explain the internal contrast setting (#74).p_mark_significance()
works when all p-values are NA. Empty character ("") is returned for NA (#64).rstatix
and grouped_anova_test
) added to grouped ANOVA test (#61)scales
added in the function get_y_position()
. If the specified value is “free” or “free_y”, then the step increase of y positions will be calculated by plot panels. Note that, using “free” or “free_y” gives the same result. A global step increase is computed when scales = “fixed” (#56).anova_test()
computes now repeated measures ANOVA without error when unused columns are present in the input data frame (#55)stack
added in get_y_position()
to compute p-values y position for stacked bar plots (#48).wilcox_test()
: Now, if detailed = TRUE
, an estimate of the location parameter (Only present if argument detailed = TRUE). This corresponds to the pseudomedian (for one-sample case) or to the difference of the location parameter (for two-samples case) (#45).anova_test()
function: Changing R default contrast setting (contr.treatment
) into orthogonal contrasts (contr.sum
) to have comparable results to SPSS when users define the model using formula (@benediktclaus, #40).type = "quantile"
of get_summary_stats()
works properly (@Boyoron, #39).rstatix
and the ggpubr
package and makes it easy to program with tidyverse packages using non standard evaluation. - df_select - df_arrange - df_group_by - df_nest_by - df_split_by - df_unite - df_get_var_names - df_label_both - df_label_valuefreq_table()
the option na.rm
removes only missing values in the variables used to create the frequency table (@JuhlinF, #25).anova_test()
(@benediktclaus, #31)games_howell_test()
function : the t-statistic is now calculated using the absolute mean difference between groups (@GegznaV, #37).cohens_d()
function now supports Hedge’s correction. New argument hedge.correction
added . logical indicating whether apply the Hedges correction by multiplying the usual value of Cohen’s d by (N-3)/(N-2.25)
(for unpaired t-test) and by (n1-2)/(n1-1.25)
for paired t-test; where N is the total size of the two groups being compared (N = n1 + n2) (@IndrajeetPatil, #9).cohens_d()
outputs values with directionality. The absolute value is no longer returned. It can now be positive or negative depending on the data (@narunpat, #9).mu
is now considered when calculating cohens_d()
for one sample t-test (@mllewis, #22).tukey_hsd()
now handles situation where minus -
symbols are present in factor levels (@IndrajeetPatil, #19).identify_outliers
returns a basic data frame instead of tibble when nrow = 0 (for nice printing)detailed
added in dunn_test()
. If TRUE, then estimate and method columns are shown in the results.prop_test()
, pairwise_prop_test()
and row_wise_prop_test()
. Performs one-sample and two-samples z-test of proportions. Wrappers around the R base function prop.test()
but have the advantage of performing pairwise and row-wise z-test of two proportions, the post-hoc tests following a significant chi-square test of homogeneity for 2xc and rx2 contingency tables.prop_test()
, pairwise_prop_test()
and row_wise_prop_test()
. Performs one-sample and two-samples z-test of proportions. Wrappers around the R base function prop.test()
but have the advantage of performing pairwise and row-wise z-test of two proportions, the post-hoc tests following a significant chi-square test of homogeneity for 2xc and rx2 contingency tables.fisher_test()
, pairwise_fisher_test()
and row_wise_fisher_test()
: Fisher’s exact test for count data. Wrappers around the R base function fisher.test()
but have the advantage of performing pairwise and row-wise fisher tests, the post-hoc tests following a significant chi-square test of homogeneity for 2xc and rx2 contingency tables.fisher_test()
, pairwise_fisher_test()
and row_wise_fisher_test()
: Fisher’s exact test for count data. Wrappers around the R base function fisher.test()
but have the advantage of performing pairwise and row-wise fisher tests, the post-hoc tests following a significant chi-square test of homogeneity for 2xc and rx2 contingency tables.
chisq_test()
, pairwise_chisq_gof_test()
, pairwise_chisq_test_against_p()
: Chi-square test for count data.mcnemar_test()
and cochran_qtest()
for comparing two ore more related proportions.prop_trend_test()
: Performs chi-squared test for trend in proportion. This test is also known as Cochran-Armitage trend test.get_test_label()
and get_pwc_label()
return expression by defaultget_anova_table()
supports now an object of class grouped_anova_test
correction = "none"
for repeated measures ANOVANAs
are now automatically removed before quantile computation for identifying outliers (@IndrajeetPatil, #10).
set_ref_level()
, reorder_levels()
and make_valid_levels()
model
added in the function emmeans_test()
welch_anova_test()
: Welch one-Way ANOVA test. A wrapper around the base function stats::oneway.test()
. This is is an alternative to the standard one-way ANOVA in the situation where the homogeneity of variance assumption is violated.friedman_effsize()
, computes the effect size of Friedman test using the Kendall’s W value.friedman_test()
, provides a pipe-friendly framework to perform a Friedman rank sum test, which is the non-parametric alternative to the one-way repeated measures ANOVA test.games_howell_test()
: Performs Games-Howell test, which is used to compare all possible combinations of group differences when the assumption of homogeneity of variances is violated.get_anova_label()
emmeans_test()
added for pairwise comparisons of estimated marginal means.comparison
removed from tukey_hsd()
results (breaking change).n
(sample count) added to statistical tests results: t_test()
, wilcox_test()
, sign_test()
, dunn_test()
and kruskal_test()
(@ShixiangWang, #4).rstatix_test
class added to anova_test()
resultskruskal_test()
is now an object of class rstatix_test
that has an attribute named args for holding the test arguments.tukey_hsd()
results.adjust_pvalue()
now supports grouped dataget_pvalue_position
added to autocompute p-value positions for plotting significance using ggplot2.get_comparisons()
added to create a list of possible pairwise comparisons between groups.dunn_test()
added for multiple pairwise comparisons following Kruskal-Wallis test.sign_test()
added.get_summary_stats()
now supports type = “min”, “max”, “mean” or “median”t_test()
, wilcox_test()
, dunn_test()
and sign_test()
are now an object of class rstatix_test
that has an attribute named args for holding the test arguments.cohens_d()
is now a data frame containing the Cohen’s d and the magnitude.R/add_significance.R
+ Source: R/add_significance.R
add_significance.Rd
Add p-value significance symbols into a data frame.
add_significance(
  data,
  p.col = NULL,
  output.col = NULL,
  cutpoints = c(0, 1e-04, 0.001, 0.01, 0.05, 1),
  symbols = c("****", "***", "**", "*", "ns")
)
data: a data frame containing a p-value column.
p.col: column name containing p-values.
output.col: the output column name to hold the adjusted p-values.
cutpoints: numeric vector used for intervals.
symbols: character vector, one shorter than cutpoints, used as significance symbols.
a data frame
+# Perform pairwise comparisons and adjust p-values
+ToothGrowth %>%
+ t_test(len ~ dose) %>%
+ adjust_pvalue() %>%
+ add_significance("p.adj")
+#> # A tibble: 3 × 10
+#> .y. group1 group2 n1 n2 statistic df p p.adj p.adj.signif
+#> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
+#> 1 len 0.5 1 20 20 -6.48 38.0 1.27e- 7 2.54e- 7 ****
+#> 2 len 0.5 2 20 20 -11.8 36.9 4.4 e-14 1.32e-13 ****
+#> 3 len 1 2 20 20 -4.90 37.1 1.91e- 5 1.91e- 5 ****
+
+
Source: R/adjust_pvalue.R
adjust_pvalue.Rd
adjust_pvalue(data, p.col = NULL, output.col = NULL, method = "holm")
data: a data frame containing a p-value column
p.col: column name containing p-values
output.col: the output column name to hold the adjusted p-values
method: method for adjusting p values (see stats::p.adjust()). If you don't want to adjust the p value (not recommended), use p.adjust.method = "none".
a data frame
+# Perform pairwise comparisons and adjust p-values
+ToothGrowth %>%
+ t_test(len ~ dose) %>%
+ adjust_pvalue()
+#> # A tibble: 3 × 10
+#> .y. group1 group2 n1 n2 statistic df p p.adj p.adj.signif
+#> <chr> <chr> <chr> <int> <int> <dbl> <dbl> <dbl> <dbl> <chr>
+#> 1 len 0.5 1 20 20 -6.48 38.0 1.27e- 7 2.54e- 7 ****
+#> 2 len 0.5 2 20 20 -11.8 36.9 4.4 e-14 1.32e-13 ****
+#> 3 len 1 2 20 20 -4.90 37.1 1.91e- 5 1.91e- 5 ****
+
+
Source: R/anova_summary.R
anova_summary.Rd
Create beautiful summary tables of ANOVA test results obtained from either Anova() or aov().
The results include ANOVA table, generalized effect size and some assumption checks.
anova_summary(object, effect.size = "ges", detailed = FALSE, observed = NULL)
object
effect.size: the effect size to compute and to show in the ANOVA results. Allowed values can be either "ges" (generalized eta squared) or "pes" (partial eta squared) or both. Default is "ges".
detailed: If TRUE, returns extra information (sums of squares columns, intercept row, etc.) in the ANOVA table.
observed: Variables that are observed (i.e., measured) as compared to experimentally manipulated. The default effect size reported (generalized eta-squared) requires correct specification of the observed variables.
return an object of class anova_test: a data frame containing the ANOVA table for independent measures ANOVA. However, for repeated/mixed measures ANOVA, it is a list containing the following components:
ANOVA: a data frame containing ANOVA results
Mauchly's Test for Sphericity: If any within-Ss variables with more than 2 levels are present, a data frame containing the results of Mauchly's test for Sphericity. Only reported for effects that have more than 2 levels
@@ -172,16 +103,16 @@
Sphericity Corrections: If any within-Ss variables are present, a data frame containing the Greenhouse-Geisser and Huynh-Feldt epsilon values, and corresponding corrected p-values.
The returned object might have an attribute called args if you compute ANOVA using the function anova_test(). The attribute args is a
list holding the arguments used to fit the ANOVA model, including: data, dv,
within, between, type, model, etc.
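A small sketch of inspecting that attribute (assuming the result comes from anova_test() and the attribute is read with base attr()):

df <- ToothGrowth
df$dose <- as.factor(df$dose)
res.aov <- anova_test(df, len ~ dose)
# Arguments used to fit the model, e.g. data, dv, between, type, model
names(attr(res.aov, "args"))
get_anova_table(res.aov)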
The following abbreviations are used in the different results tables:
DFn Degrees of Freedom in the numerator (i.e. DF effect).
DFd Degrees of Freedom in the denominator (i.e., DF error).
SSn Sum of Squares in the numerator (i.e., SS effect).
SSd Sum of @@ -203,103 +134,117 @@
p[HFe]<.05 Highlights p-values (after correction using Huynh-Feldt epsilon) less than the traditional alpha level of .05.
W Mauchly's W statistic
Author
+Alboukadel Kassambara, alboukadel.kassambara@gmail.com
+
# Load data
+#:::::::::::::::::::::::::::::::::::::::
+data("ToothGrowth")
+df <- ToothGrowth
+df$dose <- as.factor(df$dose)
+
+# Independent measures ANOVA
+#:::::::::::::::::::::::::::::::::::::::::
+# Compute ANOVA and display the summary
+res.anova <- Anova(lm(len ~ dose*supp, data = df))
+anova_summary(res.anova)
+#> Effect DFn DFd F p p<.05 ges
+#> 1 dose 2 54 92.000 4.05e-18 * 0.773
+#> 2 supp 1 54 15.572 2.31e-04 * 0.224
+#> 3 dose:supp 2 54 4.107 2.20e-02 * 0.132
+
+# Display both SSn and SSd using detailed = TRUE
+# Show generalized eta squared using effect.size = "ges"
+anova_summary(res.anova, detailed = TRUE, effect.size = "ges")
+#> Effect SSn SSd DFn DFd F p p<.05 ges
+#> 1 dose 2426.434 712.106 2 54 92.000 4.05e-18 * 0.773
+#> 2 supp 205.350 712.106 1 54 15.572 2.31e-04 * 0.224
+#> 3 dose:supp 108.319 712.106 2 54 4.107 2.20e-02 * 0.132
+
+# Show partial eta squared using effect.size = "pes"
+anova_summary(res.anova, detailed = TRUE, effect.size = "pes")
+#> Effect SSn SSd DFn DFd F p p<.05 pes
+#> 1 dose 2426.434 712.106 2 54 92.000 4.05e-18 * 0.773
+#> 2 supp 205.350 712.106 1 54 15.572 2.31e-04 * 0.224
+#> 3 dose:supp 108.319 712.106 2 54 4.107 2.20e-02 * 0.132
+
+# Repeated measures designs using car::Anova()
+#:::::::::::::::::::::::::::::::::::::::::
+# Prepare the data
+df$id <- as.factor(rep(1:10, 6)) # Add individuals ids
+head(df)
+#> len supp dose id
+#> 1 4.2 VC 0.5 1
+#> 2 11.5 VC 0.5 2
+#> 3 7.3 VC 0.5 3
+#> 4 5.8 VC 0.5 4
+#> 5 6.4 VC 0.5 5
+#> 6 10.0 VC 0.5 6
+
+# Easily perform repeated measures ANOVA using the car package
+design <- factorial_design(df, dv = len, wid = id, within = c(supp, dose))
+res.anova <- Anova(design$model, idata = design$idata, idesign = design$idesign, type = 3)
+anova_summary(res.anova)
+#> $ANOVA
+#> Effect DFn DFd F p p<.05 ges
+#> 1 supp 1 9 34.866 2.28e-04 * 0.224
+#> 2 dose 2 18 106.470 1.06e-10 * 0.773
+#> 3 supp:dose 2 18 2.534 1.07e-01 0.132
+#>
+#> $`Mauchly's Test for Sphericity`
+#> Effect W p p<.05
+#> 1 dose 0.807 0.425
+#> 2 supp:dose 0.934 0.761
+#>
+#> $`Sphericity Corrections`
+#> Effect GGe DF[GG] p[GG] p[GG]<.05 HFe DF[HF] p[HF]
+#> 1 dose 0.838 1.68, 15.09 2.79e-09 * 1.008 2.02, 18.15 1.06e-10
+#> 2 supp:dose 0.938 1.88, 16.88 1.12e-01 1.176 2.35, 21.17 1.07e-01
+#> p[HF]<.05
+#> 1 *
+#> 2
+#>
+
+# Repeated measures designs using stats::Aov()
+#:::::::::::::::::::::::::::::::::::::::::
+res.anova <- aov(len ~ dose*supp + Error(id/(supp*dose)), data = df)
+anova_summary(res.anova)
+#> Effect DFn DFd F p p<.05 ges
+#> 1 supp 1 9 34.866 2.28e-04 * 0.242
+#> 2 dose 2 18 106.470 1.06e-10 * 0.791
+#> 3 dose:supp 2 18 2.534 1.07e-01 0.144
+