Skip to content

Latest commit

 

History

History
2518 lines (2368 loc) · 89.4 KB

README.md

File metadata and controls

2518 lines (2368 loc) · 89.4 KB

xeredar

R-CMD-check R-CMD-check

Background

The package xeredar is an R-package for analysis of the New Approach Methodology (NAM) assays of XETA (Xenopus Eleutheroembryonic Thyroid), RADAR (Rapid Androgen Disruption Activity Reporter) and REACTIV (Rapid Estrogen ACTivity In Vivo) for assessing endocrine effects of chemicals on the thyroid, androgen/steroid and estrogen axis. The functionality is based on the SAS-script recommended in the Annex 13 of OECD test guideline No. 248 of the XETA assay (2019), written by John Green.

Installation

You can install xeredar using one of the followig commands:

devtools::install_github("basf/xeredar")

pak::pkg_install("basf/xeredar")

Data requirements

Data frames that are supposed to be analyzed with xeredar needs to fulfill certain requirements. The data frame or tibble needs to contain the following column headers:

knitr::kable(head(xeredar::testDataSpiked))
Replicate Treatment Row Fluor Conc
11 1 0 + T3 4 19.768 0
12 1 0 + T3 4 27.928 0
13 1 0 + T3 4 29.592 0
14 1 0 + T3 4 22.816 0
15 1 0 + T3 4 26.080 0
16 1 0 + T3 4 25.332 0

The type of each column should be accordingly:

knitr::kable(purrr::map_df(xeredar::testDataSpiked, class))
Replicate Treatment Row Fluor Conc
factor character character numeric ordered
factor character character numeric factor

Replicate (i.e. run), Treatment (i.e. a unique name for each treatment level in either spiked or unspiked mode) and Row (i.e. exposure vessel) can either be factor or character columns, but Fluor (i.e. measured fluorescence) must always be numeric and Conc (i.e. concentration of test item) must always contain ordered factors. The order of the columns is not relevant.

Replicate, Treatment and Row can either be factor or character columns, but Fluor must always be numeric and Conc must always contain ordered factors. The order of the columns is not relevant. It is important that the decimal separator is a period instead of a comma.

When simply aiming to use the data_prep() function, the data frame needs to either contain spiked treatments or unspiked treatments. When still having spiked and unspiked treatments in one data frame they should be separated. Imagine you have a XETA data frame (dat) which contains spiked and unspiked treatments as well as the T4 positive control. The T3 or T4 additions are designated by “T3” and “T4” in the Treatment column. The spiked and unspiked datasets could quickly be subset using the following code:

datSpiked <- dat[grepl("T3",dat$Treatment),] # spiked data 

datUnspiked <- setdiff(dat,datSpiked)
datUnspiked <- datUnspiked[which(datUnspiked$Treatment != "T4"),] # unspiked data

Running default XETA analysis

To demonstrate how to run XETA analysis, we will use one of the data sets by the French lab containing the spiked and unspiked measurements from the XETA ring test as included in OECD test guideline No. 248 of the XETA assay (2019).

xeta_spiked <- xeredar::valid_data_xeta[["ptu_france_spiked"]]
xeta_unspiked <- xeredar::valid_data_xeta[["ptu_france_unspiked"]]

The default XETA analysis can be run using the data_prep() function with either spiked or unspiked data. This function automatically decides whether trimming, outlier removal and/or transformations are conducted following the manuscript (Spyridonov et al. unpublished). The actual analysis is carried out by the ana() function. The ana() function is called by the data_prep() function and does not need to be called separately. For this dataset, the exposure well ID (Row of the 96 well plate) is not recorded, therefore, we set the row argument to FALSE. In this case, we use the reduced mixed ANOVA model where the exposure well ID is not included as a random effect. Please specify row=TRUE if the exposure well ID is recorded and you want to use the full mixed ANOVA model.

xeta_spiked_result <- xeredar::data_prep(dataframe = xeta_spiked, row = FALSE)
xeta_unspiked_result <- xeredar::data_prep(dataframe = xeta_unspiked, row= FALSE)

Here we use the spike data as an example to demonstrate the output of the data_prep() function.

The outputs of the data_prep() function are lists containing the following elements:

  • A reasoning for the recommended transformation and trimming. The raw data should be used for the analysis because the residuals of the mixed ANOVA are normally distributed and show homogeneous variances among treatment groups.
xeta_spiked_result$Justify
#> [1] "The raw data should be used for the analysis because\n                  the residuals of the mixed ANOVA are normally\n                  distributed and show homogeneous variances\n                  among treatment groups."
  • A data frame of the processed data (e.g. raw data, trimmed, transformed, or outlier removed) used for actual statistical testing following the reasoning. The box plots of the processed data per run/replicate (i.e. each panel represents each run/replicate) are also provided for visual inspection.
knitr::kable(head(xeta_spiked_result$ProcessedData))
Replicate Treatment Fluor Conc Country Substance Spiked
1 FETAXT3 4989.533 0 france ptu TRUE
1 FETAXT3 5002.533 0 france ptu TRUE
1 FETAXT3 6331.533 0 france ptu TRUE
1 FETAXT3 4645.533 0 france ptu TRUE
1 FETAXT3 4977.533 0 france ptu TRUE
1 FETAXT3 6229.533 0 france ptu TRUE
xeta_spiked_result$BoxPlots

  • Summary tables of the processed data (per replicate and overall)
knitr::kable(xeta_spiked_result$SummaryDF_Rep, caption="Summary statistics of fluorescence in different concentrations of test item per replicate")
Summary statistics of fluorescence in different concentrations of test item per replicate
Conc Replicate N Mean Standard deviation Coefficient of variation
0 1 19 4651.060 944.7420 0.2031240
0 2 20 4024.300 603.6731 0.1500070
0 3 19 4952.225 1179.3597 0.2381475
1 1 19 3854.137 828.3944 0.2149364
1 2 20 4557.867 558.4921 0.1225337
1 3 19 4720.819 619.8992 0.1313118
3 1 19 4671.428 976.9674 0.2091368
3 2 19 4486.538 783.8641 0.1747147
3 3 20 5421.433 703.9490 0.1298456
10 1 20 4573.083 946.2029 0.2069070
10 2 19 4329.538 827.4898 0.1911266
10 3 19 5106.849 876.5940 0.1716507
30 1 20 5391.183 809.5849 0.1501683
30 2 19 4556.275 694.1698 0.1523547
30 3 19 5164.691 763.8837 0.1479050
100 1 19 5181.060 858.5454 0.1657085
100 2 19 5124.381 733.4611 0.1431317
100 3 20 6048.083 959.5900 0.1586602
knitr::kable(xeta_spiked_result$SummaryDF, caption="Summary statistics of fluorescence in different concentrations of test item of all replicates")
Summary statistics of fluorescence in different concentrations of test item of all replicates
Conc N Mean Standard deviation Coefficient of variation
0 58 4533.593 998.2901 0.2201984
1 58 4380.715 764.2147 0.1744498
3 58 4869.483 910.7568 0.1870336
10 58 4668.156 928.9066 0.1989879
30 58 5043.483 825.4425 0.1636652
100 58 5461.466 945.7370 0.1731654
  • Tables of results evaluated using increasing/decreasing Williams test and/or Dunnett’s test, if applicable.

In the the Williams’ test result tables, Y.Tilde is the amalgamated mean of the fluorescence in each treatment group, Y0 is the mean of the control fluorescence, DIFF is the estimated difference between the treatment and the control, SE_DIFF is the standard error of the Williams’ test, DF is the degrees of freedom for Williams’ test, WILL Incr or Will Decr is the Williams’ test statistic, crit Val is the critical value of Williams distribution, Sign suggests if there is significant difference between the treatment and the control, and %Incr is the percent increase of the fluorescence compared to the control.

In the Dunnett’s test result table, Estimate is the estimated difference between the treatment and the control, SE is the standard error of the mixed ANOVA model, t value is the Dunnett’s test statistic, adj p is the adjusted p value and %Incr is the percent increase of the fluorescence compared to the control.

knitr::kable(xeta_spiked_result$WilliamsIncrease, caption="Increasing Williams' test")
Increasing Williams’ test
Conc Y.Tilde Y0 DIFF Incr SE_DIFF DF WILL Incr crit Val Sign % Incr
Conc100 - Conc0 100 5461.47 4533.593 927.8769 249.7868 10 3.7146758 1.971 TRUE 20.466613
Conc30 - Conc0 30 5043.48 4533.593 509.8869 249.7868 10 2.0412886 1.965 TRUE 11.246933
Conc10 - Conc0 10 4768.82 4533.593 235.2269 249.7868 10 0.9417108 1.956 FALSE 2.968123
Conc3 - Conc0 3 4768.82 4533.593 235.2269 249.7868 10 0.9417108 1.940 FALSE 7.408918
Conc1 - Conc0 1 4380.72 4533.593 -152.8731 249.7845 10 -0.6120201 1.908 FALSE -3.372111
knitr::kable(xeta_spiked_result$WilliamsDecrease, caption="Decreasing Williams' test")
Decreasing Williams’ test
Conc Y.Tilde Y0 DIFF Decr SE_DIFF DF WILL Decr crit Val Sign % Incr
Conc100 - Conc0 100 4884.66 4533.593 -351.0669 249.7868 10 -1.405466 1.971 FALSE 20.466613
Conc30 - Conc0 30 4884.66 4533.593 -351.0669 249.7868 10 -1.405466 1.965 FALSE 11.246933
Conc10 - Conc0 10 4884.66 4533.593 -351.0669 249.7868 10 -1.405466 1.956 FALSE 2.968123
Conc3 - Conc0 3 4884.66 4533.593 -351.0669 249.7868 10 -1.405466 1.940 FALSE 7.408918
Conc1 - Conc0 1 4884.66 4533.593 -351.0669 249.7845 10 -1.405479 1.908 FALSE -3.372111
knitr::kable(xeta_spiked_result$Dunnetts, caption="Dunnett's test")
Dunnett’s test
Estimate SE df t value adj p % Incr
Conc1 - Conc0 -160.4067 249.7845 10 -0.6421806 0.9458404 -3.372111
Conc3 - Conc0 320.1063 249.7868 10 1.2815181 0.5970718 7.408918
Conc10 - Conc0 128.8281 249.7868 10 0.5157522 0.9769722 2.968123
Conc30 - Conc0 499.2993 249.7868 10 1.9989021 0.2367993 11.246933
Conc100 - Conc0 911.7088 249.7868 10 3.6499482 0.0174131 20.466613
  • Further information about the normality test (Shapiro-Wilk), the homogeneity of variance test (Levene’s test) of residuals of the mixed ANOVA model, the monotonicity test and the model fit.
xeta_spiked_result$NormalityTest
#> 
#>  Shapiro-Wilk normality test
#> 
#> data:  stats::resid(mixedaov)
#> W = 0.99186, p-value = 0.05343
xeta_spiked_result$LeveneTest
#> # A tibble: 1 × 4
#>   statistic p.value    df df.residual
#>       <dbl>   <dbl> <int>       <int>
#> 1     0.759   0.580     5         342
xeta_spiked_result$`Monotonicity Test`
#>        Test t value Pr(>|t|) Significance
#> 1    Linear    6.49  <0.0001          ***
#> 2 Quadratic    2.14   0.0335            *
xeta_spiked_result$MixedAnova
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: 
#> Fluor ~ Conc + (1 | Replicate) + (1 | Replicate:Conc)
#>    Data: dataframe
#> REML criterion at convergence: 5608.294
#> Random effects:
#>  Groups         Name        Std.Dev.
#>  Replicate:Conc (Intercept) 241.1   
#>  Replicate      (Intercept) 350.5   
#>  Residual                   827.7   
#> Number of obs: 348, groups:  
#> Replicate:Conc, 18; Replicate, 3
#> Fixed Effects:
#> (Intercept)       Conc.L       Conc.Q       Conc.C  
#>      4824.2        758.5        264.5         52.6  
#>      Conc^4       Conc^5  
#>       149.8       -270.8

The list output from running the data_prep() function can be summarized with the data_summary() function.

xeredar::data_summary(xeta_spiked_result) |> knitr::kable()
1 3 10 30 100
Replicate 1 -17.13 0.44 -1.68 15.91 11.4
Replicate 2 13.26 11.49 7.58 13.22 27.34
Replicate 3 -4.67 9.47 3.12 4.29 22.13
Pooled -3.37 7.41 2.97 11.25 20.47
Dunnett ns ns ns ns *
IncreasingWilliams ns ns ns * *
DecreasingWilliams ns ns ns ns ns

Running default RADAR analysis

To demonstrate how to run RADAR analysis, we will use one of the data sets by the Pos_mDHT_Fraunhofer_RADAR containing the spiked and unspiked measurements from the RADAR study validation in the lab Fraunhofer with an androgen axis active chemical.

radar_spiked <- xeredar::RADAR_valid_data_table_spiked_unspiked[["mDHTFRAUNH_Spiked"]]
radar_unspiked <-  xeredar::RADAR_valid_data_table_spiked_unspiked[["mDHTFRAUNH_Unspiked"]]

The default radar analysis can be run using the data_prep() function with either spiked or unspiked data. This function automatically decides whether trimming, outlier removal and/or transformations are conducted following the manuscript (Spyridonov et al. unpublished). The actual analysis is carried out by the ana() function. The ana() function is called by the data_prep() function and does not need to be called separately. the analysis the RADAR assay follows the description of the Method 2 (the mixed ANOVA approach) in the Annex 8: methods for the statistical analysis of RADAR assay data of the OECD TG 251 (2022). For this dataset, the exposure well ID (Row of the 96 well plate) is not recorded, therefore, we set the row argument to FALSE. In this case, we use the reduced mixed ANOVA model where the exposure well ID is not included as a random effect. Please specify row=TRUE if the exposure well ID is recorded and you want to use the full mixed ANOVA model. Trimming is not required.

radar_spiked_result <- xeredar::data_prep(dataframe = radar_spiked, row = FALSE, trimming=FALSE)
radar_unspiked_result <- xeredar::data_prep(dataframe = radar_unspiked, row= FALSE, trimming=FALSE)

Here we use the spike data as an example to demonstrate the output of the data_prep() function.

The outputs of the data_prep() function are lists containing the following elements:

  • A reasoning for the recommended transformation and trimming. The raw data (without trimming or outlier removal) where the fluorescence values are log transformed should be used for the analysis because only after log transformation, the residuals of the mixed ANOVA are normally distributed and have homogeneous variances among treatment groups.
radar_spiked_result$Justify
#> [1] "The raw data (without trimming or outlier removal) where\n                    the fluorescence values are log transformed should be used\n                    for the analysis because only after log transformation,\n                    the residuals of the mixed ANOVA are normally distributed\n                    and have homogeneous variances among treatment groups."
  • A data frame of the processed data (e.g. raw data, transformed, or outlier removed) used for actual statistical testing following the reasoning. The box plots of the processed data per run/replicate (i.e. each panel represents each run/replicate) are also provided for visual inspection.
knitr::kable(head(radar_spiked_result$ProcessedData))
lab Compound Treatment Fluor Conc Replicate
FRAUNH mDHT FETAX MT 113152 0 1
FRAUNH mDHT mDHT-1µg/L + 17MT 708 1 1
FRAUNH mDHT mDHT-2µg/L + 17MT 2498 2 1
FRAUNH mDHT mDHT-4µg/L + 17MT 10152 4 1
FRAUNH mDHT mDHT-8µg/L + 17MT 356 8 1
FRAUNH mDHT mDHT-16µg/L + 17MT 4621 16 1
radar_spiked_result$BoxPlots

  • Summary tables of the processed data (per replicate and overall)
knitr::kable(radar_spiked_result$SummaryDF_Rep, caption="Summary statistics of fluorescence in different concentrations of test item per replicate")
Summary statistics of fluorescence in different concentrations of test item per replicate
Conc Replicate N Mean Standard deviation Coefficient of variation
0 1 17 23818.471 43072.456 1.808364
0 2 17 15009.824 37062.085 2.469189
0 3 17 29657.294 37428.116 1.262021
1 1 17 6505.471 10835.321 1.665571
1 2 17 10271.000 14904.156 1.451091
1 3 17 20555.882 29472.508 1.433775
2 1 17 29385.941 47143.811 1.604298
2 2 17 8911.118 10565.718 1.185678
2 3 17 41815.706 54205.056 1.296285
4 1 17 15968.941 33341.972 2.087926
4 2 17 14532.765 21802.622 1.500239
4 3 17 19419.471 28057.724 1.444824
8 1 17 5178.118 8983.787 1.734952
8 2 17 8220.882 15084.270 1.834872
8 3 17 16399.647 23315.385 1.421700
16 1 17 6119.000 8448.481 1.380696
16 2 17 7955.706 15390.405 1.934512
16 3 17 17665.412 19842.708 1.123252
knitr::kable(radar_spiked_result$SummaryDF, caption="Summary statistics of fluorescence in different concentrations of test item of all replicates")
Summary statistics of fluorescence in different concentrations of test item of all replicates
Conc N Mean Standard deviation Coefficient of variation
0 51 22828.529 38967.63 1.706971
1 51 12444.118 20556.80 1.651929
2 51 26704.255 43299.93 1.621462
4 51 16640.392 27641.60 1.661115
8 51 9932.882 17189.94 1.730609
16 51 10580.039 15836.94 1.496870
  • Tables of results evaluated using increasing/decreasing Williams test and/or Dunnett’s test, if applicable.

In the the Williams’ test result tables, Y.Tilde is the amalgamated mean of the fluorescence in each treatment group, Y0 is the mean of the control fluorescence, DIFF is the estimated difference between the treatment and the control, SE_DIFF is the standard error of the Williams’ test, DF is the degrees of freedom for Williams’ test, WILL Incr or Will Decr is the Williams’ test statistic, crit Val is the critical value of Williams distribution, Sign suggests if there is significant difference between the treatment and the control, and %Incr is the percent increase of the fluorescence compared to the control.

In the Dunnett’s test result table, Estimate is the estimated difference between the treatment and the control, SE is the standard error of the mixed ANOVA model, t value is the Dunnett’s test statistic, adj p is the adjusted p value and %Incr is the percent increase of the fluorescence compared to the control.

knitr::kable(radar_spiked_result$WilliamsIncrease, caption="Increasing Williams' test")
Increasing Williams’ test
Conc Y.Tilde Y0 DIFF Incr SE_DIFF DF WILL Incr crit Val Sign % Incr
Conc16 - Conc0 16 8.38858 8.51927 -0.13069 0.3478327 10 -0.3757267 1.971 FALSE -53.65431
Conc8 - Conc0 8 8.38858 8.51927 -0.13069 0.3478327 10 -0.3757267 1.965 FALSE -56.48917
Conc4 - Conc0 4 8.38858 8.51927 -0.13069 0.3478327 10 -0.3757267 1.956 FALSE -27.10703
Conc2 - Conc0 2 8.38858 8.51927 -0.13069 0.3478327 10 -0.3757267 1.940 FALSE 16.97755
Conc1 - Conc0 1 8.07433 8.51927 -0.44494 0.3478327 10 -1.2791783 1.908 FALSE -45.48875
knitr::kable(radar_spiked_result$WilliamsDecrease, caption="Decreasing Williams' test")
Decreasing Williams’ test
Conc Y.Tilde Y0 DIFF Decr SE_DIFF DF WILL Decr crit Val Sign % Incr
Conc16 - Conc0 16 7.99566 8.51927 0.52361 0.3478327 10 1.5053503 1.971 FALSE -53.65431
Conc8 - Conc0 8 7.99566 8.51927 0.52361 0.3478327 10 1.5053503 1.965 FALSE -56.48917
Conc4 - Conc0 4 8.54577 8.51927 -0.02650 0.3478327 10 -0.0761861 1.956 FALSE -27.10703
Conc2 - Conc0 2 8.54577 8.51927 -0.02650 0.3478327 10 -0.0761861 1.940 FALSE 16.97755
Conc1 - Conc0 1 8.54577 8.51927 -0.02650 0.3478327 10 -0.0761861 1.908 FALSE -45.48875
knitr::kable(radar_spiked_result$Dunnetts, caption="Dunnett's test")
Dunnett’s test
Estimate SE df t value adj p % Incr
Conc1 - Conc0 -0.4449358 0.3478327 10 -1.279166 0.5986692 -45.48875
Conc2 - Conc0 0.4707977 0.3478327 10 1.353518 0.5520005 16.97755
Conc4 - Conc0 0.0536514 0.3478327 10 0.154245 0.9999132 -27.10703
Conc8 - Conc0 -0.5776610 0.3478327 10 -1.660744 0.3794252 -56.48917
Conc16 - Conc0 -0.4695496 0.3478327 10 -1.349930 0.5543160 -53.65431
  • Further information about the normality test (Shapiro-Wilk), the homogeneity of variance test (Levene’s test) of residuals of the mixed ANOVA model, the monotonicity test and the model fit.
radar_spiked_result$NormalityTest
#> 
#>  Shapiro-Wilk normality test
#> 
#> data:  stats::resid(mixedaov)
#> W = 0.99071, p-value = 0.05007
radar_spiked_result$LeveneTest
#> # A tibble: 1 × 4
#>   statistic p.value    df df.residual
#>       <dbl>   <dbl> <int>       <int>
#> 1     0.474   0.796     5         300
radar_spiked_result$`Monotonicity Test`
#>        Test t value Pr(>|t|) Significance
#> 1    Linear   -1.57   0.1167            .
#> 2 Quadratic   -1.48   0.1400            .
radar_spiked_result$MixedAnova
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: 
#> log(Fluor) ~ Conc + (1 | Replicate) + (1 | Replicate:Conc)
#>    Data: dataframe
#> REML criterion at convergence: 1218.932
#> Random effects:
#>  Groups         Name        Std.Dev.
#>  Replicate:Conc (Intercept) 0.000   
#>  Replicate      (Intercept) 0.466   
#>  Residual                   1.756   
#> Number of obs: 306, groups:  
#> Replicate:Conc, 18; Replicate, 3
#> Fixed Effects:
#> (Intercept)       Conc.L       Conc.Q       Conc.C  
#>     8.35799     -0.37806     -0.37347      0.01863  
#>      Conc^4       Conc^5  
#>     0.68924     -0.25055  
#> optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings

The list output from running the data_prep() function can be summarized with the data_summary() function.

xeredar::data_summary(radar_spiked_result) |>
  knitr::kable()
1 2 4 8 16
Replicate 1 -72.69 23.37 -32.96 -78.26 -74.31
Replicate 2 -31.57 -40.63 -3.18 -45.23 -47
Replicate 3 -30.69 41 -34.52 -44.7 -40.43
Pooled -45.49 16.98 -27.11 -56.49 -53.65
Dunnett ns ns ns ns ns
IncreasingWilliams ns ns ns ns ns
DecreasingWilliams ns ns ns ns ns

Running default REACTIV analysis

To demonstrate how to run REACTIV analysis, we will use one artificial data set containing the spiked and unspiked measurements.

reactiv_spiked <- xeredar::REACTIV_valid_data_table_spiked_unspiked[["Anastrozole_UK"]]$Spiked 
reactiv_unspiked <- xeredar::REACTIV_valid_data_table_spiked_unspiked[["Anastrozole_UK"]]$Unspiked 

The default REACTIV analysis can be run using the data_prep() function with either spiked or unspiked data. This function automatically decides whether trimming, outlier removal and/or transformations are conducted following the manuscript (Spyridonov et al. unpublished). The actual analysis is carried out by the ana() function. The ana() function is called by the data_prep() function and does not need to be called separately. The analysis the REACTIV assay follows the description of the Method 2 (the mixed ANOVA approach) in the Annex 8: methods for the statistical analysis of REACTIV assay data of the Amended Draft new Test Guideline for the REACTIV assay for second WNT-review (30.01.2024). For this assay, the row argument should be set to FALSE. Trimming is not required. In case there are residuals deviate from normality and variance homogeneity, outlier removal (e.g. by applying the Tukey rule (Green et al., 2018) and data transformation (for example log- or square-root) can be conducted.

reactiv_spiked_result <- xeredar::data_prep(dataframe = reactiv_spiked, row = FALSE, trimming=FALSE, boxcox = FALSE)
reactiv_unspiked_result <- xeredar::data_prep(dataframe = reactiv_unspiked, row= FALSE, trimming=FALSE, boxcox = FALSE)

Here we use the spiked data as an example to demonstrate the output of the data_prep() function.

The outputs of the data_prep() function are lists containing the following elements:

  • A reasoning for the recommended transformation and trimming. The data from which outliers were removed with the Tukey-rule where the fluorescence values are square-root transformed, should be used for the analysis because only after outlier removal and sqrt transformation, the residuals of the mixed ANOVA are normally distributed and have homogeneous variances among treatment groups
reactiv_spiked_result$Justify
#> [1] "The data from which outliers were removed with the\n                    Tukey-rule where the fluorescence values are square-root\n                    transformed, should be used for the analysis because only\n                    after outlier removal and sqrt transformation, the\n                    residuals of the mixed ANOVA are normally distributed\n                    and have homogeneous variances among treatment groups"
  • A data frame of the processed data (e.g. raw data, transformed, or outlier removed) used for actual statistical testing following the reasoning. The box plots of the processed data per run/replicate (i.e. each panel represents each run/replicate) are also provided for visual inspection.
knitr::kable(head(reactiv_spiked_result$ProcessedData))
Treatment Fluor Conc Replicate Compound Country
AN-0,18mg/L + Testostérone 27037.33 0.18 1 Anastrozole UK
AN-0,36mg/L + Testostérone 201316.67 0.36 1 Anastrozole UK
AN-0,73mg/L + Testostérone 34651.00 0.73 1 Anastrozole UK
AN-1,45mg/L + Testostérone 289984.67 1.45 1 Anastrozole UK
AN-2,9mg/L + Testostérone 233271.00 2.9 1 Anastrozole UK
AN-0,18mg/L + Testostérone 64846.33 0.18 1 Anastrozole UK
reactiv_spiked_result$BoxPlots

  • Summary tables of the processed data (per replicate and overall)
knitr::kable(reactiv_spiked_result$SummaryDF_Rep, caption="Summary statistics of fluorescence in different concentrations of test item per replicate")
Summary statistics of fluorescence in different concentrations of test item per replicate
Conc Replicate N Mean Standard deviation Coefficient of variation
0 1 6 1417505.4 367130.47 0.2589976
0 2 5 1299380.0 241756.84 0.1860555
0 3 6 1699476.3 382759.66 0.2252221
0.18 1 8 216154.0 211883.73 0.9802445
0.18 2 7 149156.5 116877.63 0.7835909
0.18 3 8 255036.8 162183.70 0.6359228
0.36 1 8 149004.8 96508.01 0.6476841
0.36 2 7 230237.3 105711.23 0.4591404
0.36 3 7 199221.1 116329.38 0.5839210
0.73 1 8 169331.4 115443.55 0.6817610
0.73 2 7 187208.8 215340.83 1.1502710
0.73 3 8 292332.4 139121.01 0.4759001
1.45 1 8 116438.8 102761.27 0.8825347
1.45 2 7 173724.2 134626.51 0.7749439
1.45 3 7 236771.7 100056.61 0.4225869
2.9 1 8 159732.2 78955.44 0.4942987
2.9 2 6 234475.2 170916.42 0.7289317
2.9 3 8 247387.4 127553.16 0.5156009
knitr::kable(reactiv_spiked_result$SummaryDF, caption="Summary statistics of fluorescence in different concentrations of test item of all replicates")
Summary statistics of fluorescence in different concentrations of test item of all replicates
Conc N Mean Standard deviation Coefficient of variation
0 17 1482281.8 363637.5 0.2453228
0.18 23 209287.9 168250.6 0.8039193
0.36 22 190829.4 106636.7 0.5588065
0.73 23 217555.3 161918.4 0.7442633
1.45 22 172953.7 118883.2 0.6873702
2.9 22 211991.3 126959.5 0.5988900
  • Tables of results evaluated using increasing/decreasing Williams test and/or Dunnett’s test, if applicable.

In the Williams’ test result tables, Y.Tilde is the amalgamated mean of the fluorescence in each treatment group, Y0 is the mean of the control fluorescence, DIFF is the estimated difference between the treatment and the control, SE_DIFF is the standard error of the Williams’ test, DF is the degrees of freedom for Williams’ test, WILL Incr or Will Decr is the Williams’ test statistic, crit Val is the critical value of Williams distribution, Sign suggests if there is significant difference between the treatment and the control, and %Incr is the percent increase of the fluorescence compared to the control.

In the Dunnett’s test result table, Estimate is the estimated difference between the treatment and the control, SE is the standard error of the mixed ANOVA model, t value is the Dunnett’s test statistic, adj p is the adjusted p value and %Incr is the percent increase of the fluorescence compared to the control.

knitr::kable(reactiv_spiked_result$WilliamsIncrease, caption="Increasing Williams' test")
Increasing Williams’ test
Conc Y.Tilde Y0 DIFF Incr SE_DIFF DF WILL Incr crit Val Sign % Incr
Conc2.9 - Conc0 2.9 439.658 1208.915 -769.2567 48.94620 12 -15.71637 1.933 FALSE -85.69831
Conc1.45 - Conc0 1.45 413.890 1208.915 -795.0246 48.95169 12 -16.24101 1.927 FALSE -88.33193
Conc0.73 - Conc0 0.73 413.890 1208.915 -795.0246 48.47768 12 -16.39981 1.918 FALSE -85.32295
Conc0.36 - Conc0 0.36 413.890 1208.915 -795.0246 48.95169 12 -16.24101 1.903 FALSE -87.12597
Conc0.18 - Conc0 0.18 413.890 1208.915 -795.0246 48.47768 12 -16.39981 1.873 FALSE -85.88070
knitr::kable(reactiv_spiked_result$WilliamsDecrease, caption="Decreasing Williams' test")
Decreasing Williams’ test
Conc Y.Tilde Y0 DIFF Decr SE_DIFF DF WILL Decr crit Val Sign % Incr
Conc2.9 - Conc0 2.9 411.240 1208.915 797.6747 48.94620 12 16.29697 1.933 TRUE -85.69831
Conc1.45 - Conc0 1.45 411.240 1208.915 797.6747 48.95169 12 16.29514 1.927 TRUE -88.33193
Conc0.73 - Conc0 0.73 424.957 1208.915 783.9576 48.47768 12 16.17152 1.918 TRUE -85.32295
Conc0.36 - Conc0 0.36 424.957 1208.915 783.9576 48.95169 12 16.01493 1.903 TRUE -87.12597
Conc0.18 - Conc0 0.18 424.957 1208.915 783.9576 48.47768 12 16.17152 1.873 TRUE -85.88070
knitr::kable(reactiv_spiked_result$Dunnetts, caption="Dunnett's test")
Dunnett’s test
Estimate SE df t value adj p % Incr
Conc0.18 - Conc0 -786.2696 48.47768 12 -16.21921 0 -85.88070
Conc0.36 - Conc0 -786.3430 48.95169 12 -16.06365 0 -87.12597
Conc0.73 - Conc0 -776.5006 48.47768 12 -16.01769 0 -85.32295
Conc1.45 - Conc0 -822.2317 48.95169 12 -16.79680 0 -88.33193
Conc2.9 - Conc0 -769.7654 48.94620 12 -15.72677 0 -85.69831
  • Further information about the normality test (Shapiro-Wilk), the homogeneity of variance test (Levene’s test) of residuals of the mixed ANOVA model, the monotonicity test and the model fit.
reactiv_spiked_result$NormalityTest
#> 
#>  Shapiro-Wilk normality test
#> 
#> data:  stats::resid(mixedaov)
#> W = 0.97621, p-value = 0.0227
reactiv_spiked_result$LeveneTest
#> # A tibble: 1 × 4
#>   statistic p.value    df df.residual
#>       <dbl>   <dbl> <int>       <int>
#> 1     0.829   0.532     5         123
reactiv_spiked_result$`Monotonicity Test`
#>        Test t value Pr(>|t|) Significance
#> 1    Linear   -6.34  <0.0001          ***
#> 2 Quadratic    6.05  <0.0001          ***
reactiv_spiked_result$MixedAnova
#> Linear mixed model fit by REML ['lmerMod']
#> Formula: 
#> sqrt(Fluor) ~ Conc + (1 | Replicate) + (1 | Replicate:Conc)
#>    Data: wt_outlier
#> REML criterion at convergence: 1607.975
#> Random effects:
#>  Groups         Name        Std.Dev.
#>  Replicate:Conc (Intercept)   0.00  
#>  Replicate      (Intercept)  51.74  
#>  Residual                   151.56  
#> Number of obs: 129, groups:  
#> Replicate:Conc, 18; Replicate, 3
#> Fixed Effects:
#> (Intercept)       Conc.L       Conc.Q       Conc.C  
#>      551.13      -471.74       437.64      -271.05  
#>      Conc^4       Conc^5  
#>      175.76       -30.96  
#> optimizer (nloptwrap) convergence code: 0 (OK) ; 0 optimizer warnings; 1 lme4 warnings

The list output from running the data_prep() function can be summarized with the data_summary() function.

xeredar::data_summary(reactiv_spiked_result) |>
  knitr::kable()
0.18 0.36 0.73 1.45 2.9
Replicate 1 -84.75 -89.49 -88.05 -91.79 -88.73
Replicate 2 -88.52 -82.28 -85.59 -86.63 -81.95
Replicate 3 -84.99 -88.28 -82.8 -86.07 -85.44
Pooled -85.88 -87.13 -85.32 -88.33 -85.7
Dunnett * * * * *
IncreasingWilliams ns ns ns ns ns
DecreasingWilliams * * * * *

The shiny app

xeredar contains an integrated shiny app that is available to the users by writing run_app() into the console. The only requirement to use the app is the successful installation of xeredar. When the app started, the user has a couple of options to analyse the data. These options can be adjusted by clicking on the little gear sign next to Inputs. The following options are available:

  • Use d’Agostino test?
    • The default here is that this is not selected meaning the Shapiro-Wilk test is utilized to check for residual normality. However, the RADAR TG mentions the d’Agostino test which is why it is also available to the users.
  • Apply 10% Trimming
    • The default here is that this is selected, meaning that 10% Trimming is conducted when the ANOVA assumptions are not fulfilled by the raw data. Please be aware that this does not mean that 10% Trimming is always conducted. It is only conducted, when the raw data is violating the residual normality and variance homogeneity assumptions, tested with the respective test with an adjustable alpha level.
  • Remove outliers
    • The default here is that this is selected, meaning that outlier removal is conducted when the ANOVA assumptions are not fulfilled by the raw data or by 10% Trimming, if it is selected above.
  • Try Box-Cox transformation
    • The default here is that this is selected, meaning that box-cox transformation is carried out when the ANOVA assumptions are not fulfilled by the raw or processed data (10% Trimming or outlier removal) nor by log- or square-root transformation. Box-Cox transformation is not mentioned in any of the TGs which is why it is left to the choice of the user.
  • alpha for residual normality and variance homogeneity tests
    • The default here is 0.05. In the TGs the alpha level is discussed so the user is adviced to inspect the requirements of the respective study to analyze.

The next option in the little box on the top-left of the app is to set a hook or remove the hook to decide whether the exposure well ID is regarded as random term in the underlying mixed ANOVA model. In case a REACTIV study is investigated the hook should be removed. For RADAR and XETA, the hook should be placed as long as information about the exposure well ID was documented. Of course, reducing the complexity of the random term might also make sense for XETA and RADAR studies in case no variance is explained by the exposure well-ID. However, this choice is left to the user.

When own data is supposed to be analyzed xlsx files with either spiked or unspiked data can be uploaded. Please make sure that the uploaded data fulfills the Data requirements explained above. In case of insecurity, inspect the uploaded data structure of the pre-loaded example data.

The app contains several output boxes about the required data processing, a conclusion table, the uploaded data, residual diagnostics plots, boxplots, a data summary table, the output of the residual normality and homogeneity tests, the monotonicity test result, the Dunnett’s and increasing and decreasing Williams test result tables along with the summary table of the underlying mixed ANOVA model.

The depicted information can be downloaded in a simple report by clicking on the button ** Word report**. It takes a couple of seconds until the final docx file is produced and ready to download. Avoid clicking the button several times as this can lead to long waiting times and the production of several reports.

References

OECD. 2019a. Validation Report of the Xenopus Eleutheroembryonic Thyroid Signaling Assay (XETA) for the Detection of Thyroid Active Substances. OECD.

OECD. 2019b. TG 248: Xenopus Eleutheroembryonic Thyroid Assay (XETA). OECD. https://doi.org/10.1787/a13f80ee-en.

OECD. 2022b. Test No. 251: Rapid Androgen Disruption Activity Reporter (RADAR) Assay. OECD. https://doi.org/10.1787/da264d82-en.

OECD. 2022a. Rapid Estrogen Activity in Vivo (REACTIV) Assay (OECD Draft TG): Guideline for the Testing of Chemicals, Section 2: Effects on Biotic System. OECD.