diff --git a/DESCRIPTION b/DESCRIPTION index 4b666e5..f706460 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -1,7 +1,7 @@ Package: PvSTATEM Type: Package Title: Reading, Quality Control and Preprocessing of MBA (Multiplex Bead Assay) Data -Description: Speeds up the process of loading raw data from MBA (Multiplex Bead Assay) examinations, performs quality control checks, and automatically normalizes the data, preparing it for more advanced, downstream tasks. The main objective of the package is to create a simple environment for a user, who does not necessarily have experience with R language. The package is developed within the project of the same name - 'PvSTATEM', which is an international project aiming for malaria elimination. +Description: Speeds up the process of loading raw data from MBA (Multiplex Bead Assay) examinations, performs quality control checks, and automatically normalises the data, preparing it for more advanced, downstream tasks. The main objective of the package is to create a simple environment for a user, who does not necessarily have experience with R language. The package is developed within the project of the same name - 'PvSTATEM', which is an international project aiming for malaria elimination. 
BugReports: https://github.com/mini-pw/PvSTATEM/issues Version: 0.0.4 License: BSD_3_clause + file LICENSE diff --git a/NAMESPACE b/NAMESPACE index f613128..5fee208 100644 --- a/NAMESPACE +++ b/NAMESPACE @@ -5,6 +5,7 @@ S3method(predict,Model) S3method(summary,Plate) export(PlateBuilder) export(create_standard_curve_model_analyte) +export(get_nmfi) export(generate_plate_report) export(is_valid_data_type) export(is_valid_sample_type) diff --git a/R/classes-plate_builder.R b/R/classes-plate_builder.R index 83de576..69d0b90 100644 --- a/R/classes-plate_builder.R +++ b/R/classes-plate_builder.R @@ -405,6 +405,15 @@ is_dilution <- function(character_vector) { is_valid_dilution } + + +#' Convert dilutions to numeric values +#' @description +#' Convert dilutions saved as strings in format `1/\d+` into numeric values +#' @param dilutions vector of dilutions used during the examination saved +#' as strings in format `1/\d+` +#' @return a vector of numeric values representing the dilutions +#' @keywords internal convert_dilutions_to_numeric <- function(dilutions) { stopifnot(is.character(dilutions)) diff --git a/R/get-nmfi.R b/R/get-nmfi.R new file mode 100644 index 0000000..59a3822 --- /dev/null +++ b/R/get-nmfi.R @@ -0,0 +1,112 @@ +#' @title Calculate normalised MFI values for a plate +#' +#' @description +#' The function calculates the normalised MFI (nMFI) values for each of the analytes in the plate. +#' +#' The nMFI values are calculated as the ratio of the MFI values of test samples to the MFI values of the standard curve samples with the target dilution. +#' +#' +#' +#' **When nMFI could be used?** +#' In general it is preferred to use Relative Antibody Unit (RAU) values for any kind of analysis. +#' However, in some cases it is impossible to fit a model to the standard curve samples. +#' This may happen if the MFI values of test samples are much higher than the MFI of standard curve samples. 
+#' Then, the prediction would require large data extrapolation, that could lead to unreliable results. +#' +#' In such cases, the nMFI values could be used as a proxy for RAU values, if we want, for instance, to account for plate-to-plate variation. +#' +#' @param plate (`Plate()`) a plate object for which to calculate the nMFI values +#' @param reference_dilution (`numeric(1) or character(1)`) the dilution value of the standard curve sample +#' to use as a reference for normalisation. Default is `1/400`. +#' It should refer to a dilution of a standard curve sample in the given plate object. +#' This parameter could be either a numeric value or a string. +#' In case it is a character string, it should have format `1/d+`, where `d+` is any positive integer. +#' @param data_type (`character(1)`) type of data to use for the computation. Median is the default +#' @param verbose (`logical(1)`) print additional information. Default is `TRUE` +#' +#' @return nmfi (`data.frame`) a data frame with normalised MFI values for each of the analytes in the plate and all test samples. 
+#' +#' @examples +#' +#' # read the plate +#' plate_file <- system.file("extdata", "CovidOISExPONTENT.csv", package = "PvSTATEM") +#' layout_file <- system.file("extdata", "CovidOISExPONTENT_layout.csv", package = "PvSTATEM") +#' +#' plate <- read_luminex_data(plate_file, layout_file) +#' +#' # artificially bump up the MFI values of the test samples (the Median data type is the default one) +#' plate$data[["Median"]][plate$sample_types == "TEST", ] <- +#' plate$data[["Median"]][plate$sample_types == "TEST", ] * 10 +#' +#' # calculate the nMFI values +#' nmfi <- get_nmfi(plate, reference_dilution = 1 / 400) +#' +#' # we don't do any extrapolation and the values should be comparable across plates +#' head(nmfi) +#' # different params +#' nmfi <- get_nmfi(plate, reference_dilution = "1/50") +#' +#' @export +get_nmfi <- + function(plate, + reference_dilution = 1 / 400, + data_type = "Median", + verbose = TRUE) { + stopifnot(inherits(plate, "Plate")) + + stopifnot(length(reference_dilution) == 1) + + # check if data_type is valid + stopifnot(is_valid_data_type(data_type)) + + # check if reference_dilution is numeric or string + if (is.character(reference_dilution)) { + reference_dilution <- + convert_dilutions_to_numeric(reference_dilution) + } + + stopifnot(is.numeric(reference_dilution)) + stopifnot(reference_dilution > 0) + + if (!reference_dilution %in% plate$get_dilution_values("STANDARD CURVE")) { + stop( + "The target ", + reference_dilution, + " dilution is not present in the plate." 
+ ) + } + + + # get index of standard curve sample with the target dilution + reference_standard_curve_id <- + which( + plate$dilution_values == reference_dilution & + plate$sample_types == "STANDARD CURVE" + ) + stopifnot(length(reference_standard_curve_id) == 1) + + plate_data <- + plate$get_data( + analyte = "ALL", + sample_type = "ALL", + data_type = data_type + ) + + reference_mfi <- plate_data[reference_standard_curve_id, ] + + test_mfi <- + plate$get_data( + analyte = "ALL", + sample_type = "TEST", + data_type = data_type + ) + reference_mfi <- reference_mfi[rep(1, nrow(test_mfi)), ] + + nmfi <- test_mfi / reference_mfi + + rownames(nmfi) <- + plate$sample_names[plate$sample_types == "TEST"] + + + return(nmfi) + } diff --git a/R/process-plate.R b/R/process-plate.R index 4e0ba20..cdea4bc 100644 --- a/R/process-plate.R +++ b/R/process-plate.R @@ -1,20 +1,53 @@ -#' Process a plate and save computed RAU values to a CSV +VALID_NORMALISATION_TYPES <- c("RAU", "nMFI") + +is_valid_normalisation_type <- function(normalisation_type) { + normalisation_type %in% VALID_NORMALISATION_TYPES +} + +#' @title +#' Process a plate and save output values to a CSV #' #' @description -#' The behavior can be summarized as follows: +#' Depending on the `normalisation_type` argument, the function will compute the RAU or nMFI values for each analyte in the plate. +#' **RAU** is the default normalisation type. +#' +#' +#' The behavior of the function, in case of RAU normalisation type, can be summarized as follows: #' 1. Adjust blanks if not already done. #' 2. Fit a model to each analyte using standard curve samples. -#' 3. Predict RAU value for each analyte using the corresponding model. +#' 3. Compute RAU values for each analyte using the corresponding model. #' 4. Aggregate computed RAU values into a single data frame. -#' 5. Save that data frame to a CSV file. +#' 5. Save the computed RAU values to a CSV file. 
+#' +#' More info about the RAU normalisation can be found in +#' `create_standard_curve_model_analyte` function documentation \link[PvSTATEM]{create_standard_curve_model_analyte} or in the Model reference \link[PvSTATEM]{Model}. +#' +#' +#' +#' +#' In case the normalisation type is **nMFI**, the function will: +#' 1. Adjust blanks if not already done. +#' 2. Compute nMFI values for each analyte using the target dilution. +#' 3. Aggregate computed nMFI values into a single data frame. +#' 4. Save the computed nMFI values to a CSV file. +#' +#' More info about the nMFI normalisation can be found in `get_nmfi` function documentation \link[PvSTATEM]{get_nmfi}. #' #' @param plate (`Plate()`) a plate object -#' @param output_path (`character(1)`) path to save the computed RAU values -#' If not provided the file will be saved in the working directory with the name `RAU_{plate_name}.csv`. +#' @param output_path (`character(1)`) path to save the computed RAU values. +#' If not provided the file will be saved in the working directory with the name `{normalisation_type}_{plate_name}.csv`. #' Where the `{plate_name}` is the name of the plate. +#' @param normalisation_type (`character(1)`) type of normalisation to use. Available options are: +#' \cr \code{c(`r toString(VALID_NORMALISATION_TYPES)`)}. +#' In case #' @param data_type (`character(1)`) type of data to use for the computation. Median is the default #' @param adjust_blanks (`logical(1)`) adjust blanks before computing RAU values. Default is `FALSE` #' @param verbose (`logical(1)`) print additional information. Default is `TRUE` +#' @param reference_dilution (`numeric(1)`) target dilution to use as reference for the nMFI normalisation. Ignored in case of RAU normalisation. +#' Default is `1/400`. +#' It should refer to a dilution of a standard curve sample in the given plate object. +#' This parameter could be either a numeric value or a string. 
+#' In case it is a character string, it should have format `1/d+`, where `d+` is any positive integer. #' @param ... Additional arguments to be passed to the fit model function (`create_standard_curve_model_analyte`) #' #' @examples @@ -29,34 +62,78 @@ #' process_plate(plate, output_path = temporary_filepath) #' # create and save dataframe with computed dilutions #' +#' # nMFI normalisation +#' process_plate(plate, output_path = temporary_filepath, +#' normalisation_type = "nMFI", reference_dilution = 1/400) +#' +#' @return a data frame with normalised values #' @export -process_plate <- function(plate, output_path = NULL, data_type = "Median", adjust_blanks = FALSE, verbose = TRUE, ...) { - stopifnot(inherits(plate, "Plate")) - if (is.null(output_path)) { - output_path <- paste0("RAU_", plate$plate_name, ".csv") - } - stopifnot(is.character(output_path)) - stopifnot(is.character(data_type)) +process_plate <- + function(plate, + output_path = NULL, + normalisation_type = "RAU", + data_type = "Median", + adjust_blanks = FALSE, + verbose = TRUE, + reference_dilution = 1 / 400, + ...) { + stopifnot(inherits(plate, "Plate")) - if (!plate$blank_adjusted && adjust_blanks) { - plate <- plate$blank_adjustment(in_place = FALSE) - } + stopifnot(is_valid_normalisation_type(normalisation_type)) - test_sample_names <- plate$sample_names[plate$sample_types == "TEST"] - output_list <- list( - "SampleName" = test_sample_names - ) - verbose_cat("Fitting the models and predicting RAU for each analyte\n", verbose = verbose) - - for (analyte in plate$analyte_names) { - model <- create_standard_curve_model_analyte(plate, analyte, data_type = data_type, ...) 
- test_samples_mfi <- plate$get_data(analyte, "TEST", data_type = data_type) - test_sample_estimates <- predict(model, test_samples_mfi) - output_list[[analyte]] <- test_sample_estimates[, "RAU"] - } - output_df <- data.frame(output_list) + if (is.null(output_path)) { + output_path <- + paste0(normalisation_type, "_", plate$plate_name, ".csv") + } + stopifnot(is.character(output_path)) + stopifnot(is.character(data_type)) - verbose_cat("Saving the computed RAU values to a CSV file located in: '", output_path, "'\n", verbose = verbose) - write.csv(output_df, output_path, row.names = FALSE) -} + if (!plate$blank_adjusted && adjust_blanks) { + plate <- plate$blank_adjustment(in_place = FALSE) + } + if (normalisation_type == "nMFI") { + verbose_cat("Computing nMFI values for each analyte\n", verbose = verbose) + output_df <- + get_nmfi(plate, reference_dilution = reference_dilution, data_type = data_type) + verbose_cat( + "Saving the computed nMFI values to a CSV file located in: '", + output_path, + "'\n", + verbose = verbose + ) + } + else if (normalisation_type == "RAU") { + + # RAU normalisation + + test_sample_names <- + plate$sample_names[plate$sample_types == "TEST"] + output_list <- list() + verbose_cat("Fitting the models and predicting RAU for each analyte\n", + verbose = verbose) + + for (analyte in plate$analyte_names) { + model <- + create_standard_curve_model_analyte(plate, analyte, data_type = data_type, ...) 
+ test_samples_mfi <- + plate$get_data(analyte, "TEST", data_type = data_type) + test_sample_estimates <- predict(model, test_samples_mfi) + output_list[[analyte]] <- test_sample_estimates[, "RAU"] + } + + output_df <- data.frame(output_list) + + verbose_cat("Saving the computed RAU values to a CSV file located in: '", + output_path, + "'\n", + verbose = verbose) + + rownames(output_df) <- test_sample_names + } + write.csv(output_df, output_path) + + return(output_df) + + + } diff --git a/man/convert_dilutions_to_numeric.Rd b/man/convert_dilutions_to_numeric.Rd new file mode 100644 index 0000000..863533f --- /dev/null +++ b/man/convert_dilutions_to_numeric.Rd @@ -0,0 +1,19 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/classes-plate_builder.R +\name{convert_dilutions_to_numeric} +\alias{convert_dilutions_to_numeric} +\title{Convert dilutions to numeric values} +\usage{ +convert_dilutions_to_numeric(dilutions) +} +\arguments{ +\item{dilutions}{vector of dilutions used during the examination saved +as strings in format \verb{1/\\d+}} +} +\value{ +a vector of numeric values representing the dilutions +} +\description{ +Convert dilutions saved as strings in format \verb{1/\\d+} into numeric values +} +\keyword{internal} diff --git a/man/get_nmfi.Rd b/man/get_nmfi.Rd new file mode 100644 index 0000000..ac791ae --- /dev/null +++ b/man/get_nmfi.Rd @@ -0,0 +1,63 @@ +% Generated by roxygen2: do not edit by hand +% Please edit documentation in R/get-nmfi.R +\name{get_nmfi} +\alias{get_nmfi} +\title{Calculate normalised MFI values for a plate} +\usage{ +get_nmfi( + plate, + reference_dilution = 1/400, + data_type = "Median", + verbose = TRUE +) +} +\arguments{ +\item{plate}{(\code{Plate()}) a plate object for which to calculate the nMFI values} + +\item{reference_dilution}{(\verb{numeric(1) or character(1)}) the dilution value of the standard curve sample +to use as a reference for normalisation. Default is \code{1/400}. 
+It should refer to a dilution of a standard curve sample in the given plate object. +This parameter could be either a numeric value or a string. +In case it is a character string, it should have format \verb{1/d+}, where \verb{d+} is any positive integer.} + +\item{data_type}{(\code{character(1)}) type of data to use for the computation. Median is the default} + +\item{verbose}{(\code{logical(1)}) print additional information. Default is \code{TRUE}} +} +\value{ +nmfi (\code{data.frame}) a data frame with normalised MFI values for each of the analytes in the plate and all test samples. +} +\description{ +The function calculates the normalised MFI (nMFI) values for each of the analytes in the plate. + +The nMFI values are calculated as the ratio of the MFI values of test samples to the MFI values of the standard curve samples with the target dilution. + +\strong{When nMFI could be used?} +In general it is preferred to use Relative Antibody Unit (RAU) values for any kind of analysis. +However, in some cases it is impossible to fit a model to the standard curve samples. +This may happen if the MFI values of test samples are much higher than the MFI of standard curve samples. +Then, the prediction would require large data extrapolation, that could lead to unreliable results. + +In such cases, the nMFI values could be used as a proxy for RAU values, if we want, for instance, to account for plate-to-plate variation. 
+} +\examples{ + +# read the plate +plate_file <- system.file("extdata", "CovidOISExPONTENT.csv", package = "PvSTATEM") +layout_file <- system.file("extdata", "CovidOISExPONTENT_layout.csv", package = "PvSTATEM") + +plate <- read_luminex_data(plate_file, layout_file) + +# artificially bump up the MFI values of the test samples (the Median data type is the default one) +plate$data[["Median"]][plate$sample_types == "TEST", ] <- + plate$data[["Median"]][plate$sample_types == "TEST", ] * 10 + +# calculate the nMFI values +nmfi <- get_nmfi(plate, reference_dilution = 1 / 400) + +# we don't do any extrapolation and the values should be comparable across plates +head(nmfi) +# different params +nmfi <- get_nmfi(plate, reference_dilution = "1/50") + +} diff --git a/man/process_plate.Rd b/man/process_plate.Rd index 814faa2..47397c2 100644 --- a/man/process_plate.Rd +++ b/man/process_plate.Rd @@ -2,41 +2,72 @@ % Please edit documentation in R/process-plate.R \name{process_plate} \alias{process_plate} -\title{Process a plate and save computed RAU values to a CSV} +\title{Process a plate and save output values to a CSV} \usage{ process_plate( plate, output_path = NULL, + normalisation_type = "RAU", data_type = "Median", adjust_blanks = FALSE, verbose = TRUE, + reference_dilution = 1/400, ... ) } \arguments{ \item{plate}{(\code{Plate()}) a plate object} -\item{output_path}{(\code{character(1)}) path to save the computed RAU values -If not provided the file will be saved in the working directory with the name \verb{RAU_\{plate_name\}.csv}. +\item{output_path}{(\code{character(1)}) path to save the computed RAU values. +If not provided the file will be saved in the working directory with the name \verb{\{normalisation_type\}_\{plate_name\}.csv}. Where the \code{{plate_name}} is the name of the plate.} +\item{normalisation_type}{(\code{character(1)}) type of normalisation to use. Available options are: +\cr \code{c(RAU, nMFI)}. 
+In case} + \item{data_type}{(\code{character(1)}) type of data to use for the computation. Median is the default} \item{adjust_blanks}{(\code{logical(1)}) adjust blanks before computing RAU values. Default is \code{FALSE}} \item{verbose}{(\code{logical(1)}) print additional information. Default is \code{TRUE}} +\item{reference_dilution}{(\code{numeric(1)}) target dilution to use as reference for the nMFI normalisation. Ignored in case of RAU normalisation. +Default is \code{1/400}. +It should refer to a dilution of a standard curve sample in the given plate object. +This parameter could be either a numeric value or a string. +In case it is a character string, it should have format \verb{1/d+}, where \verb{d+} is any positive integer.} + \item{...}{Additional arguments to be passed to the fit model function (\code{create_standard_curve_model_analyte})} } +\value{ +a data frame with normalised values +} \description{ -The behavior can be summarized as follows: +Depending on the \code{normalisation_type} argument, the function will compute the RAU or nMFI values for each analyte in the plate. +\strong{RAU} is the default normalisation type. + +The behavior of the function, in case of RAU normalisation type, can be summarized as follows: \enumerate{ \item Adjust blanks if not already done. \item Fit a model to each analyte using standard curve samples. -\item Predict RAU value for each analyte using the corresponding model. +\item Compute RAU values for each analyte using the corresponding model. \item Aggregate computed RAU values into a single data frame. -\item Save that data frame to a CSV file. +\item Save the computed RAU values to a CSV file. } + +More info about the RAU normalisation can be found in +\code{create_standard_curve_model_analyte} function documentation \link[PvSTATEM]{create_standard_curve_model_analyte} or in the Model reference \link[PvSTATEM]{Model}. 
+ +In case the normalisation type is \strong{nMFI}, the function will: +\enumerate{ +\item Adjust blanks if not already done. +\item Compute nMFI values for each analyte using the target dilution. +\item Aggregate computed nMFI values into a single data frame. +\item Save the computed nMFI values to a CSV file. +} + +More info about the nMFI normalisation can be found in \code{get_nmfi} function documentation \link[PvSTATEM]{get_nmfi}. } \examples{ @@ -50,4 +81,8 @@ temporary_filepath <- file.path(tmp_dir, "output.csv") process_plate(plate, output_path = temporary_filepath) # create and save dataframe with computed dilutions +# nMFI normalisation +process_plate(plate, output_path = temporary_filepath, + normalisation_type = "nMFI", reference_dilution = 1/400) + } diff --git a/tests/testthat/test-get-nmfi.R b/tests/testthat/test-get-nmfi.R new file mode 100644 index 0000000..0db3993 --- /dev/null +++ b/tests/testthat/test-get-nmfi.R @@ -0,0 +1,105 @@ +library(testthat) + +test_that("get_nmfi works on our data with multiple parameters", { + # Read plate + path <- + system.file("extdata", + "CovidOISExPONTENT.csv", + package = "PvSTATEM", + mustWork = TRUE + ) + layout_path <- + system.file( + "extdata", + "CovidOISExPONTENT_layout.xlsx", + package = "PvSTATEM", + mustWork = TRUE + ) + + expect_no_error( + plate <- + read_luminex_data( + path, + format = "xPONENT", + layout_filepath = layout_path, + verbose = FALSE + ) + ) + + get_nmfi(plate, reference_dilution = 1 / 400) + get_nmfi(plate, reference_dilution = "1/50") + + get_nmfi(plate, + reference_dilution = 1 / 400, + data_type = "Mean" + ) +}) + +test_that("get_nmfi with incorrect params", { + path <- + system.file("extdata", + "CovidOISExPONTENT.csv", + package = "PvSTATEM", + mustWork = TRUE + ) + layout_path <- + system.file( + "extdata", + "CovidOISExPONTENT_layout.xlsx", + package = "PvSTATEM", + mustWork = TRUE + ) + + expect_no_error( + plate <- + read_luminex_data( + path, + format = "xPONENT", + 
layout_filepath = layout_path, + verbose = FALSE + ) + ) + + expect_error(get_nmfi(plate, reference_dilution = 1 / 401)) + + expect_error(get_nmfi(plate, reference_dilution = "1/401")) + + expect_error(get_nmfi( + plate, + reference_dilution = 1 / 400, + data_type = "incorrect" + )) +}) + + +test_that("get_nmfi on artificial plate", { + path <- + system.file("extdata", + "random.csv", + package = "PvSTATEM", + mustWork = TRUE + ) + layout_path <- + system.file("extdata", + "random_layout.xlsx", + package = "PvSTATEM", + mustWork = TRUE + ) + + expect_no_error(plate <- + read_luminex_data(path, format = "xPONENT", verbose = FALSE)) + + nmfi <- get_nmfi(plate, reference_dilution = 1 / 50) + + reference_dilution_index <- which(plate$dilution_values == 1 / 50) + + reference_dilution_values <- + plate$data[["Median"]][reference_dilution_index, ] + + mfi_values <- + plate$data[["Median"]][plate$sample_types == "TEST", ] + + for (i in 1:ncol(mfi_values)) { + expect_equal(nmfi[, i], mfi_values[, i] / reference_dilution_values[[i]]) + } +}) diff --git a/tests/testthat/test-process-plate.R b/tests/testthat/test-process-plate.R new file mode 100644 index 0000000..2502df7 --- /dev/null +++ b/tests/testthat/test-process-plate.R @@ -0,0 +1,62 @@ +library(testthat) + +test_that("Test processing of a plate", { + # Read plate + path <- system.file("extdata", "CovidOISExPONTENT.csv", package = "PvSTATEM", mustWork = TRUE) + layout_path <- system.file("extdata", "CovidOISExPONTENT_layout.xlsx", package = "PvSTATEM", mustWork = TRUE) + expect_no_error(plate <- read_luminex_data(path, format = "xPONENT", layout_filepath = layout_path, verbose = FALSE)) + + # Test processing of a plate + tmp_dir <- tempdir(check = TRUE) + test_output_path <- file.path(tmp_dir, "output.csv") + expect_no_error( + process_plate(plate, output_path = test_output_path) + ) + expect_true(file.exists(test_output_path)) + expect_no_error(dilutions <- read.csv(test_output_path)) + file.remove(test_output_path) + 
+ + # Test additional parameters + expect_error( + process_plate(plate, output_path = test_output_path, data_type = "incorrect") + ) + + expect_error( + process_plate(plate, output_path = test_output_path, normalisation_type = "incorrect") + ) +}) + +test_that("Processing plate with nMFI", { + # Read plate + path <- system.file("extdata", "CovidOISExPONTENT.csv", package = "PvSTATEM", mustWork = TRUE) + layout_path <- system.file("extdata", "CovidOISExPONTENT_layout.xlsx", package = "PvSTATEM", mustWork = TRUE) + expect_no_error(plate <- read_luminex_data(path, format = "xPONENT", layout_filepath = layout_path, verbose = FALSE)) + + # Test processing of a plate + tmp_dir <- tempdir(check = TRUE) + test_output_path <- file.path(tmp_dir, "output.csv") + expect_no_error( + process_plate(plate, output_path = test_output_path, normalisation_type = "nMFI") + ) + # Test processing of a plate with reference dilution specified + expect_no_error( + process_plate(plate, output_path = test_output_path, normalisation_type = "nMFI", reference_dilution = "1/50") + ) + expect_true(file.exists(test_output_path)) + expect_no_error(dilutions <- read.csv(test_output_path)) + file.remove(test_output_path) + + # Test additional parameters + expect_error( + process_plate(plate, output_path = test_output_path, data_type = "incorrect", normalisation_type = "nMFI") + ) + + expect_error( + process_plate(plate, output_path = test_output_path, reference_dilution = "400", normalisation_type = "nMFI") + ) + + expect_error( + process_plate(plate, output_path = test_output_path, reference_dilution = "1/401", normalisation_type = "nMFI") + ) +}) diff --git a/vignettes/example_script.Rmd b/vignettes/example_script.Rmd index 8af5a3d..216cd80 100644 --- a/vignettes/example_script.Rmd +++ b/vignettes/example_script.Rmd @@ -4,10 +4,13 @@ author: "Tymoteusz Kwieciński" date: "`r Sys.Date()`" vignette: > %\VignetteIndexEntry{Simple example of basic PvSTATEM package pre-release version functionalities} - 
%\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} %\VignetteDepends{ggplot2} %\VignetteDepends{nplr} + %\VignetteEngine{knitr::rmarkdown} +editor_options: + markdown: + wrap: sentence --- ```{r setup, include=FALSE} @@ -23,7 +26,10 @@ knitr::opts_chunk$set( ## Reading the plate object -The basic functionality of the `PvSTATEM` package is reading raw MBA data. To present the package's functionalities, we use a sample dataset from the Covid OISE study, which is pre-loaded into the package. You might want to replace these variables with paths to your files on your local disk. Firstly, let us load the dataset as the `plate` object. +The basic functionality of the `PvSTATEM` package is reading raw MBA data. +To present the package's functionalities, we use a sample dataset from the Covid OISE study, which is pre-loaded into the package. +You might want to replace these variables with paths to your files on your local disk. +Firstly, let us load the dataset as the `plate` object. ```{r} library(PvSTATEM) @@ -38,10 +44,11 @@ plate <- read_luminex_data(plate_filepath, layout_filepath) # read the data plate ``` - ## Processing the whole plate -Once we have loaded the plate object we may process it using the function `process_plate`. This function fits a model to each analyte using the standard curve samples and computes RAU values for each analyte using the corresponding model. The computed RAU values are then saved to a CSV file. +Once we have loaded the plate object we may process it using the function `process_plate`. +This function fits a model to each analyte using the standard curve samples and computes RAU values for each analyte using the corresponding model. +The computed RAU values are then saved to a CSV file. 
```{r} tmp_dir <- tempdir(check = TRUE) @@ -49,10 +56,12 @@ test_output_path <- file.path(tmp_dir, "output.csv") process_plate(plate, output_path = test_output_path) ``` -## Quality control and normalization details -Apart from the `process_plate` function, the package provides a set of functions that allow for more detailed and advanced quality control and normalization of the data. +## Quality control and normalisation details + +Apart from the `process_plate` function, the package provides a set of functions that allow for more detailed and advanced quality control and normalisation of the data. ### Plate summary and details + After the plate is successfully loaded, we can look at some basic information about it. ```{r} @@ -72,7 +81,8 @@ summary(plate) ### Quality control -The package can plot the RAU along the MFI values, allowing manual inspection of the standard curve. This method raises a warning in case the MFI values were not adjusted using the blank samples. +The package can plot the RAU along the MFI values, allowing manual inspection of the standard curve. +This method raises a warning in case the MFI values were not adjusted using the blank samples. ```{r} plot_standard_curve_analyte(plate, analyte_name = "OC43_S") @@ -84,7 +94,8 @@ print(plate$blank_adjusted) plot_standard_curve_analyte(plate, analyte_name = "OC43_S") ``` -We can also plot the standard curve for different analytes and data types. A list of all available analytes on the plate can be accessed using the command `plate$analyte_names`. +We can also plot the standard curve for different analytes and data types. +A list of all available analytes on the plate can be accessed using the command `plate$analyte_names`. By default, all the operations are performed on the `Median` value of the samples; this option can be selected from the `data_type` parameter of the function. 
@@ -96,18 +107,18 @@ plot_standard_curve_analyte(plate, analyte_name = "RBD_wuhan", data_type = "Avg This plot may be used to assess the quality of the standard curve and anticipate some of the potential issues with the data. For instance, if we plotted the standard curve for the analyte, `ME` we could notice that the `Median` value of the sample with RAU of `39.06` is abnormally large, which may indicate a problem with the data. - ```{r} plot_standard_curve_analyte(plate, analyte_name = "ME") plot_standard_curve_analyte(plate, analyte_name = "ME", log_scale = "all") ``` -The plotting function has more options, such as selecting which axis the log scale should be applied or reversing the curve. More detailed information can be found in the function documentation, accessed by executing the command `?plot_standard_curve_analyte`. +The plotting function has more options, such as selecting which axis the log scale should be applied or reversing the curve. +More detailed information can be found in the function documentation, accessed by executing the command `?plot_standard_curve_analyte`. -Another useful method of inspecting the potential errors of the data is `plot_mfi_for_analyte`. +Another useful method of inspecting the potential errors of the data is `plot_mfi_for_analyte`. This method plots the MFI values of standard curve samples for a given analyte along the boxplot of the MFI values of the test samples. -It helps identify the outlier samples and check if the test samples are within the range of the standard curve samples. +It helps identify the outlier samples and check if the test samples are within the range of the standard curve samples. 
```{r} plot_mfi_for_analyte(plate, analyte_name = "OC43_S") @@ -115,29 +126,26 @@ plot_mfi_for_analyte(plate, analyte_name = "OC43_S") plot_mfi_for_analyte(plate, analyte_name = "Spike_6P") ``` -It can be seen that for the `Spike_6P` analyte, the MFI values don't fall within the range of the standard curve samples, which could be problematic for the model. The values of test RAU values will be extrapolated (up to a point) from the standard curve, which may lead to incorrect results. +It can be seen that for the `Spike_6P` analyte, the MFI values don't fall within the range of the standard curve samples, which could be problematic for the model. +The values of test RAU values will be extrapolated (up to a point) from the standard curve, which may lead to incorrect results. ### Normalization -After inspection, we may create the model for the standard curve of a certain antibody. -The model is fitted using the `nplr` package, which provides a simple interface -for fitting n-parameter logistic regression models, -but to create a clearer interface for the user, -we encapsulated this model into our own class called `Model` for simplicity. +After inspection, we may create the model for the standard curve of a certain antibody. +The model is fitted using the `nplr` package, which provides a simple interface for fitting n-parameter logistic regression models, but to create a clearer interface for the user, we encapsulated this model into our own class called `Model` for simplicity. The detailed documentation of the `Model` class can be found by executing the command `?Model`. The model is then used to predict RAU values of the samples based on the MFI values. 
-#### RAU vs dilution +#### RAU vs dilution -In order to distinguish between real dilution values (the ones known for the standard curve samples) from the dilution predictions (obtained using the fitted standard curve) we introduced into our package a unit called RAU (Relative Antibody Unit) which is equal to the dilution **prediction** multiplied by a $1,000,000$ in order to provide a more readable value. +To distinguish real dilution values (the ones known for the standard curve samples) from dilution predictions (obtained using the fitted standard curve), we introduced into our package a unit called RAU (Relative Antibody Unit), which is equal to the dilution **prediction** multiplied by $1,000,000$ to provide a more readable value. #### Inner nplr model `nplr` package fits the model using the formula: -$$ y = B + \frac{T - B}{[1 + 10^{b \cdot (x_{mid} - x)}]^s},$$ -where: +$$ y = B + \frac{T - B}{[1 + 10^{b \cdot (x_{mid} - x)}]^s},$$ where: - $y$ is the predicted value, MFI in our case, @@ -153,15 +161,16 @@ where: - $s$ is the asymmetric coefficient. -This equation is referred to as the Richards' equation. More information about the model can be found in the `nplr` package documentation. - +This equation is referred to as the Richards' equation. +More information about the model can be found in the `nplr` package documentation. #### Predicting RAU By reversing that logistic function we can predict the dilution of the samples based on the MFI values. -The RAU value is then the predicted dilution of the sample multiplied by $1,000,000$. +The RAU value is then the predicted dilution of the sample multiplied by $1,000,000$. -In order to limit the extrapolation error from above (values above maximum RAU value $RAU_{max}$ for the standard curve samples) we clip all predictions above $M = RAU_{max} + \text{over_max_extrapolation}$ to $M$ where `over_max_extrapolation` is user controlled parameter to the `predict` function.
By default `over_max_extrapolation` is set to $0$. +In order to limit the extrapolation error from above (values above the maximum RAU value $RAU_{max}$ for the standard curve samples), we clip all predictions above $M = RAU_{max} + \text{over_max_extrapolation}$ to $M$, where `over_max_extrapolation` is a user-controlled parameter of the `predict` function. +By default `over_max_extrapolation` is set to $0$. #### Usage @@ -173,8 +182,8 @@ model <- create_standard_curve_model_analyte(plate, analyte_name = "OC43_S") model ``` -Since our `model` object contains all the characteristics and parameters of the fitted regression model. -The model can be used to predict the RAU values of the samples based on the MFI values. +Our `model` object contains all the characteristics and parameters of the fitted regression model. +The model can be used to predict the RAU values of the samples based on the MFI values. The output above shows the most important parameters of the fitted model. The predicted values may be used to plot the standard curve, which can be compared to the sample values. @@ -194,9 +203,12 @@ predicted_rau <- predict(model, mfi_values) head(predicted_rau) ``` + The dataframe contains original MFI values and the predicted RAU values based on the model. -In order to allow extrapolation from above (up to a certain value) we can set `over_max_extrapolation` to a positive value. To illustrate that we can look at prediction plots. The `plot_standard_curve_analyte_with_model` takes any additional parameters and passes them to a `predict` method so we can visually see the effect of the `over_max_extrapolation` parameter. +In order to allow extrapolation from above (up to a certain value), we can set `over_max_extrapolation` to a positive value. +To illustrate this, we can look at the prediction plots.
+The `plot_standard_curve_analyte_with_model` function takes any additional parameters and passes them to the `predict` method, so we can visually see the effect of the `over_max_extrapolation` parameter. ```{r} model <- create_standard_curve_model_analyte(plate, analyte_name = "Spike_6P") @@ -207,5 +219,18 @@ plot_standard_curve_analyte_with_model(plate, model, log_scale = c("all")) plot_standard_curve_analyte_with_model(plate, model, log_scale = c("all"), over_max_extrapolation = 100000) ``` +### nMFI +In some cases, the RAU values cannot be reliably calculated. This may happen when the MFI values of test samples are much higher than those of the standard curve samples. In that case, to avoid extrapolation while still being able to compare samples across plates, we introduced a new unit called nMFI (normalised MFI). The nMFI is calculated as the MFI value of the test sample divided by the MFI value of the standard curve sample with the selected dilution value. + +The nMFI values of the samples can be calculated in two ways: with the `get_nmfi` function, or with the `process_plate` function with the `normalisation_type` parameter set to `nMFI`, which also saves the output to a CSV file. + +```{r} +nmfi_values <- get_nmfi(plate) + +# process plate with nMFI normalisation +nmfi_output_path <- file.path(tmp_dir, "nmfi_output.csv") +process_plate(plate, output_path = nmfi_output_path, normalisation_type = "nMFI") + +``` diff --git a/vignettes/our_datasets.Rmd b/vignettes/our_datasets.Rmd index 9f9c64a..957e0d7 100644 --- a/vignettes/our_datasets.Rmd +++ b/vignettes/our_datasets.Rmd @@ -19,7 +19,7 @@ knitr::opts_chunk$set( # Introduction -Our package's main purpose is to read, perform quality control, and normalize raw MBA data. Unfortunately, different devices and labs have different data formats. We gathered a few datasets on which our package could be tested. This document describes the datasets and their sources.
+Our package's main purpose is to read, perform quality control, and normalise raw MBA data. Unfortunately, different devices and labs have different data formats. We gathered a few datasets on which our package could be tested. This document describes the datasets and their sources. The majority of our datasets, available for the public are stored in the `extdata` folder of the package. The remaining ones - both private and the larger number of publicly available datasets are stored in the `OneDrive` folder, which is accessible to the package developers.