From 7601edaa80073756ecbf044b9d7c2578094bb523 Mon Sep 17 00:00:00 2001 From: DarkShadeKnigh Date: Tue, 20 Apr 2021 13:52:10 -0400 Subject: [PATCH 1/4] [Feature] Added targets spec --- Targets_implementation.Rmd | 115 ++++++++++++++++++ .../targets/depression_imputation_module.csv | 7 ++ assets/specs/targets/depression_targets.R | 64 ++++++++++ assets/specs/targets/modules_map.csv | 2 + 4 files changed, 188 insertions(+) create mode 100644 Targets_implementation.Rmd create mode 100644 assets/specs/targets/depression_imputation_module.csv create mode 100644 assets/specs/targets/depression_targets.R create mode 100644 assets/specs/targets/modules_map.csv diff --git a/Targets_implementation.Rmd b/Targets_implementation.Rmd new file mode 100644 index 00000000..bf342464 --- /dev/null +++ b/Targets_implementation.Rmd @@ -0,0 +1,115 @@ +--- +title: "Targets_implementation" +author: "Rostyslav Vyuha" +date: "April 14, 2021" +output: html_document +--- + +```{r setup, include=FALSE} +knitr::opts_chunk$set(echo = TRUE) +``` + +# Main Export Functions + +### Verify Targets + +The purpose of this function would be to verify the correct order names and arguments in the passed targets. +This is done by comparing it against the read in modules + + The function would contain 2 arguments **targets_source** and **modules_path**. +The **targets_source** would either be the tar_target list or in most cases the path to the _targets.R +The **modules_path** would be the file path to the modules_map.csv + +This function would not make any changes to the _targets.R file or the list itself +and would simply output warnings and a boolean representing the validity of the targets with the module. + +### Run bllflow Targets + +This function would be responsible for running the targets with arguments filled in by the bllflow object. + +Excluding the arguments mentioned above this function would contain 2 arguments **targets_source** and **bllflow_object**. 
+The **targets_source** would be identical to the one to verify targets. +The **bllflow_object** would be the bllflow object created upon config initialization with mandatory checks for modules.csv and variables.csv as well as a present working_data. + +The function would first run verify targets to confirm correct order and presence of steps. Then it would modify the tar_targets arguments to reflect their true value rather then the shorthand (roles). +Once the tar_targets were modified accordingly the _targets.R file is written and tar_make() is executed, letting targets handle the returns and the pipeline + +### Create _targets tepmlate + +This function would be responsible for creating the basic bllflow_targets.R file which would only be populated by the steps in modules.csv + +The function would once again contain only 2 arguments **target_path** and **modules_path** + +The function would utilize the shorthand(roles) notation when writing the functions for ease of use for the analyst + +# Contents of modules.csv + +### Step_id + +The step_id column must contain a unique identifier for the step being performed. + +### Step_function + +The step_function column contains the name of the function being performed in this step. This must match a function name present in the environment during execution. + +### Step_arguments + +The step_arguments column would contain the arguments for the function including our shorthand (roles). I believe we should bring back the roles() and formula notations from previous implementation to avoid any confusion later. + +#### Special Arguments + +*role* This would search for variables matching the role in variables.csv and be replaced with vecor of var names during run time. 
+*data* This would pass the object attached to the bllflow object inside the data list ie: bllflow$data[[]] +*formula* This would create a left side = right side formula ie: formula[role["outcome"], role["predictor"], sep = "+"] would result in "outcome1 + outcome2 + outcome3 ~ predictor1 + predictor2 + predictor3" + +### Step_description + +This column contains the step description this is used to populate comments in template creation function. It should contain a helpful description of what this step is responsible for. + +### Step_order + +This column contains the order in which steps should be executed + +# Contents of modules_map + +### Module_Name + +The name of the module being included + +### Module_Path + +The relative path to the module being loaded + +### Module_description + +The module being ran + +### Module_order + +The order in which modules are ran + +# Example function usage + +## Verify_targets +```{r, echo=FALSE} +verify_targets(targets_source = "/assets/specs/targets/depression_targets.R", modules_path = "/assets/specs/targets/modules_map.csv") +``` +This returns TRUE if depression_targets.R contains everything inside modules_map and in correct order with correct arguments + +## run_bllflow_targets +```{r, echo=FALSE} +run_bllflow_targets(targets_source = "/assets/specs/targets/depression_targets.R", bllflow_object = hui_object) +``` + +This would create a _targets.R in base package directory using the targets found at targets_source. 
+It would essentially be a copy and paste except for the Special Arguments which would be populated using the bllflow_object + +## create_targets_tepmlate +```{r, echo=FALSE} +create_targets_tepmlate(target_path = "/assets/specs/targets/depression_targets.R", modules_path = "/assets/specs/targets/modules_map.csv") +``` + +This would create a barebones depression_targets.R with only things found in the passed modules + + + diff --git a/assets/specs/targets/depression_imputation_module.csv b/assets/specs/targets/depression_imputation_module.csv new file mode 100644 index 00000000..58e99b17 --- /dev/null +++ b/assets/specs/targets/depression_imputation_module.csv @@ -0,0 +1,7 @@ +Step_ID,Step_function,Step_arguments,Step_description,Step_order +create_depression_score_imputation_dataset,create_depression_score_imputation_dataset_function,"(data = data[""study_dataset""], variables = role[""create_depression_score_imputation_dataset""], survey_cycle_variable = role[""survey_cycle""], survey_cycle_lower_limit = 2003, survey_cycle_upper_limit = 2014)",Create the dataset with which we will impute the depression score variable. Only include the survey cycles from 2003 to 2014 since mood disorder is one of the strongest predictors of depression score and it was only available during these cycles in the PUMF,1 +impute_depression_score,impute_depression_score_function,"(data = data[""create_depression_score_imputation_dataset""], outcome = role[""impute_depression_score_outcome""], predictors = role[""impute_depression_score_predictors""], num_multiple_imputations = 5, method = polr)",Imputes the depression score variables using the MICE method. 
Use a polytomous logistic regression method since there are multiple categories in the depression score variable.,2 +merge_depression_score_imputed_dataset,merge_depression_score_imputed_dataset_function,"(depression_score_imputed_data = data[""impute_depression_score""], study_dataset = data[""study_dataset""], merge_by = role[""id""])",Merge the depression score imputed dataset back into the original study dataset using the id column.,3 +create_mood_disorder_imputation_dataset,create_mood_disorder_imputation_dataset_function,"(data = data[""study_dataset""], variables = role[""create_mood_disorder_imputation_dataset""], survey_cycle_variable = role[""survey_cycle""], survey_cycle_lower_limit = 2001, survey_cycle_upper_limit = 2014)","Create the dataset with which we will impute the mood disorder variable. Include all the cycles we have, which is everything from 2001 to 2014.",4 +impute_mood_disorder,impute_mood_disorder_function,"(data = data[""create_mood_disorder_imputation_dataset""], outcome = role[""impute_mood_disorder_outcome""], predictors = role[""impute_mood_disorder_predictors""], num_multiple_imputations = 5, method = logreg)","Impute the mood disorder variable using MICE method with 5 iterations. 
Use the logsitc regression model since mood disorder has only 2 categories, Yes and No.",5 +merge_imputed_mood_disorder_data,merge_imputed_mood_disorder_data_function,"(mood_disorder_imputed_data = data[""impute_mood_disorder""], study_dataset = data[""study_dataset""], merge_by = role[""id""]",Merge the mood disorder imputed dataset back into the original study dataset using the id column.,6 \ No newline at end of file diff --git a/assets/specs/targets/depression_targets.R b/assets/specs/targets/depression_targets.R new file mode 100644 index 00000000..aba55807 --- /dev/null +++ b/assets/specs/targets/depression_targets.R @@ -0,0 +1,64 @@ +library(targets) +library(huiport) # The package containing functions found in depression_imputation_module +list( + # Create the dataset with which we will impute the depression score variable. Only include the survey cycles from 2003 to 2014 since mood disorder is one of the strongest predictors of depression score and it was only available during these cycles in the PUMF + tar_target( + create_depression_score_imputation_dataset, + create_depression_score_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_depression_score_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + survey_cycle_lower_limit = 2003, + survey_cycle_upper_limit = 2014 + ), + # Imputes the depression score variables using the MICE method. Use a polytomous logistic regression method since there are multiple categories in the depression score variable. + tar_target( + impute_depression_score, + impute_depression_score_function( + data = data["create_depression_score_imputation_dataset"], + outcome = role["impute_depression_score_outcome"], + predictors = role["impute_depression_score_predictors"], + num_multiple_imputations = 5, + method = "polr" + ) + ), + # Merge the depression score imputed dataset back into the original study dataset using the id column. 
+ tar_target( + merge_depression_score_imputed_dataset, + merge_depression_score_imputed_dataset_function( + depression_score_imputed_data = data["impute_depression_score"], + study_dataset = data["study_dataset"], + merge_by = role["id"] + ) + ), + # Create the dataset with which we will impute the mood disorder variable. Include all the cycles we have, which is everything from 2001 to 2014. + tar_target( + create_mood_disorder_imputation_dataset, + create_mood_disorder_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_mood_disorder_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + survey_cycle_lower_limit = 2001, + survey_cycle_upper_limit = 2014 + ) + ), + # Impute the mood disorder variable using MICE method with 5 iterations. Use the logsitc regression model since mood disorder has only 2 categories, Yes and No. + tar_target( + impute_mood_disorder, + impute_mood_disorder_function( + data = data["create_mood_disorder_imputation_dataset"], + outcome = role["impute_mood_disorder_outcome"], + predictors = role["impute_mood_disorder_predictors"], + num_multiple_imputations = 5, + method = "logreg" + ) + ), + # Merge the mood disorder imputed dataset back into the original study dataset using the id column. 
+ tar_target( + merge_imputed_mood_disorder_data, + merge_imputed_mood_disorder_data_function( + mood_disorder_imputed_data = data["impute_mood_disorder"], + study_dataset = data["study_dataset"], + merge_by = role["id"] + ) + ) \ No newline at end of file diff --git a/assets/specs/targets/modules_map.csv b/assets/specs/targets/modules_map.csv new file mode 100644 index 00000000..3af8193f --- /dev/null +++ b/assets/specs/targets/modules_map.csv @@ -0,0 +1,2 @@ +Module_Name,Module_Path,Module_Description,ModuleOrder +depression_imputation,./depression_imputation_module.csv,This module is responsible for imputing the depression score and mood disorder variables within the CCHS-PUMF from cycles 2001 to 2014.,1 \ No newline at end of file From dbcae4967ed11e5638324460e93dd40b4d0ed458 Mon Sep 17 00:00:00 2001 From: DarkShadeKnigh Date: Tue, 27 Apr 2021 06:45:54 -0400 Subject: [PATCH 2/4] . --- assets/specs/targets/depression_targets.R | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/assets/specs/targets/depression_targets.R b/assets/specs/targets/depression_targets.R index aba55807..5f81b4f0 100644 --- a/assets/specs/targets/depression_targets.R +++ b/assets/specs/targets/depression_targets.R @@ -1,5 +1,7 @@ library(targets) library(huiport) # The package containing functions found in depression_imputation_module +Hui_impute <- create_targets_tepmlate() + list( # Create the dataset with which we will impute the depression score variable. 
Only include the survey cycles from 2003 to 2014 since mood disorder is one of the strongest predictors of depression score and it was only available during these cycles in the PUMF tar_target( @@ -7,7 +9,7 @@ list( create_depression_score_imputation_dataset_function( data = data["study_dataset"], variables = role["create_depression_score_imputation_dataset"], - survey_cycle_variable = role["survey_cycle"], + survey_cycle_variable = role["survey_cycle"], survey_cycle_lower_limit = 2003, survey_cycle_upper_limit = 2014 ), @@ -15,7 +17,7 @@ list( tar_target( impute_depression_score, impute_depression_score_function( - data = data["create_depression_score_imputation_dataset"], + data = create_depression_score_imputation_dataset, outcome = role["impute_depression_score_outcome"], predictors = role["impute_depression_score_predictors"], num_multiple_imputations = 5, From 8a25d391867015b9333eb75d3b286ed29c24b017 Mon Sep 17 00:00:00 2001 From: Rostyslav Date: Tue, 4 May 2021 13:32:11 -0400 Subject: [PATCH 3/4] [Feature] Added module verification tests --- assets/specs/targets/targets_test.R | 19 ++ .../targets/verification_expected_input.R | 222 ++++++++++++++++++ 2 files changed, 241 insertions(+) create mode 100644 assets/specs/targets/targets_test.R create mode 100644 assets/specs/targets/verification_expected_input.R diff --git a/assets/specs/targets/targets_test.R b/assets/specs/targets/targets_test.R new file mode 100644 index 00000000..346000a3 --- /dev/null +++ b/assets/specs/targets/targets_test.R @@ -0,0 +1,19 @@ +context("Modules Test") +library(targets) +source("verification_expected_input.R") + +test_that("Module verification returns TRUE when it matches modules.csv",{ + expect_true(verify_targets(targets_source = input_one, modules_path = "./modules_map.csv")) +}) + +test_that("Module verification returns appropriate error when a module step is missing",{ + expect_error(verify_targets(targets_source = input_two, modules_path = "./modules_map.csv"), "Missing 
step merge_imputed_mood_disorder_data") +}) + +test_that("Module verification returns appropriate error when module steps are out of order",{ + expect_error(verify_targets(targets_source = input_three, modules_path = "./modules_map.csv"), "Wrong order of steps") +}) + +test_that("Module verification returns appropriate error when module step contains wrong arguments",{ + expect_error(verify_targets(targets_source = input_four, modules_path = "./modules_map.csv"), "create_depression_score_imputation_dataset contains invalid step arguments") +}) \ No newline at end of file diff --git a/assets/specs/targets/verification_expected_input.R b/assets/specs/targets/verification_expected_input.R new file mode 100644 index 00000000..c09b8730 --- /dev/null +++ b/assets/specs/targets/verification_expected_input.R @@ -0,0 +1,222 @@ +input_one <- list( + tar_target( + create_depression_score_imputation_dataset, + create_depression_score_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_depression_score_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + survey_cycle_lower_limit = 2003, + survey_cycle_upper_limit = 2014 + ), + tar_target( + impute_depression_score, + impute_depression_score_function( + data = create_depression_score_imputation_dataset, + outcome = role["impute_depression_score_outcome"], + predictors = role["impute_depression_score_predictors"], + num_multiple_imputations = 5, + method = "polr" + ) + ), + tar_target( + merge_depression_score_imputed_dataset, + merge_depression_score_imputed_dataset_function( + depression_score_imputed_data = data["impute_depression_score"], + study_dataset = data["study_dataset"], + merge_by = role["id"] + ) + ), + tar_target( + create_mood_disorder_imputation_dataset, + create_mood_disorder_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_mood_disorder_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + 
survey_cycle_lower_limit = 2001, + survey_cycle_upper_limit = 2014 + ) + ), + tar_target( + impute_mood_disorder, + impute_mood_disorder_function( + data = data["create_mood_disorder_imputation_dataset"], + outcome = role["impute_mood_disorder_outcome"], + predictors = role["impute_mood_disorder_predictors"], + num_multiple_imputations = 5, + method = "logreg" + ) + ), + tar_target( + merge_imputed_mood_disorder_data, + merge_imputed_mood_disorder_data_function( + mood_disorder_imputed_data = data["impute_mood_disorder"], + study_dataset = data["study_dataset"], + merge_by = role["id"] + ) + ) + +input_two <- list( + tar_target( + create_depression_score_imputation_dataset, + create_depression_score_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_depression_score_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + survey_cycle_lower_limit = 2003, + survey_cycle_upper_limit = 2014 + ), + tar_target( + impute_depression_score, + impute_depression_score_function( + data = create_depression_score_imputation_dataset, + outcome = role["impute_depression_score_outcome"], + predictors = role["impute_depression_score_predictors"], + num_multiple_imputations = 5, + method = "polr" + ) + ), + tar_target( + merge_depression_score_imputed_dataset, + merge_depression_score_imputed_dataset_function( + depression_score_imputed_data = data["impute_depression_score"], + study_dataset = data["study_dataset"], + merge_by = role["id"] + ) + ), + tar_target( + create_mood_disorder_imputation_dataset, + create_mood_disorder_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_mood_disorder_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + survey_cycle_lower_limit = 2001, + survey_cycle_upper_limit = 2014 + ) + ), + tar_target( + impute_mood_disorder, + impute_mood_disorder_function( + data = data["create_mood_disorder_imputation_dataset"], + outcome = 
role["impute_mood_disorder_outcome"], + predictors = role["impute_mood_disorder_predictors"], + num_multiple_imputations = 5, + method = "logreg" + ) + ) + ) + +input_three <- list( + tar_target( + create_depression_score_imputation_dataset, + create_depression_score_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_depression_score_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + survey_cycle_lower_limit = 2003, + survey_cycle_upper_limit = 2014 + ), + tar_target( + impute_depression_score, + impute_depression_score_function( + data = create_depression_score_imputation_dataset, + outcome = role["impute_depression_score_outcome"], + predictors = role["impute_depression_score_predictors"], + num_multiple_imputations = 5, + method = "polr" + ) + ), + tar_target( + merge_depression_score_imputed_dataset, + merge_depression_score_imputed_dataset_function( + depression_score_imputed_data = data["impute_depression_score"], + study_dataset = data["study_dataset"], + merge_by = role["id"] + ) + ), + tar_target( + create_mood_disorder_imputation_dataset, + create_mood_disorder_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_mood_disorder_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + survey_cycle_lower_limit = 2001, + survey_cycle_upper_limit = 2014 + ) + ), + tar_target( + merge_imputed_mood_disorder_data, + merge_imputed_mood_disorder_data_function( + mood_disorder_imputed_data = data["impute_mood_disorder"], + study_dataset = data["study_dataset"], + merge_by = role["id"] + ) + ), + tar_target( + impute_mood_disorder, + impute_mood_disorder_function( + data = data["create_mood_disorder_imputation_dataset"], + outcome = role["impute_mood_disorder_outcome"], + predictors = role["impute_mood_disorder_predictors"], + num_multiple_imputations = 5, + method = "logreg" + ) + ) + + ) + input_four <- list( + tar_target( + 
create_depression_score_imputation_dataset, + create_depression_score_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_depression_score_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + survey_cycle_lower_limit = 2003, + wrong_name = 2014 + ), + tar_target( + impute_depression_score, + impute_depression_score_function( + data = create_depression_score_imputation_dataset, + outcome = role["impute_depression_score_outcome"], + predictors = role["impute_depression_score_predictors"], + num_multiple_imputations = 5, + method = "polr" + ) + ), + tar_target( + merge_depression_score_imputed_dataset, + merge_depression_score_imputed_dataset_function( + depression_score_imputed_data = data["impute_depression_score"], + study_dataset = data["study_dataset"], + merge_by = role["id"] + ) + ), + tar_target( + create_mood_disorder_imputation_dataset, + create_mood_disorder_imputation_dataset_function( + data = data["study_dataset"], + variables = role["create_mood_disorder_imputation_dataset"], + survey_cycle_variable = role["survey_cycle"], + survey_cycle_lower_limit = 2001, + survey_cycle_upper_limit = 2014 + ) + ), + tar_target( + impute_mood_disorder, + impute_mood_disorder_function( + data = data["create_mood_disorder_imputation_dataset"], + outcome = role["impute_mood_disorder_outcome"], + predictors = role["impute_mood_disorder_predictors"], + num_multiple_imputations = 5, + method = "logreg" + ) + ), + tar_target( + merge_imputed_mood_disorder_data, + merge_imputed_mood_disorder_data_function( + mood_disorder_imputed_data = data["impute_mood_disorder"], + study_dataset = data["study_dataset"], + merge_by = role["id"] + ) + ) + \ No newline at end of file From 82156d54227aa7b80594658084732001b2804937 Mon Sep 17 00:00:00 2001 From: Rostyslav Date: Tue, 1 Jun 2021 13:32:55 -0400 Subject: [PATCH 4/4] [Feature] Added flow diagram for verification and addressed minor changes in the PR by adding further description 
of modules_map --- Targets_implementation.Rmd | 69 ++++++++++++++----- .../targets/depression_imputation_module.csv | 2 +- assets/specs/targets/modules_map.csv | 2 +- assets/specs/targets/verify_targets _flow.txt | 31 +++++++++ 4 files changed, 85 insertions(+), 19 deletions(-) create mode 100644 assets/specs/targets/verify_targets _flow.txt diff --git a/Targets_implementation.Rmd b/Targets_implementation.Rmd index bf342464..e4e4c61c 100644 --- a/Targets_implementation.Rmd +++ b/Targets_implementation.Rmd @@ -23,6 +23,18 @@ The **modules_path** would be the file path to the modules_map.csv This function would not make any changes to the _targets.R file or the list itself and would simply output warnings and a boolean representing the validity of the targets with the module. +#### List of warnings +- error when a module step is missing +- error when module steps are out of order +- error when module step contains wrong arguments + +#### Example function usage + +```{r, echo=FALSE} +verify_targets(targets_source = "/assets/specs/targets/depression_targets.R", modules_path = "/assets/specs/targets/modules_map.csv") +``` +This returns TRUE if depression_targets.R contains everything inside modules_map and in correct order with correct arguments + ### Run bllflow Targets This function would be responsible for running the targets with arguments filled in by the bllflow object. @@ -34,6 +46,15 @@ The **bllflow_object** would be the bllflow object created upon config initializ The function would first run verify targets to confirm correct order and presence of steps. Then it would modify the tar_targets arguments to reflect their true value rather then the shorthand (roles). 
Once the tar_targets were modified accordingly the _targets.R file is written and tar_make() is executed, letting targets handle the returns and the pipeline +#### Example function usage + +```{r, echo=FALSE} +run_bllflow_targets(targets_source = "/assets/specs/targets/depression_targets.R", bllflow_object = hui_object) +``` + +This would create a _targets.R in base package directory using the targets found at targets_source. +It would essentially be a copy and paste except for the Special Arguments which would be populated using the bllflow_object + ### Create _targets tepmlate This function would be responsible for creating the basic bllflow_targets.R file which would only be populated by the steps in modules.csv @@ -42,6 +63,29 @@ The function would once again contain only 2 arguments **target_path** and **mod The function would utilize the shorthand(roles) notation when writing the functions for ease of use for the analyst +#### Example function usage + +```{r, echo=FALSE} +create_targets_tepmlate(target_path = "/assets/specs/targets/depression_targets.R", modules_path = "/assets/specs/targets/modules_map.csv") +``` + +This would create a barebones depression_targets.R with only things found in the passed modules + +### Create _targets list + +This function would be responsible for creating a list containing tar_target objects. + +The function would accept one mandatory argument **modules_path** and one optional argument **target_path**. 
+If a **target_path** is supplied, the existing tar_targets list is read in, appended to, and verified before being returned. If no **target_path** is supplied, a barebones template list is created from the modules_map + +#### Example function usage + +```{r, echo=FALSE} +create_targets_list(modules_path = "/assets/specs/targets/modules_map.csv") +``` + +This would create a barebones tar_targets list with only things found in the passed modules + # Contents of modules.csv ### Step_id @@ -52,14 +96,18 @@ The step_id column must contain a unique identifier for the step being performed The step_function column contains the name of the function being performed in this step. This must match a function name present in the environment during execution. -### Step_arguments +### Step_argument_name + +The step_argument_name column, as the name implies, contains the name of a single argument + +### Step_argument_value -The step_arguments column would contain the arguments for the function including our shorthand (roles). I believe we should bring back the roles() and formula notations from previous implementation to avoid any confusion later. +The step_argument_value column contains the value for the single argument named in step_argument_name #### Special Arguments *role* This would search for variables matching the role in variables.csv and be replaced with vecor of var names during run time. -*data* This would pass the object attached to the bllflow object inside the data list ie: bllflow$data[[]] +*data* This would pass the object attached to the bllflow object inside the data list ie: bllflow$data[[]]. Alternatively, it can be a reference to data generated by a previous step. 
*formula* This would create a left side = right side formula ie: formula[role["outcome"], role["predictor"], sep = "+"] would result in "outcome1 + outcome2 + outcome3 ~ predictor1 + predictor2 + predictor3" ### Step_description @@ -91,25 +139,12 @@ The order in which modules are ran # Example function usage ## Verify_targets -```{r, echo=FALSE} -verify_targets(targets_source = "/assets/specs/targets/depression_targets.R", modules_path = "/assets/specs/targets/modules_map.csv") -``` -This returns TRUE if depression_targets.R contains everything inside modules_map and in correct order with correct arguments + ## run_bllflow_targets -```{r, echo=FALSE} -run_bllflow_targets(targets_source = "/assets/specs/targets/depression_targets.R", bllflow_object = hui_object) -``` -This would create a _targets.R in base package directory using the targets found at targets_source. -It would essentially be a copy and paste except for the Special Arguments which would be populated using the bllflow_object -## create_targets_tepmlate -```{r, echo=FALSE} -create_targets_tepmlate(target_path = "/assets/specs/targets/depression_targets.R", modules_path = "/assets/specs/targets/modules_map.csv") -``` -This would create a barebones depression_targets.R with only things found in the passed modules diff --git a/assets/specs/targets/depression_imputation_module.csv b/assets/specs/targets/depression_imputation_module.csv index 58e99b17..796c5792 100644 --- a/assets/specs/targets/depression_imputation_module.csv +++ b/assets/specs/targets/depression_imputation_module.csv @@ -1,4 +1,4 @@ -Step_ID,Step_function,Step_arguments,Step_description,Step_order +step_ID,step_function,step_arguments,step_description,step_order create_depression_score_imputation_dataset,create_depression_score_imputation_dataset_function,"(data = data[""study_dataset""], variables = role[""create_depression_score_imputation_dataset""], survey_cycle_variable = role[""survey_cycle""], survey_cycle_lower_limit = 2003, 
survey_cycle_upper_limit = 2014)",Create the dataset with which we will impute the depression score variable. Only include the survey cycles from 2003 to 2014 since mood disorder is one of the strongest predictors of depression score and it was only available during these cycles in the PUMF,1 impute_depression_score,impute_depression_score_function,"(data = data[""create_depression_score_imputation_dataset""], outcome = role[""impute_depression_score_outcome""], predictors = role[""impute_depression_score_predictors""], num_multiple_imputations = 5, method = polr)",Imputes the depression score variables using the MICE method. Use a polytomous logistic regression method since there are multiple categories in the depression score variable.,2 merge_depression_score_imputed_dataset,merge_depression_score_imputed_dataset_function,"(depression_score_imputed_data = data[""impute_depression_score""], study_dataset = data[""study_dataset""], merge_by = role[""id""])",Merge the depression score imputed dataset back into the original study dataset using the id column.,3 diff --git a/assets/specs/targets/modules_map.csv b/assets/specs/targets/modules_map.csv index 3af8193f..7f3fef20 100644 --- a/assets/specs/targets/modules_map.csv +++ b/assets/specs/targets/modules_map.csv @@ -1,2 +1,2 @@ -Module_Name,Module_Path,Module_Description,ModuleOrder +module_name,module_path,module_description,module_order depression_imputation,./depression_imputation_module.csv,This module is responsible for imputing the depression score and mood disorder variables within the CCHS-PUMF from cycles 2001 to 2014.,1 \ No newline at end of file diff --git a/assets/specs/targets/verify_targets _flow.txt b/assets/specs/targets/verify_targets _flow.txt new file mode 100644 index 00000000..cea8966e --- /dev/null +++ b/assets/specs/targets/verify_targets _flow.txt @@ -0,0 +1,31 @@ +@startuml +!pragma useVerticalIf on +start +:Read in modules_map; +repeat :Read in module; + :Create step list; + :Add step 
list to module_list; +repeat while (Remaining modules) is (yes) +->no; +:Read in passed targets; +if (passed targets is a list) then(no) + :Read in the targets file; + :convert targets file into tar_targets list; +endif +repeat :Compare targets_list against module list; + if(module step is missing) then + #pink:Missing step ; + detach + elseif(module steps are out of order) then + #pink:Wrong order of steps; + detach + elseif(module step contains wrong arguments) then + #pink: contains invalid + step arguments: ; + detach + endif +repeat while (Remaining modules) is (yes) +->no; + +#palegreen:matching; +@enduml \ No newline at end of file
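
The comparison stage of the flow above (missing step, wrong order, invalid arguments) can be sketched in base R. This is only an illustration under stated assumptions, not the bllflow implementation: `compare_steps` and its inputs (named lists mapping a step id to that step's argument names) are hypothetical names invented for this example, and the error messages simply mirror the ones expected in targets_test.R.

```r
# Hypothetical helper, not part of bllflow. Each input is a named list that
# maps a step id to the character vector of argument names used by that step.
compare_steps <- function(target_steps, module_steps) {
  expected_ids <- names(module_steps)
  actual_ids <- names(target_steps)

  # 1. Every module step must be present in the targets.
  missing_ids <- setdiff(expected_ids, actual_ids)
  if (length(missing_ids) > 0) {
    stop("Missing step ", paste(missing_ids, collapse = ", "), call. = FALSE)
  }

  # 2. The module steps must appear in the same relative order.
  if (!identical(actual_ids[actual_ids %in% expected_ids], expected_ids)) {
    stop("Wrong order of steps", call. = FALSE)
  }

  # 3. Each step must use exactly the argument names the module declares.
  for (id in expected_ids) {
    if (!setequal(target_steps[[id]], module_steps[[id]])) {
      stop(id, " contains invalid step arguments", call. = FALSE)
    }
  }
  TRUE
}

# Assumed, simplified stand-in for the steps read from a module csv.
module_steps <- list(
  impute_mood_disorder = c("data", "outcome", "predictors"),
  merge_imputed_mood_disorder_data = c("study_dataset", "merge_by")
)

compare_steps(module_steps, module_steps) # TRUE
```

Checking presence before order keeps the "Missing step" message from being masked by an order failure, which matches the branch order in the diagram.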