R package pmut
pmut is a collection of utility functions that facilitate general predictive modeling work.
Its functions cover, but are not limited to, diagnostic visualization, model metrics, and data quality checks.
Development version:
devtools::install_github("chengjunhou/pmut")
The package vignette is available on RPubs.
- pmut.edap.cont: creates a visualization with a line plot of a specified continuous feature against the response, plus a distribution histogram for that feature.
- pmut.edap.disc: creates a visualization with a line plot of a specified discrete feature against the response, plus a distribution histogram for that feature.
- pmut.edap: creates visualizations for a vector of features, using either pmut.edap.disc or pmut.edap.cont depending on the feature class.
- pmut.data.pmis: checks the percentage of NA values (including empty strings for character columns) in every column of the data.
- pmut.data.same: checks whether the data contains any duplicated columns.
- pmut.data.scal: standardizes every numeric column in the data.
- pmut.base.find: finds the meta information for each column in the training data, which is then used to process testing and/or new data so that it can be scored without error; see pmut.base.prep for the processing of testing data.
- pmut.base.prep: takes the meta information generated by pmut.base.find and prepares new data so that it can be scored without error.
- pmut.auc: calculates the area under the ROC curve for model predictions, without any package dependency.
- pmut.gini: calculates the standardized Gini coefficient for model predictions.
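
To make the two metrics concrete, here is a minimal base R sketch of how an AUC and a standardized Gini coefficient can be computed without extra packages. The helpers `auc_sketch`, `gini_raw`, and `gini_sketch` are hypothetical names used only for illustration. They are not pmut's implementation, and the package's own functions may differ in arguments and in how ties are handled. The sketch assumes a binary 0/1 response and reads "standardized" as the Gini of the predictions divided by the Gini of a perfect ordering.

```r
# Hypothetical helper (not pmut's code): AUC from the rank-sum identity.
auc_sketch <- function(actual, pred) {
  stopifnot(length(actual) == length(pred), all(actual %in% c(0, 1)))
  n_pos <- sum(actual == 1)
  n_neg <- sum(actual == 0)
  stopifnot(n_pos > 0, n_neg > 0)
  r <- rank(pred)  # average ranks, so tied predictions count as half wins
  (sum(r[actual == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical helper: unnormalized Gini from the cumulative share of the
# response captured when rows are sorted by decreasing prediction.
gini_raw <- function(actual, pred) {
  n <- length(actual)
  ord <- order(pred, decreasing = TRUE)
  cum_share <- cumsum(actual[ord]) / sum(actual)
  sum(cum_share - seq_len(n) / n) / n
}

# Assumed meaning of "standardized": divide by the Gini of a perfect ordering.
gini_sketch <- function(actual, pred) {
  gini_raw(actual, pred) / gini_raw(actual, actual)
}

# Toy data, made up for illustration.
actual <- c(0, 0, 1, 1)
pred   <- c(0.1, 0.4, 0.35, 0.8)
auc_value  <- auc_sketch(actual, pred)
gini_value <- gini_sketch(actual, pred)
```

For a binary response with no tied predictions, the standardized Gini from this sketch equals 2 * AUC - 1, which is a handy sanity check when comparing the two metrics.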
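
The data quality checks listed above can be approximated in a few lines of base R. The sketch below is only an illustration of the ideas (missing percentage, duplicated columns, standardization). It does not call pmut, and the data frame and variable names are made up; the package functions may return results in a different shape.

```r
# Toy data frame; column names and values are made up for illustration.
df <- data.frame(
  a = c(1, 2, NA, 4),
  b = c("x", "", "y", NA),
  c = c(1, 2, NA, 4),
  stringsAsFactors = FALSE
)

# Percentage of missing values per column, counting "" as missing
# for character columns.
pct_missing <- sapply(df, function(col) {
  miss <- is.na(col)
  if (is.character(col)) miss <- miss | col %in% ""
  100 * mean(miss)
})

# Names of columns whose contents duplicate an earlier column.
dup_cols <- names(df)[duplicated(as.list(df))]

# Standardize numeric columns to mean 0 and standard deviation 1,
# ignoring NA values when computing the mean and standard deviation.
is_num <- sapply(df, is.numeric)
df_scaled <- df
df_scaled[is_num] <- lapply(df[is_num], function(col) {
  (col - mean(col, na.rm = TRUE)) / sd(col, na.rm = TRUE)
})
```

In this toy example, column c duplicates column a, so it is the only name reported in `dup_cols`, and the character column b is left unchanged by the scaling step.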