-
runArraySimulation()
now correctly searches in .GlobalEnv
for user-defined functions -
manageWarnings(..., suppress)
argument now allows for partial matching and other regex inputs -
SimCollect()
now automatically checks whether all expected files are present via SimCheck()
-
runArraySimulation()
gains an array2row
function to allow array jobs to index multiple conditions in the design
object (default uses one arrayID
per row, the original behaviour) -
runArraySimulation()
gains a parallel
flag and friends to use multi-core processing within array distributions. RNG numbers within the L'Ecuyer-CMRG algorithm are incremented using parallel::nextRNGSubStream()
within each defined core -
Better name checking when using the supported
list
inputs in runSimulation()
and runArraySimulation()
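The per-core substream idea mentioned above can be illustrated with base R alone. This is a minimal sketch, not the package's internal code: `ncores` and `core_seeds` are hypothetical names, and only `RNGkind()`, `set.seed()`, and `parallel::nextRNGSubStream()` are used.

```r
# Sketch: derive one L'Ecuyer-CMRG substream seed per core (base R only).
RNGkind("L'Ecuyer-CMRG")
set.seed(1234)

seed <- .Random.seed   # current generator state (integer vector of length 7)
ncores <- 3            # hypothetical number of cores within one array job

core_seeds <- vector("list", ncores)
for (i in seq_len(ncores)) {
  core_seeds[[i]] <- seed
  # Advance to the next substream so the cores do not share random numbers
  seed <- parallel::nextRNGSubStream(seed)
}
```

Each element of `core_seeds` could then be assigned to `.Random.seed` on a separate worker before it draws random numbers.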
-
SimCollect()
is now more efficient when combining a large number of files (e.g., greater than 5000 .rds
files stored via runArraySimulation()
). Gains a dir
argument for this purpose as well so that a full directory can be specified -
SimCheck()
repurposed to check for missing files for runArraySimulation()
-
Fix for
SimCollect()
when runArraySimulation()
result contains mixed warning outputs (reported by Michael Troung) -
manageMessages()
added in a similar spirit to manageWarnings()
, though to change messages into either errors or warnings (default behavior is the same as quiet()
) -
manageWarnings()
gains a suppress
argument to specify explicit warning strings that can be suppressed (i.e., are known to be innocuous). This provides better coding practice than the nuclear alternative base::suppressWarnings()
-
convertWarnings()
name changed to manageWarnings()
given its increased functionality. -
timeFormater()
function added to isolate the logic of the SBATCH time specification utility. Now used in several places of the package (e.g., runArraySimulation()
, PBA()
, SimSolve()
) -
Switch to camel casing format in all functions (e.g.,
add_missing() -> addMissing()
, gen_seeds() -> genSeeds()
, etc.). The exception is that aggregate_simulations()
has changed to SimCollect()
-
SimSolve()
gains a predCI.tol
argument to allow algorithm termination based on the advertised precision of the estimates
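The difference between suppressing specific known warnings and suppressing everything can be sketched in base R. The helper `suppress_known()` and the function `noisy()` below are hypothetical illustrations built on `withCallingHandlers()`; they are not the package's `manageWarnings()` implementation.

```r
# Sketch: muffle only warnings whose message matches a known pattern.
suppress_known <- function(expr, patterns) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      msg <- conditionMessage(w)
      known <- any(vapply(patterns, grepl, logical(1), x = msg))
      if (known) invokeRestart("muffleWarning")  # innocuous: drop it
      # otherwise fall through so outer handlers still see the warning
    }
  )
}

# Hypothetical function that raises one innocuous and one serious warning
noisy <- function() {
  warning("NaNs produced")
  warning("model failed to converge")
  "done"
}

# Record whatever survives the selective suppression
remaining <- character()
out <- withCallingHandlers(
  suppress_known(noisy(), patterns = "NaNs produced"),
  warning = function(w) {
    remaining <<- c(remaining, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
```

Unlike a blanket `suppressWarnings()` call, the unexpected convergence warning is still visible to the caller.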
-
runSimulation(..., control = list(store_Random.seeds))
logical added to store all .Random.seed
replication states. Generally not recommended due to the size of these stored elements in larger simulations, though it can be useful for debugging purposes where errors or warnings are not thrown -
runArraySimulation()
added to better support distributing arrays of jobs on HPC clusters. Works best when combined with the new expandDesign()
function (see next point) and the improved aggregate_simulations()
behaviour for more evenly distributing replication budgets across independent jobs. An associated vignette file has been added to the package to provide context and tutorial information for Slurm clusters -
expandDesign()
added to repeat the row conditions a number of times instead of just once. This is useful when exporting each condition independently to computing clusters, where each cluster contains only a fraction of the target replications
(see issue #33) -
getArrayID()
added to detect the array job ID (used with runArraySimulation(..., arrayID)
) -
aggregate_simulations()
now requires an explicit filename
argument to save the collapsed simulation information -
aggregate_simulations()
generalized to detect whether the Design
conditions have repeated row definitions and therefore should be conditionally averaged over (see the new expandDesign()
function) -
runArraySimulation()
and runSimulation()
's control
list gain new max_time
and max_RAM
arguments to evaluate simulation replications up until this time or RAM storage constraint is reached. If the target replications are not reached within the time limit, or the maximum RAM storage has been reached, then only the partial results will be returned (with a warning). This is mainly useful for HPC cluster jobs that require time and RAM constraints (e.g., 4 days per job; 4GB of RAM), where some jobs or simulation conditions may be more time/RAM consuming than others (requested by Mikko Rönkkö) -
Expose seed generation control per simulation condition via the function
gen_seeds()
, which also automatically constructs proper L'Ecuyer-CMRG seeds to be distributed across the runArraySimulation()
jobs
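The idea of repeating design rows so that each array job handles only part of the replication budget can be sketched in base R. This is an illustration of the concept only, not a call to `expandDesign()`; the names `design`, `repeats`, and `per_job` are hypothetical.

```r
# Sketch: repeat each design row so that each job runs a share of the budget.
design <- expand.grid(N = c(10, 20), sd = c(1, 2))   # 4 conditions

repeats <- 4                # hypothetical number of jobs per condition
replications_total <- 1000  # target replications per condition

# Each original row appears `repeats` times in consecutive rows
expanded <- design[rep(seq_len(nrow(design)), each = repeats), ]
rownames(expanded) <- NULL

# Each job evaluates only a fraction of the target replications
per_job <- replications_total / repeats
```

Results from the repeated rows would later be averaged within each unique condition, which is the situation the aggregation entries above refer to.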
-
SimSolve()
function added to perform (stochastic) root-solving to estimate specific criteria from simulation studies. Currently supports uni-root type problems for continuous or discrete variables via the probabilistic bisection algorithm with bolstering and interpolations (ProBABLI), Brent's method, and the classical bisection approach, the latter two of which can be problematic if the number of replications per iteration is too low -
SFA()
function added for fitting surrogate functional forms to simulation results and subsequently solving specific roots. Supports single root or multi-root applications, where by default the modelling is performed via generalized linear models -
runSimulation(..., store_results = TRUE)
is now the default to automatically store the results from Analyse()
in the returned simulation object. If RAM issues are suspected then save_results = TRUE
is still the recommended approach -
convertWarnings()
wrapper/post-hoc function added to convert specific warning messages to errors during simulations. Useful when only a subset of warnings are known to be problematic, while other warning messages (whether known or not) are treated as provisionally innocuous -
control
gains a print_RAM
logical argument to suppress printing the RAM when verbose = TRUE
. Disabling this can reduce execution time as the garbage collector (gc()
) calls are avoided, which are otherwise required to extract the current RAM state. Setting verbose = FALSE
will also automatically disable the RAM and gc()
calls and their overhead -
Attach()
now accepts matrix
input objects, and gains an RStudio_flags
argument to generate syntax that suppresses false positives about variables outside of the function's scope
-
Fix GitHub issue #26 related to extremely long warning/error messages
-
save_results_filename
added to runSimulation()
saving details to allow asynchronous (though unchecked) file storage to the same results directory (suggested by Jan Göttmann) -
ECR()
gains a complement
logical to indicate whether the parameter was outside the advertised interval (complement of coverage). Useful when CIs are used as formal hypothesis tests (e.g., bootstrap CI tests for power) -
runSimulation(..., extra_options)
changed to control
instead to control less commonly used flags
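The relationship between coverage and its complement can be shown with a small base R calculation. The interval values below are made up for illustration, and the code does not call `ECR()` itself.

```r
# Sketch: empirical coverage rate and its complement for a known parameter.
# Hypothetical confidence intervals (lower, upper) from four replications
CIs <- rbind(c(-0.5,  0.4),
             c( 0.1,  0.9),
             c(-0.2,  0.3),
             c(-1.0, -0.1))
parameter <- 0

covered <- CIs[, 1] <= parameter & parameter <= CIs[, 2]
coverage <- mean(covered)      # proportion of intervals containing 0
complement <- 1 - coverage     # proportion of intervals excluding 0
```

When the interval is used as a test of the hypothesis that the parameter equals 0, the complement is the empirical rejection rate.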
-
createDesign()
gains a fractional
argument to support design input structures from the FrF2
package for fractional factorial designs. Useful when detecting main/low-dimensional interaction effects across a large number of factor variables (suggested by Achim Zeileis). Example added to the wiki to demonstrate its use -
When a
summarise()
function is not supplied, the Design
input is now appended to the results
object when using SimExtract(res, what = 'results'
). Only supported when the results
object is a matrix
-like structure -
RAM
element added to resulting objects to indicate the amount of RAM used during each evaluation. This is particularly useful when using runSimulation(..., store_results = TRUE)
to inspect how much RAM is being consumed (otherwise, runSimulation(..., save_results = TRUE)
should be used if RAM storage is suspected to be an issue) -
resummarise()
and aggregate_simulations()
now better support the internally stored results terms when using store_results = TRUE
-
runSimulation(..., save = TRUE)
changed to save = replications > 10
to only write temporary files when the replications are larger (less hard-drive strain when initially testing a simulation experiment with very small replications) -
hexsticker added to make
SimDesign
part of the cool-kids club -
filename
and save_results_dirname
extractors added to SimExtract()
-
PBA()
function added for the probabilistic bisection algorithm, with associated print()
and plot()
S3 methods -
debug
gains a '-'
structure to allow debugging on specific rows of the design
input. For instance, if the simulation ran successfully until row 10, and unknown errors terminated the simulation, then using runSimulation(..., debug = 'error-10')
will initiate the debugger on the first instance for the 10th row conditions in the supplied design
object -
Progress reporting now includes abbreviated condition names and values in the console per condition
-
New function
nc()
to be used in situations where uniquely naming a vector or list according to the object names is useful (cf. x <- c(A,B,C)
, which typically returns an unnamed vector, to x <- nc(A,B,C)
, in which names(x)
is "A" "B" "C"
). This is mainly useful in the Analyse()
step where objects must be named uniquely in order to track the results in Summarise()
-
Added
Bradley1978()
for test of Bradley's (1978) robustness interval for empirical detection/coverage rate statistics -
runSimulation(..., Generate)
can now be specified as a named list of functions similar to Analyse()
, however only the first valid data generation function will be used as the constructor of the simulated data (see the new GenerateIf()
function to control the flow of these generation steps). This list input should really only be used when the population generation functions differ widely depending on the condition
under investigation -
SimFunctions()
adds a few new inputs for saving one or more files (save_structure
), defining one or more generate functions (nGenerate
), whether to include an extra file for user-defined objects and functions (extra_file
), and whether a basic knitr::spin()
header should be included when saving the files (spin_header
)
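The naming behaviour described for `nc()` can be mimicked in a few lines of base R. The function `nc_sketch()` below is a hypothetical stand-in written for illustration; it is not the package's implementation.

```r
# Sketch: combine objects into a vector named after the objects themselves.
nc_sketch <- function(...) {
  # Capture the unevaluated argument expressions and turn them into strings
  nms <- vapply(as.list(substitute(list(...)))[-1], deparse, character(1))
  out <- c(...)
  names(out) <- nms
  out
}

A <- 1
B <- 2
C <- 3

plain <- c(A, B, C)          # unnamed vector
named <- nc_sketch(A, B, C)  # elements carry the names A, B, and C
```

Named elements of this kind are what allow per-statistic results returned from an analysis step to be tracked by name later on.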
-
Support the
future
package by using runSimulation(..., parallel = 'future')
to replace the built-in parallel processing inputs. Using the future
package approach makes several arguments to runSimulation()
unnecessary as these can be specified when defining future::plan()
(e.g., cl
, MPI
, etc) -
When using the
future
approach the progressr
package is used. Allows the progress bar to be started via progressr::with_progress()
and modified by the front-end user (see ?runSimulation
for an example using progressr::handlers()
) -
extra_options
gains support for .options.mpi
to control the MPI properties documented in doMPI -
quiet()
now removes the sunk connection temp file to avoid storage issues (e.g., when distributing on Slurm) -
Attach()
gains an omit
argument to omit specific elements from being attached to the working environment (default still attaches all objects supplied)
-
Using a list definition for
Analyse
input now executes all functions by default regardless of errors thrown. Error messages and seeds remain captured in the output, but are labelled according to the number of errors that were observed (e.g., SimExtract(result, what = 'errors')
may return a column with "ERROR: 2 INDEPENDENT ERRORS THROWN: ..."
). Previous early termination default can be reset by passing extra_options = list(try_all_analyse = FALSE)
to runSimulation()
. Special thanks to Mark Lai for bringing this to my attention on Issue #20 -
Added
beep
argument to runSimulation()
to play a beep message via the beepr package
-
Added
RSE()
function to compute the relative behaviour of the average standard error to the standard deviation of a set of parameter estimates across the replications (RSE = E(par_SEs) / SD(par_ests)
) -
Bugfix for new list input for analysis functions when error raised (reported by Mark Lai)
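The ratio given for `RSE()` above, `E(par_SEs) / SD(par_ests)`, can be computed directly in base R. The numbers below are made up for illustration and the code does not call the package function.

```r
# Sketch: relative behaviour of the average standard error to the
# empirical standard deviation of the estimates across replications.
par_ests <- c(0.9, 1.1, 1.0, 1.2, 0.8)       # hypothetical parameter estimates
par_SEs  <- c(0.15, 0.16, 0.14, 0.17, 0.15)  # hypothetical standard errors

rse <- mean(par_SEs) / sd(par_ests)
```

A value near 1 suggests the estimated standard errors track the actual sampling variability; here the ratio is roughly 0.97.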
-
SimExtract()
gains a fuzzy
argument to allow fuzzy matching of error and warning messages. This helps collapse very similar error messages in the recorded tables, thereby improving how to discern any pattern in the errors/warnings (e.g., messages such as "ERROR: system is computationally singular: reciprocal condition number = 9.63735e-18" and "ERROR: system is computationally singular: reciprocal condition number = 6.74615e-17" are effectively the same, and so their number of recorded occurrences should be collapsed) -
Added
AnalyseIf()
function to allow specific analysis functions to be included explicitly. Useful when the defined analysis function is not compatible with a row-condition in the Design
object. Only relevant when the analyse
argument was defined as a named list of functions -
The
analyse
argument to runSimulation()
now accepts a named list
of functions rather than a single analysis function. This allows the user to separate the independent analyses into distinct functional blocks rather than having all analyses within the same function, and potentially allows for better modularity. The debug
argument now also accepts the names of these respective list elements to debug these function definitions quickly -
SimFunctions()
gains an nAnalyses
argument to specify how many analysis functions should be templated (default is 1, retaining the previous package defaults)
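One way to see why the two singular-matrix messages quoted above should be collapsed is to measure their edit distance. The sketch below uses `utils::adist()` and single-linkage clustering from base R; the tolerance of 10 edits and the third message are assumptions chosen for illustration, and this is not necessarily how the `fuzzy` argument is implemented.

```r
# Sketch: group near-identical error messages by edit distance.
msgs <- c(
  "ERROR: system is computationally singular: reciprocal condition number = 9.63735e-18",
  "ERROR: system is computationally singular: reciprocal condition number = 6.74615e-17",
  "ERROR: non-conformable arguments"   # illustrative unrelated message
)

# Pairwise Levenshtein distances between the messages
d <- utils::adist(msgs)

# Hypothetical tolerance: messages within 10 edits share a group
groups <- cutree(hclust(as.dist(d), method = "single"), h = 10)
collapsed <- table(groups)
```

The first two messages differ only in the reported condition number and land in the same group, while the unrelated message stays separate.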
-
Various performance improvements to reduce execution overhead (e.g.,
REPLICATION
ID now moved to an extra_option
as this was identified as a bottleneck) -
Meta-statistical functions now support a
fun(list, matrix)
input form to compute element-wise summaries that return a matrix
structure -
Summarise()
can now return list
arguments that can later be extracted via SimExtract(sim, what = 'summarise')
. Consequently, because list outputs are now viable the purrr
package has been added to the suggests
list
-
Prevent
aggregate_simulations()
from overwriting files and directories accidentally. As well, the auto-detection of suitable .rds files has been removed as explicitly stating the files/directories to be aggregated is less error prone -
Removed
plyr::rbind.fill
in favour of dplyr::bind_rows()
, which removed plyr
as a dependency -
Attach()
now accepts multiple list-like objects as inputs -
Added
SimCheck()
for checking the state of a long-running simulation via inspecting the main temp file -
sessioninfo
package used in place of the traditional sessionInfo()
-
Print number of cores when parallel processing is in use
-
A number of arguments from
runSimulation()
moved into the extra_options
list argument to simplify documentation -
Parallel processing now uses FORK instead of PSOCK when on Unix machines by default
-
More natural use of
RPushbullet
by changing the notification
input into one that accepts a character vector ("none", "condition", "complete") to send a pbPost()
call. Also more informative in the default messages sent
-
Added "Empirical Supremum Rejection Sampling" method to
rejectionSampling()
to find a better constant M (useful when there are local minimums in the f(x)/g(x)
ratio) -
rejectionSampling()
made more general, with additional examples provided in the help files -
Bootstrap CI estimates moved into
runSimulation()
, deprecating the less optimal SimBoot()
-
runSimulation(..., save=TRUE)
now defaults to always storing meta-information about the simulation state -
Added
renv
to the suggests list since it's useful to hard-store package versions used in simulations -
data.frame
objects largely replaced with tibble
data frames instead as they render better for larger simulations -
Support for
rbind()
and cbind()
on final simulation results to add additional condition/meta-summary information -
Use
createDesign()
instead of expand.grid()
in code, which provides more structured information and flexibility -
Added
SimExtract()
to extract important but silent information -
Added
stop_on_fatal
logical argument to more aggressively terminate the simulation rather than do things more gracefully
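The role of the constant M in the rejection sampling entries above can be illustrated with a plain base R sampler. This sketch uses an analytically known bound and does not call `rejectionSampling()` or reproduce the empirical supremum method; the target and proposal densities are chosen only for illustration.

```r
# Sketch: basic rejection sampling with a known bounding constant M.
set.seed(42)

f <- function(x) dbeta(x, 2, 2)   # target density on [0, 1]
g <- function(x) dunif(x)         # proposal density on [0, 1]

# f(x) / g(x) = 6 * x * (1 - x) is maximised at x = 0.5, giving M = 1.5
M <- 1.5

n <- 5000
candidates <- runif(n)                          # draws from the proposal
accept <- runif(n) <= f(candidates) / (M * g(candidates))
draws <- candidates[accept]                     # accepted draws follow f
```

The acceptance rate is approximately `1 / M`, so a tighter constant wastes fewer candidate draws, which is the motivation for estimating M more carefully.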