[help] Advice on highly heterogeneous workflows #1400
Help
Description

Our workflow is highly heterogeneous, ranging from single targets that use a lot of memory on their own to dynamically branched targets that are embarrassingly parallel. I have set up heterogeneous workers for the different parts of the pipeline, but they all seem to run at the same time. I was hoping this would separate the stages of the pipeline so that we don't run out of memory on our machine! As a quick fix I am running something like:

    targets::tar_make("highly_parallel_target") # with lots of branches
    targets::tar_make("expensive_target")
    targets::tar_make() # remainder of the targets

I was wondering if there is any advice on dealing with such a pipeline, and any best practices that should be followed.
Replies: 1 comment 1 reply
Successive tar_make() calls are totally fine. If you prefer a single one, then you might consider restructuring the dependency graph to force certain targets to run after one another. For example, instead of:

    library(targets)
    list(
      tar_target(name = expensive, command = f()),
      tar_target(name = cheap, command = g())
    )

you could consider something like:

    library(targets)
    list(
      tar_target(name = expensive, command = f()),
      tar_target(
        name = cheap,
        command = {
          expensive
          g()
        }
      )
    )

or a different variation of this that doesn't load expensive inside cheap:

    library(targets)
    list(
      tar_target(name = expensive, command = f()),
      tar_target(
        name = sentinel,
        command = {
          expensive
          NULL
        }
      ),
      tar_target(
        name = cheap,
        command = {
          sentinel
          g()
        }
      )
    )
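To check that a restructured graph really forces the ordering you want, one option is to inspect the edges that targets detects before running anything. The following is only a rough sketch applied to the situation in the question: the target names (index, branches, branches_done, expensive) and the helper functions simulate_one() and fit_big_model() are hypothetical placeholders, and it assumes the targets package is installed. It writes a throwaway pipeline to a temporary directory and lists the dependency edges with tar_network().

```r
library(targets)

# Work in a throwaway directory so no existing project is touched.
pipeline_dir <- tempfile("pipeline_")
dir.create(pipeline_dir)
old_wd <- setwd(pipeline_dir)

# Write a hypothetical _targets.R that uses the sentinel pattern.
tar_script({
  library(targets)
  # Placeholder functions standing in for the real workload.
  simulate_one <- function(i) i^2
  fit_big_model <- function() sum(seq_len(10))
  list(
    tar_target(name = index, command = seq_len(4)),
    # Embarrassingly parallel part: one dynamic branch per element of index.
    tar_target(
      name = branches,
      command = simulate_one(index),
      pattern = map(index)
    ),
    # Sentinel: depends on every branch but stores only NULL.
    tar_target(
      name = branches_done,
      command = {
        branches
        NULL
      }
    ),
    # Memory-hungry target: mentions the sentinel, so it waits for the
    # branches without having their values loaded into its own session.
    tar_target(
      name = expensive,
      command = {
        branches_done
        fit_big_model()
      }
    )
  )
}, ask = FALSE)

# Edges detected by static code analysis, restricted to targets.
edges <- tar_network(targets_only = TRUE)$edges
print(edges[, c("from", "to")])

setwd(old_wd)
```

If the edge from branches_done to expensive shows up, a single tar_make() should not start expensive until all branches have finished. Note that the sentinel itself still loads the branches in order to run, so this moves the memory cost rather than removing it.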