These changes invalidate certain targets in a pipeline and cause them to rerun on the next `tar_make()`.
- Exclude function signatures from `tar_repository_cas()` output strings to reduce the size of pipeline metadata (#1390).
- Exclude function signatures from `tar_format()` output strings to reduce the size of pipeline metadata (#1390).
`tar_make()` and `tar_outdated()` run much faster in this release. Extensive profiling was done on a real-world simulation pipeline with 66002 up-to-date targets. For `tar_make()` using all the default settings:
Machine | Before (seconds) | After (seconds) | Speedup |
---|---|---|---|
M2 Macbook | 413.16 | 35.538 | 11.62587 |
RHEL9 | 450.66 | 94.08 | 4.790 |
And for `tar_outdated()` using all the default settings:
Machine | Before (seconds) | After (seconds) | Speedup |
---|---|---|---|
M2 Macbook | 91.314 | 16.636 | 5.48894 |
RHEL9 | 167.809 | 37.395 | 4.487472 |
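The speedup column is the ratio of the "before" time to the "after" time. As a quick check, the following base R sketch recomputes it from the timings in the two tables above (the data frame and its column names are only illustrative):

```r
# Timings in seconds, copied from the two tables above.
timings <- data.frame(
  fn = c("tar_make", "tar_make", "tar_outdated", "tar_outdated"),
  machine = c("M2 Macbook", "RHEL9", "M2 Macbook", "RHEL9"),
  before = c(413.16, 450.66, 91.314, 167.809),
  after = c(35.538, 94.08, 16.636, 37.395)
)

# Speedup = time before the release divided by time after the release.
timings$speedup <- timings$before / timings$after
print(timings, digits = 4)
```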
To take advantage of these speed gains for an existing pipeline, you may have to run `tar_make()` to convert the time stamps and file sizes to a new format. This initial `tar_make()` is slow, but subsequent `tar_make()` calls should be much faster than before the upgrade.
- Speed up `tar_make()` and `tar_outdated()` by avoiding excessive buffering and disk writes for metadata and reporters when the pipeline is just skipping targets.
- Use a more lookup-efficient data structure for `tar_runtime$file_info` (#1398).
- Fall back on vector aggregation without names (#1401, @guglicap).
- Speed up representation of file sizes in metadata (#1408).
- Add a new `"forecast_interactive"` reporter to `tar_outdated()` to choose `"forecast"` for interactive sessions and `"silent"` for non-interactive ones.
- Add a new `seconds_reporter_outdated` argument to `tar_config_set()` with a default of 1 to control the time interval of the reporter of `tar_outdated()` and other passive algorithm functions.
- Remove target descriptions from the default labels of graph visualizations.
- Allow branch references to contain multi-element `path` vectors with cloud metadata (#1382, @n8layman).
- Avoid partial matches in internal code (#1384, @olivroy).
- Add error handling around calls to `ps::ps_disk_partitions()` and `ps::ps_fs_mount_point()`.
- Do not store `_targets/objects/` paths in metadata for CAS repositories (#1391).
- Ensure compatibility with `igraph` >= 2.1.2.
- Un-break workflows that use `format = "file_fast"` (#1339, @koefoeden).
- Fix deadlock in `error = "trim"` (#1340, @koefoeden).
- Remove tailored debugging message (#1341, @koefoeden).
- Store warnings while writing to storage (#1345, @Aariq).
- Allow `garbage_collection` to be a non-negative integer to control the frequency of garbage collection in a performant, convenient, unified way (#1351).
- Deprecate the `garbage_collection` argument of `tar_make()`, `tar_make_future()`, and `tar_make_clustermq()` (#1351).
- Instrument `target_run()`, `target_prepare()`, and `target_conclude()` using `autometric`.
- Avoid sending problematic error classes such as `"vctrs_error_subscript_oob"` to `rlang::abort()` (#1354, @Jiefei-Wang).
- Reduce memory consumption by ~23% in large pipelines by avoiding the accumulation of promise objects (#1352).
- Avoid `store_assert_format()` and `store_convert_object()` if `storage` is `"none"`.
- Add a `list()` method to `tar_repository_cas()` to make it easier and more efficient to specify custom CAS repositories (#1366).
- Improve speed and reduce memory consumption by avoiding deep copies of inner environments of target definition objects (#1368).
- Reduce memory consumption by storing buds and branches as lightweight references when `memory` is `"transient"` (#1364).
- Replace the `memory` class with the new `lookup` class.
- Implement `memory = "auto"` to select transient memory for dynamic branches and persistent memory for other targets (#1371).
- Omit whole pattern targets from branch subpipelines when possible. This should reduce memory consumption in some cases.
- Omit whole stem targets from branch subpipelines when `retrieval` is `"main"` and only a bud is actually used. The same cannot be done with branches because each branch may need to be (un)marshaled individually.
- Compress branches into references when `retrieval` is `"worker"` and the whole pattern is part of the subpipeline.
- Avoid duplicated branch aggregation: just send the branches over the network.
- Back-compatibly switch `format = "qs"` from `qs` to `qs2` (#1373).
- Add `tar_unblock_process()`.
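As an illustration of the memory and garbage collection items above, here is a minimal `_targets.R` sketch. The target names and the value `10` are hypothetical, and it assumes the integer form of `garbage_collection` is set through `tar_option_set()` in a version of `targets` that includes these changes.

```r
# _targets.R (illustrative sketch, not taken from the package documentation)
library(targets)

tar_option_set(
  # Transient memory for dynamic branches, persistent memory for other targets.
  memory = "auto",
  # Assumption: a positive integer asks for garbage collection at that
  # frequency, and 0 turns it off.
  garbage_collection = 10
)

# Hypothetical pipeline with one stem and one dynamic pattern.
pipeline <- list(
  tar_target(index, seq_len(4)),
  tar_target(squared, index^2, pattern = map(index))
)
pipeline
```

With `memory = "auto"`, the branches of `squared` would be handled as transient while `index` stays persistent, per the description above.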
- Add `"keepNA"` and `"keepInteger"` to `.deparseOpts()` (#1375). This may cause existing pipelines to rerun, but it makes add-ons like `tarchetypes::tar_map()` much easier to use.
- Wrap `tar_watch()` UI module in `bslib::page()` (#1302, @kwbyron-lilly).
- Remove `callr_function` from the `tar_make_as_job()` argument list.
- Ensure `storage = "worker"` is respected when the process of storing an object generates an error (#1304, @multimeric).
- Default to the `_targets.R` pattern in `tar_branches()` (#1306, @multimeric, @mattwarkentin).
- Remove superfluous functions and globals from metadata with `tar_prune()` (#1312, @benzipperer).
- Change the default `workspace_on_error` option to `TRUE` (#1310, @hadley).
- Enhance and organize the `error = "stop"` error message.
- Avoid saving a file in `_targets/objects` for `error = "null"`. Instead, switch to a special `"null"` storage format class if `error` is `"null"` and the target throws an error. This should allow users to more freely create new formats with `tar_format()` without worrying about how to handle `NULL` objects created by `error = "null"`.
- Implement `format = "auto"` (#1311, @hadley).
- Replace `pingr` dependency with `base::socketConnection()` for local URL utilities (#1317, #1318, @Adafede).
- Implement `tar_repository_cas()`, `tar_repository_cas_local()`, and `tar_repository_cas_local_gc()` for content-addressable storage (#1232, #1314, @noamross).
- Add `tar_format_get()` to make implementing CAS systems easier.
- Implement `error = "trim"` in `tar_target()` and `tar_option_set()` (#1310, #1311, @hadley).
- Use the file system type to decide whether to trust time stamps (#1315, @hadley, @gaborcsardi).
- Deprecate `format = "file_fast"` in favor of the above (#1315).
- Deprecate `trust_object_timestamps` in favor of the more unified `trust_timestamps` in `tar_option_set()` (#1315).
- Print storage size of each target in verbose reporters (#1337, @psychelzh).
- Combine help files of `tar_target()` and `tar_target_raw()`. Same with `tar_load()` and `tar_load_raw()`.
- Add a `substitute` argument to `tar_format()` to make it easier to write custom storage formats without metaprogramming.
- Use `bslib` in `tar_watch()`.
- Speed up `target_upstream_edges()` and `pipeline_upstream_edges()` by avoiding data frames until the last minute (17% speedup for certain kinds of large pipelines).
- Automatically set `as_job` to `FALSE` in `tar_make()` if `rstudioapi` and/or RStudio is not available.
- Use `secretbase::siphash13()` instead of `digest(algo = "xxhash64", serializationVersion = 3)` so hashes of in-memory objects no longer depend on serialization version 3 headers (#1244, @shikokuchuo). Unfortunately, pipelines built with earlier versions of `targets` will need to rerun.
- Ensure patterns marshal properly (#1266, #1264, njtierney/geotargets#52, @Aariq, @njtierney).
- Inform and prompt the user when the pipeline was built with an old version of `targets` and changes to the package will cause the current work to rerun (#1244). For the `tar_make*()` functions, `utils::menu()` prompts the user to give people a chance to downgrade if necessary.
- For type safety in the internal database class, read all columns as character vectors in `data.table::fread()`, then convert them to the correct types afterwards.
- Add a new `tar_resources_custom_format()` function which can pass environment variables to customize the behavior of custom `tar_format()` storage formats (#1263, #1232, @Aariq, @noamross).
- Only marshal dependencies if actually sending the target to a parallel worker.
- Modernize `extras` in `tar_renv()`.
- `tar_target()` gains a `description` argument for free-form text describing what the target is about (#1230, #1235, #1236, @tjmahr).
- `tar_visnetwork()`, `tar_glimpse()`, `tar_network()`, `tar_mermaid()`, and `tar_manifest()` now optionally show target descriptions (#1230, #1235, #1236, @tjmahr).
- `tar_described_as()` is a new wrapper around `tidyselect::any_of()` to select specific subsets of targets based on the description rather than the name (#1136, #1196, @noamross, @mattmoo).
- Fix the documentation of the `names` argument (nudge users toward `tidyselect` expressions).
- Make assertions on the pipeline process more robust (to check if two processes are trying to access the same data store).
- Avoid `arrow`-related CRAN check NOTE.
- `use_targets()` only writes the `_targets.R` script. The `run.sh` and `run.R` scripts are superseded by the `as_job` argument of `tar_make()`. Users not using the RStudio IDE can call `tar_make()` with `callr_function = callr::r_bg` to run the pipeline as a background process.
- `tar_make_clustermq()` and `tar_make_future()` are superseded in favor of `tar_make(use_crew = TRUE)`, so template files are no longer written for the former automatically.
Because of the changes below, upgrading to this version of `targets` will unavoidably invalidate previously built targets in existing pipelines. Your pipeline code should still work, but any targets you ran before will most likely need to rerun after the upgrade.
- In `tar_seed_create()`, use `secretbase::sha3(x = TARGET_NAME, bits = 32L, convert = NA)` to generate target seeds that are more resistant to overlapping RNG streams (#1139, @shikokuchuo). The previous approach used a less rigorous combination of `digest::digest(algo = "sha512")` and `digest::digest2int()`.
- Update the documentation of the `deployment` argument of `tar_target()` to reflect the advent of `crew` (#1208, @psychelzh).
- Unset `cli.num_colors` on exit in `tar_error()` and `tar_warning()` (#1210, @dipterix).
- Do not try to access `seconds_timeout` if the `crew` controller is actually a controller group (#1207, wlandau/crew.cluster#35, @stemangiola, @drejom).
- `tar_make()` gains an `as_job` argument to optionally run a `targets` pipeline as an RStudio job.
- Bump required `igraph` version to 2.0.0 because `igraph::get.edgelist()` was deprecated in favor of `igraph::as_edgelist()`.
- Do not dispatch targets to backlogged `crew` controllers (or controller groups) (#1220). Use the new `push_backlog()` and `pop_backlog()` `crew` methods to make this smooth.
- Make the debugger message more generic (#1223, @eliocamp).
- Throw an early and informative error from `tar_make()` if there is already a `targets` pipeline running on a local process on the same local data store. The local process is detected using the process ID and time stamp from `tar_process()` (with a 1.01-second tolerance for the time stamp).
- Remove `pkgload::load_all()` warning (#1218). Tried using `.__DEVTOOLS__` but it interferes with reverse dependencies.
- Add documentation and an assertion in `tar_target_raw()` to let users know that `iteration = "group"` is invalid for dynamic targets (ones with `pattern = map(...)` etc.; #1226, @bmfazio).
- Print "errored pipeline" when at least one target errors.
- Bump minimum `clustermq` version to 0.9.2.
- Repair the `tar_debug_instructions()` tips for when commands are long.
- Do not look for dependencies of primitive functions (#1200, @smwindecker, @joelnitta).
Because of the changes below, upgrading to this version of `targets` will unavoidably invalidate previously built targets in existing pipelines. Your pipeline code should still work, but any targets you ran before will most likely need to rerun after the upgrade.
- Use SHA512 during the creation of target-specific pseudo-random number generator seeds (#1139). This change decreases the risk of overlapping/correlated random number generator streams. See the "RNG overlap" section of the `tar_seed_create()` help file for details and justification. Unfortunately, this change will invalidate all currently built targets because the seeds will be different. To avoid rerunning your whole pipeline, set `cue = tar_cue(seed = FALSE)` in `tar_target()`.
- For cloud storage: instead of the hash of the local file, use the ETag for AWS S3 targets and the MD5 hash for GCP GCS targets (#1172). Sanitize with `targets:::digest_chr64()` in both cases before storing the result in the metadata.
- For a cloud target to be truly up to date, the hash in the metadata now needs to match the current object in the bucket, not the version recorded in the metadata (#1172). In other words, `targets` now tries to ensure that the up-to-date data objects in the cloud are in their newest versions. So if you roll back the metadata to an older version, you will still be able to access historical data versions with e.g. `tar_read()`, but the pipeline will no longer be up to date.
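A minimal sketch of the seed opt-out mentioned in the first item above. The target name `simulation` and its command are hypothetical; only `cue = tar_cue(seed = FALSE)` comes from these notes.

```r
library(targets)

# A cue that tells targets not to invalidate a target just because
# its seed changed.
seed_cue <- tar_cue(seed = FALSE)

# Hypothetical target definition using that cue.
simulation <- tar_target(
  simulation,
  rnorm(10),
  cue = seed_cue
)
```

Other cue rules are left at their defaults here, so the target would still rerun when its command or dependencies change.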
- Add a new exported function `tar_seed_create()` which creates target-specific pseudo-random number generator seeds.
- Add an "RNG overlap" section in the `tar_seed_create()` help file to justify and defend how `targets` and `tarchetypes` approach pseudo-random numbers.
- Add function `tar_seed_set()` which sets a seed and sets all the RNG algorithms to their defaults in the R installation of the user. Each target now uses the `tar_seed_set()` function to set its seed before running its R command (#1139).
- Deprecate `tar_seed()` in favor of the new `tar_seed_get()` function.
- For all cloud targets, check hashes in batched LIST requests instead of individual HEAD requests (#1172). This dramatically speeds up the process of checking if cloud targets are up to date.
- For AWS S3 targets, `tar_delete()`, `tar_destroy()`, and `tar_prune()` now use efficient batched calls to `delete_objects()` instead of costly individual calls to `delete_object()` (#1171).
- Add a new `verbose` argument to `tar_delete()`, `tar_destroy()`, and `tar_prune()`.
- Add a new `batch_size` argument to `tar_delete()`, `tar_destroy()`, and `tar_prune()`.
- Add new arguments `page_size` and `verbose` to `tar_resources_aws()` (#1172).
- Add a new `tar_unversion()` function to remove version IDs from the metadata of cloud targets. This makes it easier to interact with just the current version of each target, as opposed to the version ID recorded in the local metadata.
- Migrate to the changes in `clustermq` 0.9.0 (@mschubert).
- In progress statuses, change "started" to "dispatched" and change "built" to "completed" (#1192).
- Deprecate `tar_started()` in favor of `tar_dispatched()` (#1192).
- Deprecate `tar_built()` in favor of `tar_completed()` (#1192).
- Console messages from reporters say "dispatched" and "completed" instead of "started" and "built" (#1192).
- The `crew` scheduling algorithm no longer waits on saturated controllers, and targets that are ready are greedily dispatched to `crew` even if all workers are busy (#1182, #1192). To appropriately set expectations for users, reporters print "dispatched (pending)" instead of "dispatched" if the task load is backlogged at the moment.
- In the `crew` scheduling algorithm, waiting for tasks is now a truly event-driven process and consumes 5-10x less CPU resources (#1183). Only the auto-scaling of workers uses polling (with an inexpensive default polling interval of 0.5 seconds, configurable through `seconds_interval` in the controller).
- Simplify stored target tracebacks.
- Print the traceback on error.
- Try to fix function help files for CRAN.
- Add `tar_config_projects()` and `tar_config_yaml()` (#1153, @psychelzh).
- Apply error modes to `builder_wait_correct_hash()` in `target_conclude.tar_builder()` (#1154, @gadenbuie).
- Remove duplicated error message from `builder_error_null()`.
- Allow `tar_meta_upload()` and `tar_meta_download()` to avoid errors if one or more metadata files do not exist. Add a new argument `strict` to control error behavior.
- Add new arguments `meta`, `progress`, `process`, and `crew` to control individual metadata files in `tar_meta_upload()`, `tar_meta_download()`, `tar_meta_sync()`, and `tar_meta_delete()`.
- Avoid newly deprecated arguments and functions in `crew` 0.5.0.9003 (https://github.com/wlandau/crew/issues/131).
- Allow `tar_read()` etc. inside a pipeline whenever it uses a different data store (#1158, @MilesMcBain).
- Set `seed = FALSE` in `future::future()` (#1166, @svraka).
- Add a new `physics` argument to `tar_visnetwork()` and `tar_glimpse()` (#925, @Bdblodgett-usgs).
Because of these changes, upgrading to this version of `targets` will unavoidably invalidate previously built targets in existing pipelines. Your pipeline code should still work, but any targets you ran before will most likely need to rerun after the upgrade.
- In the `hash_deps()` method of the metadata class, exclude symbols which are not actually dependencies, rather than just giving them empty strings. This change decouples the dependency hash from the hash of the target's command (#1108).
- Continuously upload metadata files to the cloud during `tar_make()`, `tar_make_clustermq()`, and `tar_make_future()` (#1109). Upload them to the repository specified in the `repository_meta` `tar_option_set()` option, and use the bucket and prefix set in the `resources` `tar_option_set()` option. `repository_meta` defaults to the existing `repository` `tar_option_set()` option.
- Add new functions `tar_meta_download()`, `tar_meta_upload()`, `tar_meta_sync()`, and `tar_meta_delete()` to directly manage cloud metadata outside the pipeline (#1109).
- Fix solution of #1103 so the copy fallback actually runs (@jds485, #1102, #1103).
- Switch back to `tempdir()` for #1103.
- Move `path_scratch_dir_network()` to `file.path(tempdir(), "targets")` and make sure `tar_destroy("all")` and `tar_destroy("cloud")` delete it.
- Display `tar_mermaid()` subgraphs with transparent fills and black borders.
- Allow `database$get_data()` to work with list columns.
- Disallow functions that access the local data store (including metadata) from inside a target while the pipeline is running (#1055, #1063). The only exception to this is local file targets such as `tarchetypes` literate programming target factories like `tar_render()` and `tar_quarto()`.
- In the `hash_deps()` method of the metadata class, use a new custom `sort_chr()` function which temporarily sets the `LC_COLLATE` locale to `"C"` for sorting. This ensures lexicographic comparisons are consistent across platforms (#1108).
- In `tar_source()`, use the `file` argument and `keep.source = TRUE` to help with interactive debugging (#1120).
- Deprecate `seconds_interval` in `tar_config_get()`, `tar_make()`, `tar_make_clustermq()`, and `tar_make_future()`. Replace it with `seconds_meta` (to control how often metadata gets saved) and `seconds_reporter` (to control how often to print messages to the R console) (#1119).
- Respect `seconds_meta` and `seconds_reporter` for writing metadata and console messages even for currently building targets (#1055).
- Retry all cloud REST API calls with HTTP error codes (429, 500-599) with the exponential backoff algorithm from `googleAuthR` (#1112).
- For `format = "url"`, only retry on the HTTP error codes above.
- Make cloud temp file instances unique in order to avoid file conflicts with the same target.
- Un-deprecate `seconds_interval` and `seconds_timeout` from `tar_resources_url()`, and implement `max_tries` arguments in `tar_resources_aws()` and `tar_resources_gcp()` (#1127).
- Use `file` and `keep.source` in `parse()` in `callr` utils and target Markdown.
- Automatically convert `"file_fast"` format to `"file"` format for cloud targets.
- In `tar_prune()` and `tar_delete()`, do not try to delete pattern targets which have no cloud storage.
- Add new arguments `seconds_timeout`, `close_connection`, `s3_force_path_style` to `tar_resources_aws()` to support the analogous arguments in `paws.storage::s3()` (#1134, @snowpong).
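The locale-independent sorting described for `hash_deps()` above can be sketched in base R as follows. The helper name `sort_c_locale()` is hypothetical and this is not the package's internal `sort_chr()`; it only illustrates temporarily switching `LC_COLLATE` to `"C"` and restoring it afterwards.

```r
# Hypothetical helper: sort a character vector under the "C" collation
# locale so the order does not depend on the platform's locale settings.
sort_c_locale <- function(x) {
  old <- Sys.getlocale("LC_COLLATE")
  # Restore the previous collation locale when the function exits.
  on.exit(Sys.setlocale("LC_COLLATE", old), add = TRUE)
  Sys.setlocale("LC_COLLATE", "C")
  sort(x)
}

# In the "C" locale, strings are compared byte by byte, so uppercase
# ASCII letters sort before lowercase ones.
sorted <- sort_c_locale(c("b", "A", "a", "B"))
print(sorted)
```

Because dependency hashes are computed from sorted names, a stable sort order keeps the hash the same across operating systems and locales.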
- Fix a documentation issue for CRAN.
- Add `tar_prune_list()` (#1090, @mglev1n).
- Wrap `file.rename()` in `tryCatch()` and fall back on a copy-then-remove workaround (@jds485, #1102, #1103).
- Stage temporary cloud upload/download files in `tools::R_user_dir(package = "targets", which = "cache")` instead of `tempdir()`. `tar_destroy(destroy = "cloud")` and `tar_destroy(destroy = "all")` remove any leftover files from failed uploads/downloads (@jds485, #1102, #1103).
- Use `paws.storage` instead of all of `paws`.
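The rename-with-fallback idea above can be sketched in base R. The helper `move_file()` is hypothetical and is not the package's internal code; it assumes a failed `file.rename()` shows up either as an error or as a `FALSE` return value.

```r
# Hypothetical helper: try a fast rename first, then fall back on
# copy-then-remove (for example when source and destination are on
# different file systems).
move_file <- function(from, to) {
  renamed <- tryCatch(
    suppressWarnings(file.rename(from, to)),
    error = function(condition) FALSE
  )
  if (!isTRUE(renamed)) {
    copied <- file.copy(from, to, overwrite = TRUE)
    if (!copied) {
      stop("could not move ", from, " to ", to)
    }
    unlink(from)
  }
  invisible(to)
}

# Usage with temporary files.
from <- tempfile()
to <- tempfile()
writeLines("x", from)
move_file(from, to)
```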
- Do not assume S3 classes when validating `crew` controllers.
- Suggest a crew controller in the `_targets.R` file from `use_targets()`.
- Make `tar_crew()` compatible with `crew` >= 0.3.0.
- Rename argument `terminate` to `terminate_controller` in `tar_make()`.
- Add argument `use_crew` in `tar_make()` and add an option in `tar_config_set()` to make it configurable.
- Write progress data and metadata in `target_prepare()`.
- Allow users to set the default `label` and `level_separation` arguments through `tar_config_set()` (#1085, @Moohan).
- Decide on `nanonext` usage in `time_seconds_local()` at runtime and not installation time. That way, if `nanonext` is removed after `targets` is installed, functions in `targets` still work. Fixes the CRAN issues seen in `tarchetypes`, `jagstargets`, and `gittargets`.
- Remove `crew`-related startup messages.
- Pre-compute `cli` colors and bullets to improve performance in RStudio.
- Use `packageStartupMessage()` for package startup messages.
- Send targets to the appropriate controller in a controller group when `crew` is used.
- Call `gc()` more appropriately when `garbage_collection` is `TRUE` in `tar_target()`.
- Add `garbage_collection` arguments to `tar_make()`, `tar_make_clustermq()`, and `tar_make_future()` to add optional garbage collection before targets are sent to workers. This is different and independent from the `garbage_collection` argument of `tar_target()`. In high-performance computing scenarios, the former controls what happens on the main controlling process, whereas the latter controls what happens on the worker.
- Add `garbage_collection` and `seconds_interval` arguments to `tar_make()`, `tar_make_clustermq()`, `tar_make_future()`, and `tar_config_set()`.
- Downsize the `tar_runtime` object.
- Remove the 100 Kb file size cutoff for determining whether to trust the file timestamp or recompute the hash when checking if a file is up to date (#1062). Instate the `"file_fast"` format and the `trust_object_timestamps` option in `tar_option_set()` as safer alternatives.
- Consolidate store constructors.
- Allow `crew` controller groups (#1065, @mglev1n).
- Expose more exponential backoff configuration parameters through `tar_backoff()`. The `backoff` argument of `tar_option_set()` now accepts output from `tar_backoff()`, and supplying a numeric is deprecated.
- Fix the exponential backoff rules in the `crew` scheduling algorithm.
- Implement `tar_resources_network()` to configure retries and timeouts for internal HTTP/HTTPS requests in specialized targets with `format = "url"`, `repository = "aws"`, and `repository = "gcp"`. Also applies to syncing target files across network file systems in the case of `storage = "worker"` or `format = "file"`, which previously had a hard-coded `seconds_interval = 0.1` and `seconds_timeout = 60`.
- Deprecate `seconds_interval` and `seconds_timeout` in `tar_resources_url()` in favor of the new equivalent arguments of `tar_resources_network()`.
- Safely withhold a target from its `crew` controller when the controller is saturated (#1074, @mglev1n).
- Use exponential backoff when appending a target back to the queue in the case of a saturated `crew` controller.
- Use native retries in `paws.common` (@DyfanJones).
- Cache info about all of `_targets/objects/` in `tar_callr_inner_try()` and update the cache as targets are saved to `_targets/objects/` to avoid the overhead of repeated calls to `file.exists()` and `file.info()` (#1056).
- Trust the timestamps by default when checking whether files in `_targets/objects/` are up to date (#1062). `tar_option_set(trust_object_timestamps = FALSE)` ignores the timestamps and recomputes the hashes.
- Write to `_targets/meta/meta` and `_targets/meta/progress` in timed batches instead of line by line (#1055).
- Reporters now print progress messages in timed batches instead of line by line (#1055).
- The summary and forecast reporters are much faster because they avoid going through data frames.
- Avoid `tempfile()` when working with the scratch directory.
- Use `nanonext::mclock()` instead of `proc.time()` when there is no risk of forked processes.
- Replace `withr` with slightly faster/leaner base R alternatives.
- Efficiently catch changes to the working directory instead of overburdening the pipeline with calls to `setwd()` (#1057).
- Invoke `tar_options` methods in the internals instead of `tar_option_get()`.
- Avoid `gsub()` in `store_init()`.
- Avoid repeated calls to `meta$get_record()` in `builder_should_run()`.
- Mock the store object when creating a record from a metadata row.
- Avoid `cli::col_none()` to reduce the number of ANSI characters printed to the R console.
`targets` is moving to version 1.0.0 because it is significantly more mature than previous versions. Specifically,
- `tar_make()` now integrates with `crew`, which will significantly improve the way `targets` does high-performance computing going forward.
- All other functionality in `targets` has stabilized. There is still room for smaller new features, but none as large as `crew` integration, none that will fundamentally change how the package operates.
- Support distributed computing through the `crew` package in `tar_make()` (#753). `crew` itself is still in its early stages and currently lacks the launcher plugins to match the `clustermq` and `future` backends, but long-term, `crew` will be the predominant high-performance computing backend.
- Add a new `store_copy_object()` to the store class to enable `"fst_dt"` and other formats to make deep copies when needed (#1041, @MilesMcBain).
- Add a new `copy` argument to allow `tar_format()` formats to set the `store_copy_object()` method (#1041, @MilesMcBain).
- Shorten the output string returned by `tar_format()` when default methods are used.
- Add a `change_directory` argument to `tar_source()` (#1040, @dipterix).
- In `format = "url"` targets, implement retries and timeouts when connecting to URLs. The default timeout is 10 seconds, and the default retry interval is 1 second. Both are configurable via `tar_resources_url()` (#1048).
- Use `parallelly::freePort()` in `tar_random_port()`.
- Rename a target and a function in the `tar_script()` example pipeline (#1033, @b-rodrigues).
- Edit the description.
- Handle encoding errors while trying to process error and warning messages (#1019, @adrian-quintario).
- Fix S3 generic/method consistency.
- Forward user-level custom error conditions to the top of the pipeline (#997, @alexverse).
- Link to the help page of the manual.
- Fix the command inserted for debug mode (#975).
- Set empty chunk options to ensure Target Markdown compatibility with the special "setup" chunk (#973, @KaiAragaki).
- Only store the first 50 warnings in the metadata, and cap the text of the warning messages at 2048 characters (#983, @thejokenott).
- Enhance the `tar_destroy()` help file (#988, @Sage0614).
- Implement `destroy = "user"` in `tar_destroy()`.
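The warning limits noted above (first 50 warnings, 2048 characters) can be illustrated with a small base R sketch. The helper `cap_warnings()` is hypothetical, and it assumes the character cap applies to each message separately, which these notes do not state explicitly.

```r
# Hypothetical helper: keep at most `max_warnings` messages and truncate
# each one to `max_chars` characters (an assumed interpretation of the cap).
cap_warnings <- function(messages, max_warnings = 50L, max_chars = 2048L) {
  kept <- utils::head(messages, max_warnings)
  substr(kept, 1L, max_chars)
}

# Usage: 60 very long warning messages are reduced to 50 truncated ones.
long_messages <- rep(strrep("w", 3000L), 60L)
capped <- cap_warnings(long_messages)
length(capped)
```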
- Move `#!/bin/sh` line to the top of SLURM `clustermq` template file (#944, #955, @GiuseppeTT).
- Add new function `tar_path_script()`.
- Rename `tar_store()` to `tar_path_store()` with deprecation.
- Rename `tar_path()` to `tar_path_target()` with deprecation.
- Add new function `tar_path_script_support()`.
- Make Target Markdown target scripts dynamically locate their support scripts so the appropriate scripts can be found even when they are generated from one directory and sourced from another (#953, #957, @TylerGrantSmith).
- Allow user-side control of the seeds at the pipeline level. `tar_option_set()` now supports a `seed` argument, and target-specific seeds are determined by `tar_option_get("seed")` and the target name. `tar_option_set(seed = NA)` disables seed-setting behavior but forcibly invalidates all the affected targets except when `seed` is `FALSE` in the target's `tar_cue()` (#882, @sworland-thyme, @joelnitta).
- Implement a `seed` argument in `tar_cue()` to control whether targets update in response to changing or `NA` seeds (#882, @sworland-thyme, @joelnitta).
- Reduce the number of per-target AWS/GCP storage API calls. Previously there were 3 API calls per target, including 2 HEAD requests. Now there is just 1 for a typical target (unless dependencies have to be downloaded). Relies on S3 strong read-after-write consistency (#958).
- Update the `tar_github_actions()` workflow file to use `@v2` (#960, @kulinar).
- Print helpful hints while debugging a target interactively (#961).
- Only attempt to debug a target when `callr_function` is `NULL` (#961).
- Make formats `"feather"`, `"parquet"`, `"file"`, and `"url"` work with `error = "null"` (#969).
- Declare formats `"keras"` and `"torch"` superseded by `tar_format()`. Documented in the `tar_target()` help file.
- Declare formats `"keras"` and `"torch"` incompatible with `error = "null"`. Documented in the `tar_target()` help file and in a warning thrown by `tar_target()` via `tar_target_raw()`.
- Add a `convert` argument to `tar_format()` to allow custom `store_convert_object()` methods (#970).
- Use `any_of()` instead of `all_of()` in tests to ensure compatibility with `tidyselect` 1.1.2.9000 (#928, @hadley).
- Make the `run.R` from `use_targets()` executable (#929, @petrbouchal).
- Add `#!/usr/bin/env Rscript` to the top of `run.R` from `use_targets()` (#929, @petrbouchal).
- Implement custom alternative to `skip_on_cran()` to avoid r-lib/testthat#1470 (comment).
- Skip more tests on CRAN.
- Print "no targets found" when there are no targets in the pipeline to check or build, or if the `names` argument of `tar_make()` does not identify any such targets in the pipeline (#923, @llrs).
- Ignore `.packageName`, `.__NAMESPACE__.`, and `.__S3MethodsTable__.` when importing objects from packages with the `imports` option of `tar_option_set()`.
- Import datasets from packages in the `imports` option of `tar_option_set()` (#926, @joelnitta).
- Print target-specific elapsed runtimes in the verbose and timestamp reporters.
- Improve error messages in functions like `tar_read()` and `tar_load()` when the data store is missing.
- Do not incorrectly reference feather resources for parquet storage.
- Simplify and improve error handling.
- In the `command` column of `tar_manifest()` output, separate lines with `"\\n"` instead of `"\n"` so the text output is straightforward to work with.
- Add a `drop_missing` argument to `tar_manifest()` to hide/show columns with all `NA` values.
- Do not set Parquet version.
- Fix reverse dependency checks.
- Do not bootstrap the junction of a stem unless the target is branched over (#858, @dipterix).
- For non-"file" AWS targets, immediately delete the scratch file after the target is uploaded (#889, @stuvet).
- Allow extra arguments to `paws` functions via `...` in `tar_resources_aws()` (#855, @michkam89).
- Add `tar_source()` to conveniently source R scripts (e.g. in `_targets.R`).
- Color ordinary `targets` messages the default theme color, and color warnings and errors red (#856, @gorkang).
- Automatically supply job names in the scripts generated by `use_targets()`.
- Inherit resources one-by-one in nested fashion from `tar_option_get("resources")` (#892). See the revised `"Resources"` section of the `tar_resources()` help file for details.
- Add arguments `legend` and `color` to further configure `tar_mermaid()` (#848, @noamross).
- For HPC schedulers like SLURM and SGE, `use_targets()` now creates a `job.sh` script to run the pipeline as a cluster job (#839).
- Use `lapply()` to source scripts in `use_targets()`. Avoids defining a global variable for the file.
- Recursively find scripts to source in the `use_targets()` `_targets.R` file.
- Refactor error printing.
- Fix `tar_mermaid()` graph ordering.
- Hash the node names and quote the label names of `tar_mermaid()` graphs to avoid JavaScript keywords.
- Remove superfluous line breaks in the node labels of graph visuals.
- Fix metadata migration to version >= 0.10.0 (#812, @tjmahr).
- `data.table::fread()` with encoding equal to `getOption("encoding")` if available (#814, @svraka). Only works with UTF-8 and latin1 because that is what `data.table` supports.
- Force add files in GitHub Actions workflow job (#815, @tarensanders).
- `use_targets()` now writes a `_targets.R` file tailored to the project in the current working directory (#639, @noamross).
- Move the old `use_targets()` to `use_targets_rmd()`.
- Load packages when loading data for downstream targets in the pipeline (#713).
- Handle edge case when `getOption("OutDec")` is not `"."` to prevent time stamps from being corrupted (#433, @jarauh).
- Add helper function `tar_load_everything()` to quickly load all targets (#823, @malcolmbarrett).
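To see why a non-default `OutDec` can corrupt numeric time stamps that pass through text, consider the base R sketch below. It only illustrates the failure mode and is not the fix used inside `targets`; the variable names are hypothetical.

```r
# Illustration only: a non-default OutDec changes how numbers are formatted.
old <- options(OutDec = ",")

stamp <- 1234.5
written <- format(stamp)                    # follows OutDec
safe <- format(stamp, decimal.mark = ".")   # explicit decimal mark

# The locale-style string no longer parses back into a number.
parsed_written <- suppressWarnings(as.numeric(written))
parsed_safe <- as.numeric(safe)

options(old)  # restore the previous setting
```

Passing an explicit `decimal.mark` is one way to keep serialized numbers independent of the session option.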
- Print out the relevant target names if targets have conflicting names.
- Catch all the target warnings instead of just reporting the last one.
- Allow 200 group URL status codes instead of just 200 (#797, @petrbouchal).
- Add Google Cloud Storage via `tar_target(..., repository = "gcp")` (#720, @markedmondson1234). Special thanks to @markedmondson1234 for the cloud storage utilities in `R/utils_gcp.R`.
- `mermaid.js` static graphs with `tar_mermaid()` (#775, @yonicd).
- Implement `tar_target(..., error = "null")` to allow errored targets to return `NULL` and continue (#807, @zoews). Errors are still registered, those targets are not up to date, and downstream targets have an easier time continuing on.
- Implement `tar_assert_finite()`.
- `tar_destroy()`, `tar_delete()`, and `tar_prune()` now attempt to delete cloud data for the appropriate targets (#799). In addition, `tar_exist_objects()` and `tar_objects()` now report about target data in the cloud when applicable. Add a new `cloud` argument to each function to optionally suppress this new behavior.
- Add a `zoom_speed` argument to `tar_visnetwork()` and `tar_glimpse()` (#749, @dipterix).
- Report the total runtime of the pipeline in the `"verbose"`, `"verbose_positives"`, `"timestamp"`, and `"timestamp_positives"` reporters.
- Allow target name character strings to have attributes (#758, @psanker).
- Sort metadata rows when the pipeline finishes so that version-controlling the metadata is easier (#766, @jameelalsalam).
- Deprecate the `"aws_*"` storage format values in favor of a new `repository` argument (#803). In other words, `tar_target(..., format = "aws_qs")` is now `tar_target(..., format = "qs", repository = "aws")`. And internally, storage classes with multiple inheritance are created dynamically as opposed to having hard-coded source files. All this paves the way to add new cloud storage platforms without combinatorial chaos.
- Add class `"tar_nonexportable"` to `format = "aws_keras"` and `format = "aws_torch"` stores.
- Export S3 methods of generic `tar_make_interactive_load_target()`.
- Allow entirely custom storage formats through `tar_target(format = tar_format(...))` (#736).
- Add a new function `tar_call()` to return the `targets` function currently running (from `_targets.R` or a target).
- Add a new function `tar_active()` to tell whether the pipeline is currently running. Detects if it is called from `tar_make()` or similar function.
- Add `Sys.getenv("TAR_PROJECT")` to the output of `tar_envvars()`.
- Set the `store` field of `tar_runtime` prior to sourcing `_targets.R` so `tar_store()` works in target scripts.
- Explicitly export all the environment variables from `tar_envvars()` to targets run on parallel workers.
- Allow `format = "file"` targets to return `character(0)` (#728, @programLyrique).
- Automatically remove non-targets from the target list and improve target list error messages (#731, @billdenney).
- Link to resources on deploying to RStudio Connect (#745, @ian-flores).
- Mask pointers in function dependencies (#721, @matthiaskaeding).
- Track the version ID of AWS S3-backed targets if the bucket is version-enabled (#711). If you put your targets in AWS and the metadata and code under version control, you can `git checkout` a different branch of your code and all your targets will stay up to date.
- Refactor the AWS path format internally. It now consists of arbitrarily extensible key-value pairs so more AWS S3 functionality may be added more seamlessly going forward (#711).
- Switch the AWS S3 backend to `paws` (#711).
- Add a `region` argument to `tar_resources_aws()` to allow the user to explicitly declare a region for each AWS S3 bucket (@caewok, #681). Different buckets can now have different regions. This feature required modifying the metadata path for AWS storage formats. Before, the first element of the path was simply the bucket name. Now, it is internally formatted like `"bucket=BUCKET:region=REGION"`, where `BUCKET` is the user-supplied bucket name and `REGION` is the user-supplied region name. The new `targets` is back-compatible with the old metadata format, but if you run the pipeline with `targets` >= 0.8.1.9000 and then downgrade to `targets` <= 0.8.1, any AWS targets will break.
- Add new reporters `"timestamp_positives"` and `"verbose_positives"` that omit messages for skipped targets (@psanker, #683).
- Implement `tar_assert_file()`.
- Implement `tar_reprex()` for creating easier reproducible examples of pipelines.
- Implement `tar_store()` to get the path to the store of the currently running pipeline (#714, @MilesMcBain).
- Automatically write a `_targets/user/` folder to encourage `gittargets` users to put custom files there for data version control.
- Make sure `tar_path()` uses the current store path of the currently running pipeline instead of `tar_config_get("store")` (#714, @MilesMcBain).
- Refactor the automatic `.gitignore` file inside the data store to allow the metadata to be committed to version control more easily (#685, #711).
- Document target name requirements in `tar_target()` and `tar_target_raw()` (@tjmahr, #679).
- Catch and relay the error if a target cannot be checked in `target_should_run.tar_builder()`. These kinds of errors sometimes come up with AWS storage.
- Fix the documentation of the reporters.
- Only write `_targets/.gitignore` for new data stores so the user can delete the `.gitignore` file without it mysteriously reappearing (#685).
- Add arguments `strict` and `silent` to allow `tar_load()` and `tar_load_raw()` to bypass targets that cannot be loaded.
- Improve `tidyselect` docs in `tar_make()` (#640, @dewoller).
- Use namespaced call to `tar_dir()` in `tar_test()` (#642, @billdenney).
- Improve `tar_assert_target_list()` error message (@kkami1115, #654).
- Throw an informative error if a target name starts with a dot (@dipterix, #662).
- Improve help files of `tar_destroy()` and related cleanup functions (@billdenney, #675).
- Hash the correct files in `tar_target(target_name, ..., format = "aws_file")`. Previously, `_targets/objects/target_name` was also hashed if it existed.
- Implement a new `tar_config_unset()` function to delete one or more configuration settings from the YAML configuration file.
- Implement the `TAR_CONFIG` environment variable to set the default file path of the YAML configuration file with project settings (#622, @yyzeng, @atusy, @nsheff, @wdkrnls). If `TAR_CONFIG` is not set, the file path is still `_targets.yaml`.
- Restructure the YAML configuration file format to handle configuration information for multiple projects (using the `config` package) and support the `TAR_PROJECT` environment variable to select the current active project for a given R session. The old single-project format is gracefully deprecated (#622, @yyzeng, @atusy, @nsheff, @wdkrnls).
- Implement `retrieval = "none"` and `storage = "none"` to anticipate loading/saving targets from other languages, e.g. Julia (@MilesMcBain).
- Add a new `tar_definition()` function to get the target definition object of the current target while that target is running in a pipeline.
- If called inside an AWS target, `tar_path()` now returns the path to the staging file instead of `_targets/objects/target_name`. This ensures you can still write to `tar_path()` in `storage = "none"` targets and the package will automatically hash the right file and upload it to the cloud. (This behavior does not apply to formats `"file"` and `"aws_file"`, where it is never necessary to set `storage = "none"`.)
- Use `eval(parse(text = ...), envir = tar_option_get("envir"))` instead of `source()` in the `_targets.R` file for Target Markdown.
- Allow feather and parquet formats to accept objects of class `RecordBatch` and `Table` (@MilesMcBain).
- Let `knitr` load the Target Markdown engine (#469, @nviets, @yihui). Minimum `knitr` version is now `1.34`.
- In the `tar_resources_future()` help file, encourage the use of `plan` to specify resources.
- Ensure `error = "continue"` does not cause errored targets to have `NULL` values.
- Relay output and messages in Target Markdown interactive mode (using the R/default `knitr` engine).
- Expose the `poll_connection`, `stdout`, and `stderr` arguments of `callr::r_bg()` in `tar_watch()` (@mpadge).
- Add new helper functions to list targets in each progress category: `tar_started()`, `tar_skipped()`, `tar_built()`, `tar_canceled()`, and `tar_errored()`.
- Add new helper functions `tar_interactive()`, `tar_noninteractive()`, and `tar_toggle()` to differentially suppress code in non-interactive and interactive mode in Target Markdown (#607, @33Vito).
- Handle `future` errors within targets (#570, @stuvet).
- Handle storage errors within targets (#571, @stuvet).
- In Target Markdown in non-interactive mode, suppress messages if the `message` `knitr` chunk option is `FALSE` (#574, @jmbuhr).
- In Target Markdown, if `tar_interactive` is not set, choose interactive vs non-interactive mode based on `isTRUE(getOption("knitr.in.progress"))` instead of `interactive()`.
- Convert errors loading dependencies into errors running targets (@stuvet).
- Allow `tar_poll()` to lose and then regain connection to the progress file.
- Make sure changes to the `tar_group` column of `iteration = "group"` data frames do not invalidate slices (#507, @lindsayplatt).
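The mode selection based on `knitr.in.progress` described above can be sketched in base R. This is a minimal illustration, assuming that an explicit setting takes precedence and that knitting implies non-interactive mode; `choose_interactive()` and its `setting` argument are hypothetical and not part of `targets`.

```r
# Hypothetical helper: decide whether Target Markdown should run interactively.
# `setting` stands in for an explicit tar_interactive value (NULL when unset).
choose_interactive <- function(setting = NULL) {
  if (!is.null(setting)) {
    return(isTRUE(setting))
  }
  # knitr sets this option to TRUE while a document is being knitted.
  !isTRUE(getOption("knitr.in.progress"))
}

# Simulate a knitting session, then restore the previous option value.
old <- options(knitr.in.progress = TRUE)
knitting <- choose_interactive()     # unset: follows the knitr option
forced <- choose_interactive(TRUE)   # explicit setting wins
options(old)
```

Unlike `interactive()`, the `knitr.in.progress` option distinguishes running a chunk by hand from knitting the whole report, even when both happen inside an interactive R session.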
- In Target Markdown, add a new `tar_interactive` global option to select interactive mode or non-interactive mode (#469).
- Highlight a graph neighborhood when the user clicks a node. Control the neighborhood degree with new arguments `degree_from` and `degree_to` of `tar_visnetwork()` and `tar_glimpse()` (#474, @rgayler).
- Make the target script path configurable in `tar_config_set()` (#476).
- Add a `tar_script` chunk option in Target Markdown to control where the `{targets}` language engine writes the target script and helper scripts (#478).
- Add new arguments `script` and `store` to choose custom paths to the target script file and data store for individual function calls (#477).
- Allow users to set an alternative path to the YAML configuration file for the current R session (#477). Most users have no reason to set this path; it is only for niche applications like Shiny apps with `targets` backends. Unavoidably, the path gets reset to `_targets.yaml` when the session restarts.
- Add new `_targets.yaml` config options `reporter_make`, `reporter_outdated`, and `workers` to control function argument defaults shared across multiple functions called outside `_targets.R` (#498, @ianeveperry).
- Add `tar_load_globals()` for debugging, testing, prototyping, and teaching (#496, @malcolmbarrett).
- Add structure to the `resources` argument of `tar_target()` to avoid conflicts among formats and HPC backends (#489). Includes user-side helper functions like `tar_resources()` and `tar_resources_aws()` to build the required data structures.
- Log skipped targets in `_targets/meta/progress` and display them in `tar_progress()`, `tar_poll()`, `tar_watch()`, `tar_progress_branches()`, `tar_progress_summary()`, and `tar_visnetwork()` (#514). Instead of writing each skip line separately to `_targets/meta/progress`, accumulate skip lines in a queue and then write them all out in bulk when something interesting happens. This avoids a lot of overhead in certain cases.
- Add a `shortcut` argument to `tar_make()`, `tar_make_clustermq()`, `tar_make_future()`, `tar_outdated()`, and `tar_sitrep()` to more efficiently skip parts of the pipeline (#522, #523, @jennysjaarda, @MilesMcBain, @kendonB).
- Support `names` and `shortcut` in graph data frames and graph visuals (#529).
- Move `allow` and `exclude` to the network behind the graph visuals rather than the visuals themselves (#529).
- Add a new "progress" display to the `tar_watch()` app to show verbose progress info and metadata.
- Add a new `workspace_on_error` argument of `tar_option_set()` to supersede `error = "workspace"`. Helps control workspace behavior independently of the `error` argument of `tar_target()` (#405, #533, #534, @mattwarkentin, @xinstein).
- Implement `error = "abridge"` in `tar_target()` and related functions. If a target errors out with this option, the target itself stops, any currently running targets keep running, and no new targets launch after that (#533, #534, @xinstein).
- Add a menu prompt to `tar_destroy()` which can be suppressed with `TAR_ASK = "false"` (#542, @gofford).
- Support functions `tar_older()` and `tar_newer()` to help users identify and invalidate targets at regular times or intervals.
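The bulk-write strategy for skipped targets mentioned in the list above (queue the skip lines, then write them all at once) can be illustrated with a small base R sketch. Everything here is hypothetical, including `new_skip_buffer()`, the record format, and the temporary file; it is not the internal implementation in `targets`.

```r
# Minimal sketch of buffering skip records and flushing them in bulk.
# All names are hypothetical and not part of the targets API.
new_skip_buffer <- function(path) {
  lines <- character(0)
  list(
    # Queue a skip record in memory instead of touching the disk.
    skip = function(name) {
      lines <<- c(lines, paste(name, "skipped", sep = "|"))
      invisible(NULL)
    },
    # Append all queued records in one write, then clear the queue.
    flush = function() {
      if (length(lines) > 0L) {
        con <- file(path, open = "a")
        writeLines(lines, con)
        close(con)
        lines <<- character(0)
      }
      invisible(NULL)
    },
    # Number of records still waiting in memory.
    pending = function() length(lines)
  )
}

progress_file <- tempfile()
buffer <- new_skip_buffer(progress_file)
for (name in c("data", "model", "report")) buffer$skip(name)
queued <- buffer$pending()
buffer$flush()  # a single disk write for all three records
written <- readLines(progress_file)
```

The trade-off is that queued records are invisible to readers of the file until the next flush, which is why a flush would be tied to an event worth reporting.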
- In Target Markdown, deprecate the `targets` chunk option in favor of `tar_globals` (#469).
- Deprecate `error = "workspace"` in `tar_target()` and related functions. Use `tar_option_set(workspace_on_error = TRUE)` instead (#405, #533, @mattwarkentin, @xinstein).
- Reset the backoff upper bound when concluding a target or shutting down a `clustermq` worker (@rich-payne).
- Set more aggressive default backoff bound of 0.1 seconds (previous: 5 seconds) and set a more aggressive minimum of 0.001 seconds (previous: 0.01 seconds) (@rich-payne).
- Speed up the summary and forecast reporters by only printing to the console every quarter second.
- Avoid superfluous calls to `store_sync_file_meta.default()` on small files.
- In `tar_watch()`, take several measures to avoid long computation times rendering the graph:
  - Expose arguments `display` and `displays` to `tar_watch()` so the user can select which display shows first.
  - Make `"summary"` the default display instead of `"graph"`.
  - Set `outdated` to `FALSE` by default.
- Simplify the Target Markdown example.
- Warn about unnamed chunks in Target Markdown.
- Redesign option system to be more object-oriented and rigorous. Also export most options to HPC workers (#475).
- Simplify config system to let API function arguments take control (#483).
- In `tar_read()` for targets with `format = "aws_file"`, download the file back to the path the user originally saved it when the target ran.
- Replace the `TAR_MAKE_REPORTER` environment variable with `targets::tar_config_get("reporter_make")`.
- Use `eval(parse(text = readLines("_targets.R")), envir = some_envir)` and related techniques instead of the less controllable `source()`. Expose an `envir` argument to many functions for further control over evaluation if `callr_function` is `NULL`.
- Drop `out.attrs` when hashing groups of data frames to extend #507 to `expand.grid()` (#508).
- Increase the number of characters in errors and warnings up to 2048.
- Refactor assertions to automatically generate better messages.
- Export assertions, conditions, and language utilities in packages that build on top of `targets`.
- Change `GITHUBPAT` to `GITHUB_TOKEN` in the `tar_github_actions()` YAML file (#554, @eveyp).
- Support the `eval` chunk option in Target Markdown (#552, @fkohrt).
- Record time stamps in the metadata `time` column for all builder targets, regardless of storage format.
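The `eval(parse(text = readLines(...)), envir = some_envir)` technique named in the list above can be tried in isolation. The sketch below uses a temporary stand-in script rather than a real `_targets.R`; the file contents and variable names are hypothetical.

```r
# Write a small stand-in script (hypothetical contents, not a real pipeline).
script <- tempfile(fileext = ".R")
writeLines(
  c("pipeline_input <- 1", "pipeline_value <- pipeline_input + 1"),
  script
)

# Evaluate the script in a dedicated environment rather than the global one.
some_envir <- new.env(parent = globalenv())
eval(parse(text = readLines(script)), envir = some_envir)

# The objects live in some_envir and do not leak into the global environment.
result <- get("pipeline_value", envir = some_envir)
leaked <- exists("pipeline_value", envir = globalenv(), inherits = FALSE)
```

Because the caller owns `some_envir`, it can inspect or discard everything the script defined without touching the user's workspace.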
- Export in-memory config settings from `_targets.yaml` to parallel workers.
- Add a limited-scope `exclude` argument to `tar_watch()` and `tar_watch_server()` (#458, @gorkang).
- Write a `.gitignore` file to ignore everything in `_targets/meta/` except `.gitignore` and `_targets/meta/meta`.
- Target Markdown: add `knitr` engines for pipeline construction and prototyping from within literate programming documents (#469, @cderv, @nviets, @emilyriederer, @ijlyttle, @GShotwell, @gadenbuie, @tomsing1). Huge thanks to @cderv on this one for answering my deluge of questions, helping me figure out what was and was not possible in `knitr`, and ultimately circling me back to a successful approach.
- Add an RStudio R Markdown template for Target Markdown (#469).
- Implement `use_targets()`, which writes the Target Markdown template to the project root (#469).
- Implement `tar_unscript()` to clean up scripts written by Target Markdown.
- Enable priorities in `tar_make()` and `tar_manifest()`.
- Show the priority in the print method of stem and pattern targets.
- Throw informative errors if the secondary arguments to `pattern = slice()` or `pattern = sample()` are invalid.
- In `tar_target_raw()`, assert that commands have length 1 when converted to expressions.
- Handle errors and post failure artifacts in the GitHub Actions YAML file.
- Rewrite the documentation on invalidation rules in `tar_cue()` (@maelle).
- Drop `dplyr` groups and `"grouped_df"` class in `tar_group()` (`tarchetypes` discussion #53, @kendonB).
- Assign branch names to dynamic branching return values produced by `tar_read()` and `tar_read_raw()`.
- Do not use time stamps to monitor the config file (e.g. `_targets.yaml`). Fixes CRAN check errors from version 0.4.1.
- Fix CRAN test error on Windows R-devel.
- Do not inherit `roxygen2` docstrings from `shiny`.
- Handle more missing `Suggests:` packages.
- Unset the config lock before reading `_targets.yaml` in the `callr` process.
- Avoid `file.rename()` errors when migrating staged temporary files (#410).
- Return correct error messages from feather and parquet formats (#388). Now calling `assert_df()` from `store_assert_format()` instead of `store_cast_object()`. And now those last two functions are not called at all if the target throws an error.
- Retry writing lines to database files so Windows machines can run `tar_poll()` at the same time as the pipeline (#393).
- Rename file written by `tar_renv()` to `_targets_packages.R` (#397).
- Ensure metadata is loaded to compute labels properly when `outdated = FALSE` in `tar_visnetwork()`.
- Implement `tar_timestamp()` and `tar_timestamp_raw()` to get the last modified timestamp of a target's data (#378).
- Implement `tar_progress_summary()` to compactly summarize all pipeline progress (#380).
- Add a `characters` argument of `tar_traceback()` to cap the traceback line lengths (#383).
- Add new "summary" and "about" views to `tar_watch()` (#382).
- Implement `tar_poll()` to repeatedly poll runtime progress in the R console (#381). `tar_poll()` is a lightweight alternative to `tar_watch()`.
- Change the color of the "dormant" status in the graph.
- Add a `tar_envvars()` function to list values of special environment variables supported in `targets`. The help file explains each environment variable in detail.
- Support extra project-level configuration settings with `_targets.yaml` (#297). New functions `tar_config_get()` and `tar_config_set()` interact with the `_targets.yaml` file. Currently only supports the `store` field to set the data store path to something other than `_targets/`.
- Shut down superfluous persistent workers earlier in dynamic branching and when all remaining targets have `deployment = "main"` (#398, #399, #404, @pat-s).
- Attempt to print only the useful part of the traceback in `tar_traceback()` (#383).
- Add a line break at the end of the "summary" reporter so warnings do not mangle the output.
- In `tar_watch()`, use `shinybusy` instead of `shinycssloaders` and keep current output on display while new output is rendering (#386, @rcorty).
- Right-align the headers and counts in the "summary" and "forecast" reporters.
- Add a timestamp to the "summary" reporter.
- Make the reporters show when a target ends (#391, @mattwarkentin).
- Make the reporters show when a pattern ends if the pattern built at least one target and none of the targets errored or canceled.
- Use words "start" and "built" in reporters.
- Use the region of the AWS S3 bucket instead of the local `AWS_DEFAULT_REGION` environment variable (`check_region = TRUE`; #400, @tomsing1).
- In `tar_meta()`, return `POSIXct` times in the time zone of the calling system (#131).
- Throw informative error messages when a target's name or command is missing (#413, @liutiming).
- Bring back ALTREP in `qs::qread()` now that `qs` 0.24.1 requires `stringfish` >= 1.5.0 (#147, @glep).
- Relax dynamic branching checks so `pattern = slice(...)` can take multiple indexes (#406, #419, @djbirke, @alexgphayes).
- `queue$enqueue()` is now `queue$prepend()` and always adds to the front of the queue (#371).
- Throw a warning if `devtools::load_all()` or similar is detected inside `_targets.R` (#374).
- Skip `feather` and `parquet` tests on CRAN.
- Fix the "write target at cursor" RStudio addin and move cursor between the parentheses.
- Add a `backoff` option in `tar_option_set()` to set the maximum upper bound (seconds) for the polling interval (#333).
- Add a new `tar_github_actions()` function to write a GitHub Actions workflow file for continuous deployment of data analysis pipelines (#339, @jaredlander).
- Add a new `TAR_MAKE_REPORTER` environment variable to globally set the reporter of the `tar_make*()` functions (#345, @alexpghayes).
- Support new storage formats "feather", "parquet", "aws_feather", and "aws_parquet" (#355, @riazarbi).
- Implement an exponential backoff algorithm for polling the priority queue in `tar_make_clustermq()` and `tar_make_future()` (#333).
- In `tar_make_future()`, try to submit a target every time a worker is polled.
- In `tar_make_future()`, poll workers in order of target priority.
- Avoid the time delay in exiting on error (from r-lib/callr#185).
- Clone target objects for the pipeline and scrape more `targets` internal objects out of the environment in order to avoid accidental massive data transfers to workers.
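For readers unfamiliar with the exponential backoff polling mentioned in the list above, here is a generic base R sketch. It is not the algorithm as implemented in `targets`: the doubling rate, the bounds (chosen only for illustration), and the name `new_backoff()` are all assumptions.

```r
# Generic exponential backoff sketch; names, bounds, and rate are assumptions.
new_backoff <- function(lower = 0.001, upper = 0.1, rate = 2) {
  index <- 0L
  list(
    # Current wait in seconds, capped at `upper`.
    interval = function() min(upper, lower * rate^index),
    # Call after an idle poll so the next wait is longer.
    increment = function() {
      index <<- index + 1L
      invisible(NULL)
    },
    # Call after useful work so polling becomes responsive again.
    reset = function() {
      index <<- 0L
      invisible(NULL)
    }
  )
}

backoff <- new_backoff()
waits <- numeric(0)
for (i in seq_len(10)) {
  waits <- c(waits, backoff$interval())  # a real loop would Sys.sleep() here
  backoff$increment()
}
backoff$reset()
after_reset <- backoff$interval()
```

The waits grow geometrically until they reach the upper bound, so an idle scheduler polls less and less often, while a reset after real work restores the shortest interval.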
- Use `rlang::check_installed()` inside `assert_package()` (#331, @malcolmbarrett).
- Allow `tar_destroy(destroy = "process")`.
- In `tar_watch()`, increase default `seconds` to 15 (previously 5).
- In `tar_watch()`, debounce instead of throttle inputs.
- In `tar_watch()`, add an action button to refresh the outputs.
- Always deduplicate metadata after `tar_make()`. Will help compute a cache key on GitHub Actions and similar services.
- Deprecate `tar_deduplicate()` due to the item above.
- Reorder information in timestamped messages.
- Document RNG seed generation in `tar_target_raw()`, `tar_meta()`, and `tar_seed()` (#357, @alexpghayes).
- Switch meaning of `%||%` and `%|||%` to conform to historical precedent.
- Only show a command line spinner if `reporter = "silent"` (#364, @matthiasgomolka).
- Target and pipeline objects no longer have an `envir` element.
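The `%||%` entry above does not spell out the definitions. Assuming that "historical precedent" refers to the common null-default convention (as in `rlang`), a minimal sketch of that convention looks like this. The internal `%|||%` variant is left out because its exact definition is not given here.

```r
# Null-default operator as conventionally defined (for example in rlang).
# Defined locally so the sketch also runs on R versions without a base `%||%`.
`%||%` <- function(x, y) {
  if (is.null(x)) y else x
}

from_null <- NULL %||% "fallback"           # left side is NULL: use the right
from_value <- "value" %||% "fallback"       # left side is kept
from_empty <- character(0) %||% "fallback"  # length zero but not NULL: kept
```

Note that under this convention a zero-length vector is not replaced, since only `NULL` triggers the fallback.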
- In `tar_load()`, subset metadata to avoid accidental attempts to load global objects in `tidyselect` calls.
- Do not register a pattern as running unless an actual branch is about to start (#304).
- Use a name spec in `vctrs::vec_c()` (#320, @joelnitta).
- Add a new `names` argument to `tar_objects()` and `tar_workspaces()` with `tidyselect` functionality.
- Record info on the main process (PID, R version, `targets` version) in `_targets/meta/process` and write new functions `tar_process()` and `tar_pid()` to retrieve the data (#291, #292).
- Add a new `targets_only` argument to `tar_meta()`.
- Add new functions `tar_helper()` and `tar_helper_raw()` to write general-purpose R scripts, using tidy evaluation as a template mechanism (#290, #291, #292, #306).
- Export functions to check the existence of various pieces of local storage: `tar_exist_meta()`, `tar_exist_objects()`, `tar_exist_progress()`, `tar_exist_process()`, and `tar_exist_script()` (#310).
- Add a new `supervise` argument to `tar_watch()`.
- Add a new `complete_only` argument to `tar_meta()` to optionally return only complete rows (no `NA` values).
- Catch `callr` errors and refer users to the debugging chapter of the manual.
- Improve error messages of invalid arguments (#298, @brunocarlin). Removes partial argument matching in most cases.
- By default, locally enable `crayon` if and only if the calling process is interactive (#302, @ginolhac). Can still be disabled with `options(crayon.enabled = FALSE)` in `_targets.R`.
- Improve error handling and message for `format = "url"` when the HTTP response status code is not 200 (#303, @petrbouchal).
- Add more `extras` packages to `tar_renv()` (to support `tar_watch()`).
- Show informative message instead of error in `tar_watch()` if `_targets.R` does not exist.
- Clear up the documentation of the `names` argument of `tar_load()` (#314, @jameelalsalam).
- Do not override `nobody` in custom `curl` handles (#315, @riazarbi).
- Rename "running" to "started" in the progress metadata. This avoids the implicit claim that `targets` is somehow actively monitoring each job, e.g. through a connection or heartbeat (#318).
- Set `errormode = "warn"` in `getVDigest()` for files to work around eddelbuettel/digest#49 for network drives on Windows. `targets` already runs those file checks anyway (#316, @boshek).
- If a package fails to load, print the library paths `targets` tried to load from.
- `tar_test()` now skips all tests on Solaris in order to fix the problems shown on the CRAN check page.
- Enable `allow` and `exclude` to work on imports in `tar_visnetwork()` and `tar_glimpse()`.
- Put `visNetwork` legends on right to avoid crowding the graph.
- Call `force()` on subpipeline objects to eliminate high-memory promises in target objects. Allows targets to be deployed to workers much faster when `retrieval` is `"main"` (#279).
- Add a new box to the `tar_watch()` app to tabulate progress on dynamic branches (#273, @mattwarkentin).
- Store `type`, `parent`, and `branches` in progress data for `tar_watch()` (#273, @mattwarkentin).
- Add a `fields` argument in `tar_progress()` and default to `"progress"` for back compatibility (#273, @mattwarkentin).
- Add a new `tar_progress_branches()` function to tabulate branch progress (#273, @mattwarkentin).
- Add new "refresh" switch to `tar_watch()` to toggle automatic refreshing and force a refresh.
- Exclude `.Random.seed` by default in `tar_visnetwork()`.
- Spelling: "cancelled" changed to "canceled".
- Enhance controls and use of space in the `tar_watch()` app.
- Centralize internal path management utilities.
- Skip `clustermq` tests on Solaris.
- Avoid starting the description with the package name.
- Remove `if(FALSE)` blocks from help files to fix "unexecutable code" warnings (`tar_glimpse()`, `tar_visnetwork()`, and `tar_watch()`).
- Remove commented code in the examples (`tar_edit()`, `tar_watch_ui()`, and `tar_watch_server()`).
- Ensure that all examples, tests, and vignettes do not write to the user's home file space. (Fixed an example of `tar_workspace()`.)
- Use JOSS paper in `CITATION`.
- Accept lists of target objects at the end of `_targets.R` (#253).
- Deprecate `tar_pipeline()` and `tar_bind()` because of the above (#253).
- Always show a special message when the pipeline finishes (#258, @petrbouchal).
- Disable `visNetwork` stabilization (#264, @mattwarkentin).
- Use default `visNetwork` font size.
- Relay errors as condition messages if `error` is `"continue"` (#267, @liutiming).
- Ensure pattern-only pipelines can be defined so they can be combined again later with `tar_bind()` (#245, @yonicd).
- Implement safeguards around `igraph` topological sort.
- Topologically sort the rows of `tar_manifest()` (#263, @sctyner).
- Make patterns composable (#212, @glep, @djbirke).
- Allow workspaces to load nonexportable objects (#214).
- Make workspace files super light by saving only a reference to the required dependencies (#214).
- Add a new `workspaces` argument to `tar_option_set()` to specify which targets will save their workspace files during `tar_make()` (#214).
- Change `error = "save"` to `error = "workspace"` so it is clearer that saving workspaces no longer duplicates data (#214).
- Rename `what` to `destroy` in `tar_destroy()`.
- Remove `tar_undebug()` because it is redundant with `tar_destroy(destroy = "workspaces")`.
- Make patterns composable (#212).
- Add new dynamic branching patterns `head()`, `tail()`, and `sample()` to provide functionality equivalent to `drake`'s `max_expand` (#56).
- Add a new `tar_pattern()` function to emulate dynamic branching outside a pipeline.
- Add a new `level_separation` argument to `tar_visnetwork()` and `tar_glimpse()` to control the aspect ratio (#226).
- Track functions from multiple packages with the `imports` argument to `tar_option_set()` (#239).
- Add color for "built" progress if `outdated` is `FALSE` in `tar_visnetwork()`.
- Tweak colors in `tar_visnetwork()` to try to account for color blindness.
- Return full patterns from `tar_manifest()`.
- Record package load errors in progress and metadata (#228, @psychelzh).
- `tar_renv()` now invokes `_targets.R` through a background process just like `tar_outdated()` etc. so it can account for more hidden packages (#224, @mattwarkentin).
- Set `deployment` equal to `"main"` for all targets in `tar_make()`. This ensures `tar_make()` does not waste time waiting for nonexistent files to ship over a nonexistent network file system (NFS). `tar_make_clustermq()` or `tar_make_future()` could use NFS, so they still leave `deployment` alone.
- Add a new `size` field to the metadata to allow `targets` to make better judgments about when to rehash files (#180). We now compare hashes to check file size differences instead of doing messy floating point comparisons with ad hoc tolerances. It breaks back compatibility with old projects, but the error message is informative, and this is all still before the first official release.
- Change "local" to "main" and "remote" to "worker" in the `storage`, `retrieval`, and `deployment` settings (#183, @mattwarkentin).
- Ensure function dependencies are sorted before computing the function hash (GitHub commit f15face7d72c15c2d1098da959492bdbfcddb425).
- Move `garbage_collection` to a target-level setting, i.e. argument to `tar_target()` and `tar_option_set()` (#194). Previously was an argument to the `tar_make*()` functions.
- Allow `tar_name()` and `tar_path()` to run outside the pipeline with debugging-friendly default return values.
- Stop sending target return values over the network when `storage` is `"remote"` (#182, @mattwarkentin).
- Shorten lengths of warnings and error messages to 128 characters (#186, @gorkang).
- Restrict in-memory metadata to avoid incorrectly recycling deleted targets (#191).
- Marshal nonexportable dependencies before sending them to workers. Transport data through `target$subpipeline` rather than `target$cache` to make that happen (#209, @mattwarkentin).
- Add a new function `tar_bind()` to combine pipeline objects.
- Add `tar_seed()` to get the random number generator seed of the target currently running.
- Allow target-specific `future::plan()`s through the `resources` argument of `tar_target()` (#198, @mattwarkentin).
- Use `library()` instead of `require()` in `command_load_packages()`.
- Evaluate commands directly in `targets$cache$targets$envir` to improve convenience in interactive debugging (`ls()` just works now.) This is reasonably safe now that the cache is populated at the last minute and cleared as soon as possible (#209, #210).
- First version.