Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to the cache API required by the new reticulate cache implementation #2170

Open
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

leogama
Copy link

@leogama leogama commented Sep 12, 2022

As we discussed in rstudio/reticulate#1210 (comment), these changes use the same interface for the cache of external engines as the R cache interface. Relevant changes for non-R chunks:

  • cache_engines stores cache importer functions that, upon call, return "cache objects", i.e. lists with functions as items (exists, load, save, purge, etc.)
  • external cache functions are called in the knit() call's working directory
  • cache checks (exists()) and purges are done for R cache and external cache together for consistency

Plus a bug fix: always check for .RData files (which stores the output in lazy cache mode if I understood it right).


@yihui You may review this PR right away, but let's hold the merge while I'm still implementing and testing the reticulate PR.

@CLAassistant
Copy link

CLAassistant commented Sep 12, 2022

CLA assistant check
All committers have signed the CLA.

@leogama
Copy link
Author

leogama commented Sep 13, 2022

The last commit should work with rstudio/reticulate@5d6f7a7 and uqfoundation/dill@71b4f77

@leogama
Copy link
Author

leogama commented Sep 14, 2022

Updated: The last commit should work with rstudio/reticulate@a33ed39 and uqfoundation/dill@4244d3b

Copy link
Owner

@yihui yihui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good over all. I'll make a few cosmetic changes and merge later. Thank you!

R/cache.R Outdated
cache_purge = function(hash) {
for (h in hash) unlink(paste(cache_path(h), c('rdb', 'rdx', 'RData'), sep = '.'))
cache_purge = function(glob_path) {
unlink(paste(glob_path, c('rdb', 'rdx', 'RData'), sep = '.'))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that glob_path could be a vector of length > 1, so we still need to purge them one by one via a loop.

@yihui
Copy link
Owner

yihui commented Sep 27, 2022

Oh actually since the API has changed, we need to update the documentation accordingly:

knitr/R/engine.R

Lines 35 to 51 in 6bfffe9

#' Cache engines of other languages
#'
#' This object controls how to load cached environments from languages other
#' than R (when the chunk option \code{engine} is not \code{'R'}). Each
#' component in this object is a function that takes the current path to the
#' chunk cache and loads it into the language environment.
#'
#' The cache engine function has one argument \code{options}, a list containing
#' all chunk options. Note that \code{options$hash} is the path to the current
#' chunk cache with the chunk's hash, but without any file extension, and the
#' language engine may write a cache database to this path (with an extension).
#'
#' The cache engine function should load the cache environment and should know
#' the extension appropriate for the language.
#' @references See \url{https://github.com/rstudio/reticulate/pull/167} for an
#' implementation of a cache engine for Python.
#' @export

Could you please also do that? Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants