Skip to content

Commit

Permalink
feat!: replace $describe_plan() and $describe_optimized_plan() by…
Browse files Browse the repository at this point in the history
… `$explain()` (#1182)
  • Loading branch information
etiennebacher authored Aug 14, 2024
1 parent ce27974 commit 1b2daf6
Show file tree
Hide file tree
Showing 13 changed files with 343 additions and 133 deletions.
11 changes: 5 additions & 6 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,11 @@

## Polars R Package (development version)

### Breaking changes

- `$describe_plan()` and `$describe_optimized_plan()` are removed. Use
respectively `$explain(optimized = FALSE)` and `$explain()` instead (#1182).

### New features

- New method `$str$extract_many()` (#1163).
Expand All @@ -14,12 +19,6 @@
and replaced by `...`. This doesn't change the previous behavior, e.g.
`df$unnest(names = c("a", "b"))` still works (#1170).

### Bug fixes

- `$describe_plan()` and `$describe_optimized_plan()` are now consistent in their
output. Previously, the former would return a Result-type output and the other
would return nothing (as expected). They now both return nothing (#1175).

## Polars R Package 0.18.0

### Breaking changes
Expand Down
4 changes: 4 additions & 0 deletions R/extendr-wrappers.R
Original file line number Diff line number Diff line change
Expand Up @@ -1174,8 +1174,12 @@ RPolarsLazyFrame$print <- function() .Call(wrap__RPolarsLazyFrame__print, self)

RPolarsLazyFrame$describe_plan <- function() .Call(wrap__RPolarsLazyFrame__describe_plan, self)

RPolarsLazyFrame$describe_plan_tree <- function() .Call(wrap__RPolarsLazyFrame__describe_plan_tree, self)

RPolarsLazyFrame$describe_optimized_plan <- function() .Call(wrap__RPolarsLazyFrame__describe_optimized_plan, self)

RPolarsLazyFrame$describe_optimized_plan_tree <- function() .Call(wrap__RPolarsLazyFrame__describe_optimized_plan_tree, self)

RPolarsLazyFrame$debug_plan <- function() .Call(wrap__RPolarsLazyFrame__debug_plan, self)

RPolarsLazyFrame$collect <- function() .Call(wrap__RPolarsLazyFrame__collect, self)
Expand Down
106 changes: 81 additions & 25 deletions R/lazyframe__lazy.R
Original file line number Diff line number Diff line change
Expand Up @@ -86,14 +86,14 @@
#' Ldf_best = Ldf_best$filter(filter_expr)
#'
#' # the non optimized plans are similar, on entire in-mem csv, apply filter
#' Ldf_okay$describe_plan()
#' Ldf_best$describe_plan()
#' Ldf_okay$explain(optimized = FALSE)
#' Ldf_best$explain(optimized = FALSE)
#'
#' # NOTE For Ldf_okay, the full time to load csv alrady paid when creating Rdf and Pdf
#'
#' # The optimized plan are quite different, Ldf_best will read csv and perform filter simultaneously
#' Ldf_okay$describe_optimized_plan()
#' Ldf_best$describe_optimized_plan()
#' Ldf_okay$explain()
#' Ldf_best$explain()
#'
#'
#' # To acquire result in-mem use $colelct()
Expand Down Expand Up @@ -209,7 +209,7 @@ pl_LazyFrame = function(...) {
#' @examples pl$LazyFrame(iris)
print.RPolarsLazyFrame = function(x, ...) {
cat("polars LazyFrame\n")
cat(" $describe_optimized_plan() : Show the optimized query plan.\n")
cat(" $explain(): Show the optimized query plan.\n")
cat("\n")
cat("Naive plan:\n")
cloned_x = x$print()
Expand All @@ -232,39 +232,95 @@ LazyFrame_print = function() {
invisible(self)
}

#' Print the optimized or non-optimized plans of `LazyFrame`
#' Create a string representation of the query plan
#'
#' @rdname LazyFrame_describe_plan
#' The query plan is read from bottom to top. When `optimized = FALSE`, the
#' query as it was written by the user is shown. This is not what Polars runs.
#' Instead, it applies optimizations that are displayed by default by `$explain()`.
#' One classic example is the predicate pushdown, which applies the filter as
#' early as possible (i.e. at the bottom of the plan).
#'
#' @description `$describe_plan()` shows the query in the format that `polars`
#' understands. `$describe_optimized_plan()` shows the optimized query plan that
#' `polars` will execute when `$collect()` is called. It is possible that both
#' plans are identical if `polars` doesn't find any way to optimize the query.
#' @inheritParams LazyFrame_collect
#' @param format The format to use for displaying the logical plan. Must be either
#' `"plain"` (default) or `"tree"`.
#' @param optimized Return an optimized query plan. If `TRUE` (default), the
#' subsequent optimization flags control which optimizations run.
#' @inheritParams LazyFrame_set_optimization_toggle
#'
#' @return This only prints the plan in the console, it doesn't return any value.
#' @return A character value containing the query plan.
#' @examples
#' lazy_frame = pl$LazyFrame(iris)
#'
#' # Prepare your query
#' lazy_query = lazy_frame$sort("Species")$filter(pl$col("Species") != "setosa")
#'
#' # This is the query as `polars` understands it
#' lazy_query$describe_plan()
#' # This is the query that was written by the user, without any optimizations
#' # (use cat() for better printing)
#' lazy_query$explain(optimized = FALSE) |> cat()
#'
#' # This is the query after `polars` optimizes it: instead of sorting first and
#' # then filtering, it is faster to filter first and then sort the rest.
#' lazy_query$describe_optimized_plan()
LazyFrame_describe_optimized_plan = function() {
.pr$LazyFrame$describe_optimized_plan(self) |>
unwrap("in $describe_optimized_plan():")
invisible(NULL)
}
#' lazy_query$explain() |> cat()
#'
#' # Also possible to see this as tree format
#' lazy_query$explain(format = "tree") |> cat()
LazyFrame_explain = function(
...,
format = "plain",
optimized = TRUE,
type_coercion = TRUE,
predicate_pushdown = TRUE,
projection_pushdown = TRUE,
simplify_expression = TRUE,
slice_pushdown = TRUE,
comm_subplan_elim = TRUE,
comm_subexpr_elim = TRUE,
cluster_with_columns = TRUE,
streaming = FALSE) {
uw = \(res) unwrap(res, "in $explain():")

if (!is.character(format) || !format %in% c("plain", "tree")) {
Err_plain(r"(`format` must be one of `"plain"` or `"tree"`.)") |>
uw()
}

ldf = self

if (isTRUE(optimized)) {
ldf = ldf |>
.pr$LazyFrame$set_optimization_toggle(
type_coercion = type_coercion,
predicate_pushdown = predicate_pushdown,
projection_pushdown = projection_pushdown,
simplify_expression = simplify_expression,
slice_pushdown = slice_pushdown,
comm_subplan_elim = comm_subplan_elim,
comm_subexpr_elim = comm_subexpr_elim,
cluster_with_columns = cluster_with_columns,
streaming = streaming,
eager = FALSE
) |>
uw()

if (format == "tree") {
out = ldf |>
.pr$LazyFrame$describe_optimized_plan_tree()
} else {
out = ldf |>
.pr$LazyFrame$describe_optimized_plan()
}
} else {
if (format == "tree") {
out = ldf |>
.pr$LazyFrame$describe_plan_tree()
} else {
out = ldf |>
.pr$LazyFrame$describe_plan()
}
}

#' @rdname LazyFrame_describe_plan
LazyFrame_describe_plan = function() {
.pr$LazyFrame$describe_plan(self) |>
unwrap("in $describe_plan():")
invisible(NULL)
out |>
uw()
}


Expand Down
8 changes: 4 additions & 4 deletions man/LazyFrame_class.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

33 changes: 0 additions & 33 deletions man/LazyFrame_describe_plan.Rd

This file was deleted.

84 changes: 84 additions & 0 deletions man/LazyFrame_explain.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

11 changes: 11 additions & 0 deletions src/rust/src/lazy/dataframe.rs
Original file line number Diff line number Diff line change
Expand Up @@ -49,13 +49,24 @@ impl RPolarsLazyFrame {
Ok(self.0.describe_plan().map_err(polars_to_rpolars_err)?)
}

fn describe_plan_tree(&self) -> RResult<String> {
Ok(self.0.describe_plan_tree().map_err(polars_to_rpolars_err)?)
}

pub fn describe_optimized_plan(&self) -> RResult<String> {
Ok(self
.0
.describe_optimized_plan()
.map_err(polars_to_rpolars_err)?)
}

fn describe_optimized_plan_tree(&self) -> RResult<String> {
Ok(self
.0
.describe_optimized_plan_tree()
.map_err(polars_to_rpolars_err)?)
}

//low level version of describe_plan, mainly for arg testing
pub fn debug_plan(&self) -> Result<String, String> {
use polars_core::export::serde::Serialize;
Expand Down
Loading

0 comments on commit 1b2daf6

Please sign in to comment.