Skip to content

Commit

Permalink
Suppress autoprint during := assignment on subclasses of data.table (#…
Browse files Browse the repository at this point in the history
…6631)

* Respect shouldPrint when auto-printing from knitr

Implementing a method for the knitr::knit_print generic makes it
possible to customise the behaviour without looking up the call stack.

The current solution only works on R >= 3.6.0 because that's where
delayed S3 registration has been introduced.

* Delay S3method(knit_print, data.table) for R < 3.6

Use setHook() to ensure that registerS3method() will be called in the
same session if 'knitr' is loaded later. Not needed on R >= 3.6.0 where
S3method(knitr::knit_print) will do the right thing by itself.

* ws-only style

* put setHook() in a branch

* Position comment on the same line

* Restore the still-required #2369 condition

* Use identical(,print) to check for autoprint

* Add tests, including updating broken test

* add comment

* test withAutoprint() behavior too

* NEWS entry

* blank line in correct place?

---------

Co-authored-by: Ivan K <[email protected]>
  • Loading branch information
MichaelChirico and aitap authored Dec 9, 2024
1 parent 96e89fa commit 17a7c3e
Show file tree
Hide file tree
Showing 4 changed files with 80 additions and 16 deletions.
4 changes: 3 additions & 1 deletion NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -115,7 +115,9 @@ rowwiseDT(
14. Added a `data.frame` method for `format_list_item()` to fix error printing data.tables with columns containing 1-column data.frames, [#6592](https://github.com/Rdatatable/data.table/issues/6592). Thanks to @r2evans for the bug report and fix.
15. The auto-printing suppression in `knitr` documents is now done by implementing a method for `knit_print` instead of looking up the call stack, [#6589](https://github.com/Rdatatable/data.table/pull/6589). Thanks to @jangorecki for the report [#6509](https://github.com/Rdatatable/data.table/issues/6509) and @aitap for the fix.
15. Auto-printing gets some substantial improvements
- Suppression in `knitr` documents is now done by implementing a method for `knit_print` instead of looking up the call stack, [#6589](https://github.com/Rdatatable/data.table/pull/6589). The old way was fragile and wound up broken by some implementation changes in {knitr}. Thanks to @jangorecki for the report [#6509](https://github.com/Rdatatable/data.table/issues/6509) and @aitap for the fix.
- `print()` methods for S3 subclasses of data.table (e.g. an object of class `c("my.table", "data.table", "data.frame")`) no longer print where plain data.tables wouldn't, e.g. `myDT[, y := 2]`, [#3029](https://github.com/Rdatatable/data.table/issues/3029). The improved detection of auto-printing scenarios has the added benefit of _allowing_ print in highly explicit statements like `print(DT[, y := 2])`, obviating our recommendation since v1.9.6 to append `[]` to signal "please print me".

16. Joins of `integer64` and `double` columns succeed when the `double` column has lossless `integer64` representation, [#4167](https://github.com/Rdatatable/data.table/issues/4167) and [#6625](https://github.com/Rdatatable/data.table/issues/6625). Previously, this only worked when the double column had lossless _32-bit_ integer representation. Thanks @MichaelChirico for the reports and fix.

Expand Down
2 changes: 1 addition & 1 deletion R/print.data.table.R
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,7 @@ print.data.table = function(x, topn=getOption("datatable.print.topn"),
# Other options investigated (could revisit): Cstack_info(), .Last.value gets set first before autoprint, history(), sys.status(),
# topenv(), inspecting next statement in caller, using clock() at C level to timeout suppression after some number of cycles
SYS = sys.calls()
if (length(SYS) <= 2L || # "> DT" auto-print or "> print(DT)" explicit print (cannot distinguish from R 3.2.0 but that's ok)
if (identical(SYS[[1L]][[1L]], print) || # this is what auto-print looks like, i.e. '> DT' and '> DT[, a:=b]' in the terminal; see #3029.
( length(SYS) >= 3L && is.symbol(thisSYS <- SYS[[length(SYS)-2L]][[1L]]) &&
as.character(thisSYS) == 'source') ) { # suppress printing from source(echo = TRUE) calls, #2369
return(invisible(x))
Expand Down
28 changes: 23 additions & 5 deletions tests/autoprint.R
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ DT[FALSE,a:=3L] # no
DT[a==4L,a:=5L] # no
DT[a %in% 4:8, a:=5L] # no
DT # yes
print(DT[2,a:=4L]) # no
print(DT[2,a:=4L]) # yes, as of #6631
print(DT) # yes
if (TRUE) DT[2,a:=5L] # no. used to print before v1.9.5
if (TRUE) if (TRUE) DT[2,a:=6L] # no. used to print before v1.9.5
Expand All @@ -20,7 +20,7 @@ DT # yes. 2nd time needed, or solutions below
(function(){DT[2,a:=5L];NULL})() # print NULL
DT[] # yes. guaranteed print
(function(){DT[2,a:=5L];NULL})() # print NULL
print(DT) # no. only DT[] is guaranteed print from v1.9.6 and R 3.2.0
print(DT) # yes. restored in #6631 behavior that had changed in 1.9.6.
(function(){DT[2,a:=5L][];NULL})() # print NULL
DT # yes. i) function needs to add [] after last one, so that "DT" alone is guaranteed anyway
(function(){DT[2,a:=5L];DT[];NULL})() # print NULL
Expand All @@ -29,9 +29,9 @@ DT2 = data.table(b=3:4) # no
(function(){DT[2,a:=6L];DT2[1,b:=7L];NULL})()
DT # yes. last := was on DT2 not DT
{DT[2,a:=6L];invisible()} # no
print(DT) # no
print(DT) # yes
(function(){print(DT[2,a:=7L]);print(DT);invisible()})() # yes*2
{print(DT[2,a:=8L]);print(DT);invisible()} # yes*1 Not within function so as at prompt
{print(DT[2,a:=8L]);print(DT);invisible()} # yes*2 as at prompt, again as of #6631
DT[1][,a:=9L] # no (was too tricky to detect that DT[1] is a new object). Simple rule is that := always doesn't print
DT[2,a:=10L][1] # yes
DT[1,a:=10L][1,a:=10L] # no
Expand All @@ -51,7 +51,25 @@ local({
writeLines(c(
"library(data.table)",
"DT = data.table(a = 1)",
"DT[,a:=1] # not auto-printed"
"DT[,a:=1]" # no
), f)
source(f, local = TRUE, echo = TRUE)
})

# child class of data.table doesn't induce unintended print, #3029
dt <- data.table(x = 1)
class(dt) <- c("foo", "data.table", "data.frame")
print.foo <- function(x, ...) {
NextMethod("print")
}
dt[, y := 1] # no

# withAutoprint() testing (since R3.4.0)
if (!exists("withAutoprint", baseenv())) {
q("no")
}
if (TRUE) withAutoprint({
DT # yes
DT[1L, 1L] # yes
DT[2L, a := 10L] # no
})
62 changes: 53 additions & 9 deletions tests/autoprint.Rout.save
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@

R version 4.3.2 (2023-10-31) -- "Eye Holes"
Copyright (C) 2023 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R Under development (unstable) (2024-12-01 r87412) -- "Unsuffered Consequences"
Copyright (C) 2024 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Expand Down Expand Up @@ -44,7 +44,11 @@ Index: <a>
<int>
1: 1
2: 3
> print(DT[2,a:=4L]) # no
> print(DT[2,a:=4L]) # yes, as of #6631
a
<int>
1: 1
2: 4
> print(DT) # yes
a
<int>
Expand All @@ -69,7 +73,11 @@ NULL
2: 5
> (function(){DT[2,a:=5L];NULL})() # print NULL
NULL
> print(DT) # no. only DT[] is guaranteed print from v1.9.6 and R 3.2.0
> print(DT) # yes. restored in #6631 behavior that had changed in 1.9.6.
a
<int>
1: 1
2: 5
> (function(){DT[2,a:=5L][];NULL})() # print NULL
NULL
> DT # yes. i) function needs to add [] after last one, so that "DT" alone is guaranteed anyway
Expand All @@ -93,7 +101,11 @@ NULL
1: 1
2: 6
> {DT[2,a:=6L];invisible()} # no
> print(DT) # no
> print(DT) # yes
a
<int>
1: 1
2: 6
> (function(){print(DT[2,a:=7L]);print(DT);invisible()})() # yes*2
a
<int>
Expand All @@ -103,7 +115,11 @@ NULL
<int>
1: 1
2: 7
> {print(DT[2,a:=8L]);print(DT);invisible()} # yes*1 Not within function so as at prompt
> {print(DT[2,a:=8L]);print(DT);invisible()} # yes*2 as at prompt, again as of #6631
a
<int>
1: 1
2: 8
a
<int>
1: 1
Expand Down Expand Up @@ -143,7 +159,7 @@ NULL
+ writeLines(c(
+ "library(data.table)",
+ "DT = data.table(a = 1)",
+ "DT[,a:=1] # not auto-printed"
+ "DT[,a:=1]" # no
+ ), f)
+ source(f, local = TRUE, echo = TRUE)
+ })
Expand All @@ -154,6 +170,34 @@ NULL

> DT[, `:=`(a, 1)]
>
> # child class of data.table doesn't induce unintended print, #3029
> dt <- data.table(x = 1)
> class(dt) <- c("foo", "data.table", "data.frame")
> print.foo <- function(x, ...) {
+ NextMethod("print")
+ }
> dt[, y := 1] # no
>
> # withAutoprint() testing (since R3.4.0)
> if (!exists("withAutoprint", baseenv())) {
+ q("no")
+ }
> if (TRUE) withAutoprint({
+ DT # yes
+ DT[1L, 1L] # yes
+ DT[2L, a := 10L] # no
+ })
> DT
a
<int>
1: 10
2: 10
> DT[1L, 1L]
a
<int>
1: 10
> DT[2L, `:=`(a, 10L)]
>
> proc.time()
user system elapsed
0.223 0.016 0.231
0.182 0.056 0.246

0 comments on commit 17a7c3e

Please sign in to comment.