Add translation to comparison expressions #270

toppyy · 2024-10-06T07:20:26Z

Related to #172

Sketch of introducing comparison expressions in duckplyr. See the related PR in duckdb-r: duckdb/duckdb-r#457

krlmlr · 2024-10-19T18:16:18Z

duckdb 1.1.1 is on CRAN now, and has the necessary additions. Would you like to take another stab?

Can you please style the code with styler? (Best done after "chore: Apply styler" #281, which needs "Adapt to single indent semantics in style guide" r-lib/styler#1235 to avoid noise.)
map_xxx() or map() are better than sapply().
When comparing types, should we use class()?

toppyy · 2024-10-20T06:33:05Z

Thanks! I'll take another stab with these comments in mind.

krlmlr · 2024-11-03T20:47:09Z

Thanks. At some point, I can take over here, or we can do another round. Does the current implementation achieve the desired pushdown to Parquet?

toppyy · 2024-11-04T06:05:43Z

I can continue working on it if you can provide pointers. I don't mind you making commits either. Thanks.

The implementation pushes filters down to Parquet. Below is a reprex of the code given in the issue.

A comparison expression is currently created only for objects that share the same class, or when comparing integer with numeric. I believe class is a better point of comparison than type, but I had some issues with objects that have multiple classes (e.g. class(ISOdate(2010, 1, 1, 0))). I decided to take the first class listed for comparison.

library(duckdb)
#> Loading required package: DBI
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(duckplyr)
#> The duckplyr package is configured to fall back to dplyr when it encounters an
#> incompatibility. Fallback events can be collected and uploaded for analysis to
#> guide future development. By default, no data will be collected or uploaded.
#> → Run `duckplyr::fallback_sitrep()` to review the current settings.
#> ✔ Overwriting dplyr methods with duckplyr methods.
#> ℹ Turn off with `duckplyr::methods_restore()`.
#> 
#> Attaching package: 'duckplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union


con_dp <- duckplyr:::get_default_duckdb_connection()
dbSendQuery(con_dp, "INSTALL httpfs; LOAD httpfs;")
#> <duckdb_result ed430 connection=a8ab0 statement='INSTALL httpfs; LOAD httpfs;'>
dbSendQuery(con_dp, "SET s3_region='auto';SET s3_endpoint='';")
#> <duckdb_result 25a40 connection=a8ab0 statement='SET s3_region='auto';SET s3_endpoint='';'>


f_duckplyr <- function() {
  duckplyr::duckplyr_df_from_file(
    "s3://duckplyr-demo-taxi-data/taxi-data-2019-partitioned/*/*.parquet",
    "read_parquet",
    options = list(hive_partitioning = TRUE),
    class = class(tibble())
  ) |>
    filter(total_amount > 0) |>
    filter(!is.na(passenger_count)) |>
    mutate(tip_pct = 100 * tip_amount / total_amount) |>
    summarise(
      avg_tip_pct = median(tip_pct),
      n = n(),
      .by = passenger_count
    ) |>
    arrange(desc(passenger_count))
}

explain(f_duckplyr())
#> ┌───────────────────────────┐
#> │          ORDER_BY         │
#> │    ────────────────────   │
#> │        read_parquet       │
#> │   .passenger_count DESC   │
#> └─────────────┬─────────────┘
#> ┌─────────────┴─────────────┐
#> │         PROJECTION        │
#> │    ────────────────────   │
#> │      passenger_count      │
#> │        avg_tip_pct        │
#> │             n             │
#> │                           │
#> │       ~1762348 Rows       │
#> └─────────────┬─────────────┘
#> ┌─────────────┴─────────────┐
#> │       HASH_GROUP_BY       │
#> │    ────────────────────   │
#> │         Groups: #0        │
#> │                           │
#> │        Aggregates:        │
#> │         median(#1)        │
#> │        count_star()       │
#> │                           │
#> │       ~1762348 Rows       │
#> └─────────────┬─────────────┘
#> ┌─────────────┴─────────────┐
#> │         PROJECTION        │
#> │    ────────────────────   │
#> │      passenger_count      │
#> │          tip_pct          │
#> │                           │
#> │       ~3524697 Rows       │
#> └─────────────┬─────────────┘
#> ┌─────────────┴─────────────┐
#> │         PROJECTION        │
#> │    ────────────────────   │
#> │      passenger_count      │
#> │          tip_pct          │
#> │                           │
#> │       ~3524697 Rows       │
#> └─────────────┬─────────────┘
#> ┌─────────────┴─────────────┐
#> │         PROJECTION        │
#> │    ────────────────────   │
#> │      passenger_count      │
#> │         tip_amount        │
#> │        total_amount       │
#> │                           │
#> │       ~3524697 Rows       │
#> └─────────────┬─────────────┘
#> ┌─────────────┴─────────────┐
#> │           FILTER          │
#> │    ────────────────────   │
#> │ (NOT ((passenger_count IS │
#> │     NULL) OR isnan(CAST   │
#> │ (passenger_count AS DOUBLE│
#> │            ))))           │
#> │                           │
#> │       ~3524697 Rows       │
#> └─────────────┬─────────────┘
#> ┌─────────────┴─────────────┐
#> │       READ_PARQUET        │
#> │    ────────────────────   │
#> │         Function:         │
#> │        READ_PARQUET       │
#> │                           │
#> │        Projections:       │
#> │        total_amount       │
#> │      passenger_count      │
#> │         tip_amount        │
#> │                           │
#> │          Filters:         │
#> │    total_amount>0.0 AND   │
#> │  total_amount IS NOT NULL │
#> │                           │
#> │       ~17623488 Rows      │
#> └───────────────────────────┘
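
The class-based rule described before the reprex can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: `first_class()` and `classes_comparable()` are hypothetical helper names, and the only rules encoded are the ones mentioned above (same first class, or integer vs. numeric).

```r
# Hypothetical helper (not duckplyr's actual code): reduce an object to the
# first element of its class vector, e.g. "POSIXct" for c("POSIXct", "POSIXt").
first_class <- function(x) {
  class(x)[[1]]
}

# Hypothetical rule: two objects are treated as comparable when their first
# classes match, or when one is integer and the other is numeric.
classes_comparable <- function(x, y) {
  cx <- first_class(x)
  cy <- first_class(y)
  if (identical(cx, cy)) {
    return(TRUE)
  }
  all(c(cx, cy) %in% c("integer", "numeric"))
}

class(ISOdate(2010, 1, 1, 0))
#> [1] "POSIXct" "POSIXt"
first_class(ISOdate(2010, 1, 1, 0))
#> [1] "POSIXct"
classes_comparable(1L, 2.5)
#> [1] TRUE
classes_comparable(1L, "a")
#> [1] FALSE
```

Taking only the first class keeps the check simple, at the cost of ignoring any further classes an object carries.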

krlmlr · 2024-11-06T11:46:38Z

Let's see if this has an effect on the continuous benchmarks.

github-actions · 2024-11-06T13:20:59Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if bfd1055 is merged into main:

✔️001_tpch_01: 26.7ms -> 27.2ms [-1.83%, +5.69%]
✔️001_tpch_02: 144ms -> 144ms [-2.14%, +2.46%]
✔️001_tpch_03: 75.2ms -> 73.9ms [-3.79%, +0.38%]
✔️001_tpch_04: 25.7ms -> 25.2ms [-4.17%, +0.23%]
✔️001_tpch_05: 135ms -> 136ms [-1.5%, +1.97%]
🚀001_tpch_06: 16.4ms -> 15.6ms [-8.16%, -1.52%]
✔️001_tpch_07: 155ms -> 155ms [-1.23%, +0.98%]
✔️001_tpch_08: 174ms -> 175ms [-0.62%, +1.55%]
✔️001_tpch_09: 145ms -> 145ms [-1.26%, +1.55%]
✔️001_tpch_10: 84.3ms -> 84.5ms [-1.57%, +2.06%]
✔️001_tpch_11: 75.8ms -> 76.3ms [-1.61%, +2.8%]
✔️001_tpch_12: 69.8ms -> 70.8ms [-0.55%, +3.29%]
❗🐌001_tpch_13: 22ms -> 22.4ms [+0.07%, +3.22%]
✔️001_tpch_14: 23.8ms -> 23.5ms [-3.13%, +0.61%]
✔️001_tpch_15: 77.1ms -> 77.4ms [-1.85%, +2.51%]
✔️001_tpch_16: 81.2ms -> 80.1ms [-3.32%, +0.6%]
✔️001_tpch_17: 30.9ms -> 31.1ms [-1.29%, +2.49%]
✔️001_tpch_18: 28.2ms -> 28.9ms [-3.26%, +7.97%]
✔️001_tpch_19: 157ms -> 156ms [-3.9%, +3.2%]
✔️001_tpch_20: 92ms -> 93.2ms [-0.4%, +2.8%]
✔️001_tpch_21: 211ms -> 212ms [-1.33%, +2.25%]
✔️001_tpch_22: 150ms -> 152ms [-0.46%, +2.07%]
✔️010_tpch_01: 83.5ms -> 82.3ms [-4.56%, +1.8%]
✔️010_tpch_02: 74.3ms -> 74.3ms [-1.55%, +1.43%]
🚀010_tpch_03: 68.8ms -> 65.9ms [-6.25%, -2.09%]
🚀010_tpch_04: 51.6ms -> 47ms [-11.74%, -6.11%]
✔️010_tpch_05: 98.2ms -> 99.1ms [-2.93%, +4.71%]
✔️010_tpch_06: 37.8ms -> 36.2ms [-10.51%, +2.17%]
✔️010_tpch_07: 120ms -> 119ms [-2.81%, +0.11%]
🚀010_tpch_08: 146ms -> 141ms [-5.84%, -0.08%]
✔️010_tpch_09: 125ms -> 124ms [-2.61%, +0.75%]
✔️010_tpch_10: 89.8ms -> 91.4ms [-2.68%, +6.09%]
✔️010_tpch_11: 42.4ms -> 41.5ms [-4.89%, +0.76%]
✔️010_tpch_12: 65.4ms -> 64.2ms [-5.07%, +1.25%]
✔️010_tpch_13: 56ms -> 56.5ms [-1.34%, +3.12%]
🚀010_tpch_14: 43.9ms -> 42.7ms [-4.33%, -1.12%]
🚀010_tpch_15: 62.2ms -> 56.9ms [-15.31%, -1.73%]
✔️010_tpch_16: 49ms -> 48.6ms [-4.83%, +3.16%]
✔️010_tpch_17: 61ms -> 60.6ms [-7.44%, +6.24%]
✔️010_tpch_18: 58.5ms -> 63.2ms [-0.99%, +17.23%]
✔️010_tpch_19: 128ms -> 127ms [-2.46%, +1.03%]
🚀010_tpch_20: 77.5ms -> 74.8ms [-6.3%, -0.68%]
🚀010_tpch_21: 482ms -> 442ms [-10.74%, -5.85%]
✔️010_tpch_22: 84.8ms -> 84.2ms [-3.21%, +1.93%]
✔️100_tpch_01: 1.32s -> 1.35s [-6.42%, +10.29%]
✔️100_tpch_02: 126ms -> 125ms [-1.94%, +0.93%]
✔️100_tpch_03: 1.27s -> 1.25s [-5%, +1.25%]
✔️100_tpch_04: 1.23s -> 1.23s [-4.96%, +5.02%]
✔️100_tpch_05: 1.33s -> 1.31s [-4.54%, +2.5%]
✔️100_tpch_06: 1.15s -> 1.12s [-10.12%, +4.93%]
✔️100_tpch_07: 1.26s -> 1.26s [-2.7%, +2.33%]
✔️100_tpch_08: 1.28s -> 1.29s [-5.95%, +7.53%]
✔️100_tpch_09: 1.36s -> 1.35s [-5.42%, +4.84%]
✔️100_tpch_10: 1.28s -> 1.26s [-8.04%, +5.26%]
✔️100_tpch_11: 83.6ms -> 92.8ms [-4.37%, +26.38%]
✔️100_tpch_12: 1.21s -> 1.23s [-2.68%, +5.13%]
✔️100_tpch_13: 347ms -> 351ms [-6.23%, +8.25%]
✔️100_tpch_14: 1.17s -> 1.17s [-3.34%, +4.17%]
✔️100_tpch_15: 1.26s -> 1.27s [-2.5%, +2.72%]
✔️100_tpch_16: 134ms -> 126ms [-25.36%, +13.35%]
❗🐌100_tpch_17: 1.17s -> 1.19s [+0.11%, +3.04%]
✔️100_tpch_18: 1.26s -> 1.26s [-3.07%, +3.4%]
✔️100_tpch_19: 1.39s -> 1.37s [-7.69%, +5.25%]
✔️100_tpch_20: 1.26s -> 1.22s [-8.82%, +2.27%]
✔️100_tpch_21: 2.43s -> 2.41s [-7.78%, +6.01%]
✔️100_tpch_22: 195ms -> 185ms [-15.31%, +4.36%]

Further explanation regarding interpretation and methodology can be found in the documentation.
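
As a reading aid, the markers in the table above appear to follow the sign of the confidence interval: 🚀 when the whole interval lies below zero, ❗🐌 when it lies wholly above zero, and ✔️ when it includes zero. The sketch below encodes that reading. It is an assumption inferred from the rows shown here, not the bot's documented methodology, and `classify_change()` is a hypothetical name.

```r
# Hypothetical classifier inferred from the rows above; the bot's real
# methodology is described in its own documentation.
# Arguments are the lower and upper bounds of the 95% interval, in percent.
classify_change <- function(lower_pct, upper_pct) {
  if (upper_pct < 0) {
    "faster" # whole interval below zero
  } else if (lower_pct > 0) {
    "slower" # whole interval above zero
  } else {
    "no significant change" # interval includes zero
  }
}

classify_change(-8.16, -1.52) # 001_tpch_06
#> [1] "faster"
classify_change(0.07, 3.22) # 001_tpch_13
#> [1] "slower"
classify_change(-1.83, 5.69) # 001_tpch_01
#> [1] "no significant change"
```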

krlmlr · 2024-11-06T17:24:53Z

I'm somewhat surprised to see that the timings aren't improving for the sf=1 runs (100_*). I'll work on making the plan available as JSON so that we can compare before-after at the level of the plan.

github-actions · 2024-11-06T18:46:02Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if e8a5627 is merged into main:

❗🐌001_tpch_01: 23.7ms -> 24.7ms [+0.48%, +7.86%]
✔️001_tpch_02: 128ms -> 129ms [-0.54%, +1.13%]
✔️001_tpch_03: 66.6ms -> 66.2ms [-1.76%, +0.53%]
🚀001_tpch_04: 24.5ms -> 23.5ms [-5.25%, -2.88%]
✔️001_tpch_05: 125ms -> 125ms [-1.52%, +1.26%]
🚀001_tpch_06: 15.2ms -> 14.7ms [-4.96%, -1.13%]
✔️001_tpch_07: 144ms -> 144ms [-1.38%, +2.33%]
✔️001_tpch_08: 162ms -> 162ms [-0.89%, +0.85%]
✔️001_tpch_09: 133ms -> 133ms [-2.02%, +1.29%]
✔️001_tpch_10: 77.5ms -> 78.8ms [-0.36%, +3.84%]
✔️001_tpch_11: 65.5ms -> 65.6ms [-1.99%, +2.08%]
❗🐌001_tpch_12: 29.3ms -> 60ms [+101.97%, +106.99%]
✔️001_tpch_13: 20.3ms -> 20.6ms [-0.41%, +3.97%]
🚀001_tpch_14: 21.9ms -> 21.5ms [-3.47%, -0.39%]
✔️001_tpch_15: 64.6ms -> 64.1ms [-2.2%, +0.79%]
✔️001_tpch_16: 65.2ms -> 65.8ms [-0.02%, +1.98%]
✔️001_tpch_17: 27.8ms -> 28.1ms [-0.56%, +3.1%]
✔️001_tpch_18: 23.5ms -> 23.5ms [-1.36%, +1.69%]
✔️001_tpch_19: 132ms -> 131ms [-2.08%, 0%]
✔️001_tpch_20: 79.7ms -> 79.4ms [-2.09%, +1.49%]
✔️001_tpch_21: 176ms -> 177ms [-0.43%, +1.47%]
✔️001_tpch_22: 130ms -> 129ms [-2.08%, +0.51%]
✔️010_tpch_01: 83.6ms -> 81.9ms [-6.93%, +2.85%]
✔️010_tpch_02: 71.4ms -> 71.9ms [-0.55%, +1.74%]
🚀010_tpch_03: 65.7ms -> 64.1ms [-4.19%, -0.79%]
🚀010_tpch_04: 49.9ms -> 45.7ms [-12.02%, -4.64%]
🚀010_tpch_05: 95.4ms -> 93.4ms [-3.28%, -0.86%]
🚀010_tpch_06: 37.1ms -> 33.5ms [-14.83%, -4.34%]
🚀010_tpch_07: 114ms -> 111ms [-3.75%, -0.97%]
✔️010_tpch_08: 134ms -> 132ms [-2.55%, +0.12%]
✔️010_tpch_09: 118ms -> 119ms [-1.1%, +2.98%]
✔️010_tpch_10: 84.6ms -> 84.2ms [-2.32%, +1.35%]
✔️010_tpch_11: 38.3ms -> 38.3ms [-3.25%, +3.44%]
🚀010_tpch_12: 63.9ms -> 60.3ms [-8.38%, -2.81%]
✔️010_tpch_13: 52.7ms -> 52.4ms [-1.62%, +0.22%]
🚀010_tpch_14: 42.2ms -> 39.6ms [-7.34%, -5.19%]
🚀010_tpch_15: 59.4ms -> 55.8ms [-11.1%, -0.89%]
✔️010_tpch_16: 45.5ms -> 45.2ms [-3.2%, +1.82%]
✔️010_tpch_17: 57.5ms -> 55.6ms [-9.84%, +3.44%]
✔️010_tpch_18: 55.5ms -> 54.6ms [-7.86%, +4.55%]
✔️010_tpch_19: 124ms -> 123ms [-1.88%, +0.46%]
✔️010_tpch_20: 73.2ms -> 70.3ms [-8.05%, +0.18%]
🚀010_tpch_21: 420ms -> 380ms [-11%, -7.97%]
✔️010_tpch_22: 77.5ms -> 77.7ms [-1.53%, +1.98%]
✔️100_tpch_01: 1.22s -> 1.21s [-2.58%, +1.52%]
✔️100_tpch_02: 122ms -> 122ms [-3.03%, +2.07%]
🚀100_tpch_03: 1.15s -> 1.1s [-7.28%, -0.02%]
✔️100_tpch_04: 1.1s -> 1.09s [-5.37%, +3.25%]
✔️100_tpch_05: 1.23s -> 1.2s [-6.71%, +1.49%]
✔️100_tpch_06: 1.02s -> 1.02s [-4.24%, +3.94%]
✔️100_tpch_07: 1.16s -> 1.15s [-4.97%, +3.51%]
✔️100_tpch_08: 1.17s -> 1.18s [-1.78%, +2.97%]
✔️100_tpch_09: 1.27s -> 1.29s [-2.29%, +5.32%]
✔️100_tpch_10: 1.15s -> 1.15s [-1.95%, +2.62%]
✔️100_tpch_11: 80.7ms -> 89.2ms [-17.06%, +37.89%]
🚀100_tpch_12: 1.13s -> 1.11s [-3.73%, -0.55%]
✔️100_tpch_13: 334ms -> 310ms [-19.05%, +4.68%]
✔️100_tpch_14: 1.05s -> 1.05s [-0.96%, +1.42%]
🚀100_tpch_15: 1.15s -> 1.12s [-3.99%, -0.85%]
✔️100_tpch_16: 128ms -> 123ms [-12.52%, +4.31%]
🚀100_tpch_17: 1.12s -> 1.09s [-4.23%, -0.21%]
✔️100_tpch_18: 1.12s -> 1.13s [-1.23%, +2.36%]
✔️100_tpch_19: 1.21s -> 1.21s [-0.28%, +0.5%]
🚀100_tpch_20: 1.11s -> 1.08s [-3.95%, -0.86%]
✔️100_tpch_21: 2.29s -> 2.22s [-6.94%, +0.75%]
✔️100_tpch_22: 175ms -> 179ms [-4.51%, +9.39%]

Further explanation regarding interpretation and methodology can be found in the documentation.

toppyy · 2024-11-06T19:51:10Z

Any filters using equality operators were not pushed down due to a rather obvious bug I had missed. I pushed a commit fixing it.

The plans as JSON might come in handy anyway.

github-actions · 2024-11-06T21:10:11Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 9c7d533 is merged into main:

✔️001_tpch_01: 23.9ms -> 24.1ms [-2.88%, +4.74%]
✔️001_tpch_02: 132ms -> 131ms [-2.21%, +0.75%]
✔️001_tpch_03: 69ms -> 67.8ms [-3.54%, +0.16%]
🚀001_tpch_04: 24.3ms -> 23.3ms [-6.44%, -1.79%]
✔️001_tpch_05: 125ms -> 124ms [-1.69%, +1.01%]
🚀001_tpch_06: 15.5ms -> 14.9ms [-6.49%, -0.31%]
🚀001_tpch_07: 144ms -> 142ms [-2.3%, -0.28%]
🚀001_tpch_08: 164ms -> 161ms [-3.77%, -0.53%]
✔️001_tpch_09: 135ms -> 134ms [-1.5%, +0.88%]
✔️001_tpch_10: 78.3ms -> 77.2ms [-3.05%, +0.29%]
✔️001_tpch_11: 65.3ms -> 65.3ms [-1.84%, +1.85%]
❗🐌001_tpch_12: 29ms -> 60.5ms [+104.84%, +112.47%]
✔️001_tpch_13: 19.9ms -> 20.1ms [-0.56%, +2.31%]
🚀001_tpch_14: 21.6ms -> 21ms [-5.34%, -0.19%]
✔️001_tpch_15: 65.3ms -> 64.5ms [-3.41%, +0.67%]
✔️001_tpch_16: 67.5ms -> 68.1ms [-1.13%, +3.14%]
✔️001_tpch_17: 27.4ms -> 27.1ms [-2.92%, +0.99%]
✔️001_tpch_18: 23.5ms -> 23.9ms [-1.28%, +4.87%]
✔️001_tpch_19: 134ms -> 134ms [-1.39%, +1.8%]
🚀001_tpch_20: 81.5ms -> 79.3ms [-4.42%, -1.15%]
✔️001_tpch_21: 180ms -> 177ms [-2.78%, +0.08%]
✔️001_tpch_22: 130ms -> 131ms [-1.08%, +2.22%]
✔️010_tpch_01: 83.4ms -> 85.4ms [-7.35%, +12.27%]
✔️010_tpch_02: 71.6ms -> 71.1ms [-1.93%, +0.65%]
🚀010_tpch_03: 66.2ms -> 63ms [-6.45%, -3.16%]
🚀010_tpch_04: 51.3ms -> 45.8ms [-16.78%, -4.87%]
🚀010_tpch_05: 94.8ms -> 93.3ms [-2.91%, -0.05%]
✔️010_tpch_06: 35.8ms -> 34.1ms [-11.95%, +2.14%]
🚀010_tpch_07: 115ms -> 110ms [-6.95%, -1.63%]
✔️010_tpch_08: 134ms -> 133ms [-2.46%, +1.34%]
✔️010_tpch_09: 118ms -> 117ms [-1.57%, +0.62%]
🚀010_tpch_10: 85.1ms -> 78.6ms [-11.76%, -3.51%]
✔️010_tpch_11: 39.4ms -> 39.1ms [-6.43%, +5.12%]
🚀010_tpch_12: 63.3ms -> 60.4ms [-8.42%, -0.76%]
✔️010_tpch_13: 52.2ms -> 54ms [-0.36%, +7.23%]
🚀010_tpch_14: 43.1ms -> 39.2ms [-13.33%, -4.54%]
✔️010_tpch_15: 59.2ms -> 56.5ms [-11.11%, +1.92%]
✔️010_tpch_16: 45.8ms -> 45.4ms [-4.14%, +2.64%]
🚀010_tpch_17: 57.1ms -> 54.7ms [-7.42%, -1%]
✔️010_tpch_18: 56.2ms -> 55.7ms [-6.5%, +4.7%]
🚀010_tpch_19: 124ms -> 118ms [-6.54%, -3.74%]
✔️010_tpch_20: 72.7ms -> 70.4ms [-6.29%, +0.01%]
🚀010_tpch_21: 422ms -> 384ms [-10.49%, -7.49%]
✔️010_tpch_22: 77.8ms -> 78.1ms [-3.13%, +3.95%]
✔️100_tpch_01: 1.22s -> 1.22s [-2.15%, +2.82%]
✔️100_tpch_02: 121ms -> 121ms [-1.7%, +2.45%]
✔️100_tpch_03: 1.11s -> 1.1s [-1.82%, +1.05%]
✔️100_tpch_04: 1.12s -> 1.09s [-8.19%, +3.01%]
✔️100_tpch_05: 1.21s -> 1.21s [-5.64%, +5.44%]
✔️100_tpch_06: 1.03s -> 1.03s [-1.49%, +0.67%]
🚀100_tpch_07: 1.19s -> 1.15s [-6.14%, -0.49%]
✔️100_tpch_08: 1.2s -> 1.2s [-1.83%, +1.8%]
✔️100_tpch_09: 1.27s -> 1.27s [-4.92%, +4.07%]
🚀100_tpch_10: 1.18s -> 1.12s [-7.32%, -1.72%]
✔️100_tpch_11: 81.2ms -> 88.1ms [-11.49%, +28.5%]
✔️100_tpch_12: 1.14s -> 1.13s [-3.86%, +1.21%]
✔️100_tpch_13: 320ms -> 327ms [-7.79%, +11.72%]
✔️100_tpch_14: 1.05s -> 1.05s [-1.79%, +0.81%]
✔️100_tpch_15: 1.17s -> 1.13s [-7.74%, +1.65%]
✔️100_tpch_16: 126ms -> 122ms [-7.95%, +1.5%]
✔️100_tpch_17: 1.12s -> 1.1s [-6.49%, +3.13%]
✔️100_tpch_18: 1.14s -> 1.14s [-4.54%, +3.7%]
✔️100_tpch_19: 1.22s -> 1.22s [-2.68%, +2.89%]
✔️100_tpch_20: 1.12s -> 1.11s [-3.42%, +1.72%]
✔️100_tpch_21: 2.28s -> 2.21s [-8.49%, +2.58%]
✔️100_tpch_22: 173ms -> 171ms [-7.49%, +5.25%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions · 2024-11-06T22:45:04Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 6afb054 is merged into main:

✔️001_tpch_01: 24.9ms -> 24.2ms [-7.05%, +1.02%]
✔️001_tpch_02: 137ms -> 137ms [-3.09%, +3.01%]
✔️001_tpch_03: 75ms -> 74.5ms [-2.93%, +1.44%]
🚀001_tpch_04: 25.7ms -> 24.3ms [-7.65%, -3.01%]
✔️001_tpch_05: 125ms -> 125ms [-1.92%, +2.1%]
✔️001_tpch_06: 15.4ms -> 15.1ms [-4.28%, +0.79%]
🚀001_tpch_07: 156ms -> 153ms [-2.83%, -0.59%]
✔️001_tpch_08: 172ms -> 171ms [-1.93%, +0.01%]
✔️001_tpch_09: 140ms -> 141ms [-0.42%, +3.08%]
✔️001_tpch_10: 83.2ms -> 82.8ms [-2.49%, +1.37%]
✔️001_tpch_11: 71.2ms -> 71.3ms [-1.7%, +2.11%]
❗🐌001_tpch_12: 30.2ms -> 62.8ms [+105.12%, +110.51%]
✔️001_tpch_13: 20.9ms -> 21.1ms [-1.28%, +3.14%]
🚀001_tpch_14: 22.9ms -> 22.4ms [-3.53%, -0.79%]
✔️001_tpch_15: 72.1ms -> 71.9ms [-3.27%, +2.61%]
✔️001_tpch_16: 69.8ms -> 69.9ms [-1.69%, +1.95%]
🚀001_tpch_17: 29ms -> 28.5ms [-3.18%, -0.28%]
✔️001_tpch_18: 24.6ms -> 24.4ms [-2.52%, +0.91%]
✔️001_tpch_19: 137ms -> 138ms [-1.37%, +1.71%]
✔️001_tpch_20: 86.6ms -> 85.1ms [-4.93%, +1.43%]
✔️001_tpch_21: 186ms -> 187ms [-1.14%, +1.84%]
✔️001_tpch_22: 137ms -> 138ms [-0.37%, +1.66%]
✔️010_tpch_01: 87ms -> 83.7ms [-10.38%, +2.63%]
✔️010_tpch_02: 76.4ms -> 76.3ms [-3.27%, +2.92%]
🚀010_tpch_03: 68.1ms -> 63.9ms [-11.2%, -1.03%]
🚀010_tpch_04: 49.2ms -> 45.6ms [-9%, -5.69%]
🚀010_tpch_05: 94.7ms -> 92.2ms [-3.77%, -1.36%]
✔️010_tpch_06: 35.5ms -> 34.8ms [-7.69%, +3.57%]
🚀010_tpch_07: 114ms -> 111ms [-4.6%, -0.86%]
🚀010_tpch_08: 133ms -> 131ms [-2.66%, -0.61%]
✔️010_tpch_09: 119ms -> 119ms [-2.29%, +1.55%]
🚀010_tpch_10: 84.7ms -> 78.7ms [-8.91%, -5.11%]
✔️010_tpch_11: 39.6ms -> 40.3ms [-2.93%, +6.7%]
🚀010_tpch_12: 63.2ms -> 59.9ms [-6.84%, -3.6%]
🚀010_tpch_13: 59.2ms -> 53.7ms [-17.47%, -0.9%]
🚀010_tpch_14: 42.4ms -> 40.5ms [-5.76%, -3.12%]
✔️010_tpch_15: 56.6ms -> 56.7ms [-6.91%, +6.96%]
✔️010_tpch_16: 46.4ms -> 45.9ms [-4.3%, +1.9%]
✔️010_tpch_17: 58.2ms -> 56.1ms [-9.66%, +2.4%]
✔️010_tpch_18: 57ms -> 56.7ms [-6.82%, +5.83%]
🚀010_tpch_19: 125ms -> 120ms [-5.93%, -1.65%]
✔️010_tpch_20: 74.9ms -> 73.2ms [-5.46%, +1.06%]
🚀010_tpch_21: 427ms -> 395ms [-10.14%, -4.76%]
✔️010_tpch_22: 78.9ms -> 80.5ms [-0.51%, +4.52%]
✔️100_tpch_01: 1.19s -> 1.22s [-0.4%, +4.86%]
✔️100_tpch_02: 125ms -> 126ms [-5.99%, +6.77%]
🚀100_tpch_03: 1.11s -> 1.09s [-3.01%, -0.5%]
✔️100_tpch_04: 1.08s -> 1.06s [-4.44%, +1.69%]
✔️100_tpch_05: 1.16s -> 1.17s [-1.24%, +3%]
✔️100_tpch_06: 1.02s -> 1.01s [-3.98%, +2.84%]
✔️100_tpch_07: 1.15s -> 1.13s [-4.35%, +0.68%]
✔️100_tpch_08: 1.15s -> 1.15s [-1.14%, +1.36%]
✔️100_tpch_09: 1.27s -> 1.26s [-5.59%, +3.06%]
✔️100_tpch_10: 1.15s -> 1.12s [-4.16%, +0.08%]
✔️100_tpch_11: 84.3ms -> 85.4ms [-5.72%, +8.22%]
✔️100_tpch_12: 1.12s -> 1.1s [-5.92%, +2.37%]
✔️100_tpch_13: 334ms -> 322ms [-14.29%, +7.21%]
✔️100_tpch_14: 1.04s -> 1.03s [-2.89%, +2.2%]
✔️100_tpch_15: 1.14s -> 1.11s [-6.44%, +2.35%]
✔️100_tpch_16: 133ms -> 131ms [-12.08%, +8.95%]
✔️100_tpch_17: 1.09s -> 1.09s [-3.48%, +2.56%]
✔️100_tpch_18: 1.11s -> 1.12s [-0.96%, +2.35%]
🚀100_tpch_19: 1.24s -> 1.2s [-4.71%, -1.12%]
✔️100_tpch_20: 1.1s -> 1.1s [-1.92%, +1.47%]
✔️100_tpch_21: 2.24s -> 2.2s [-7.52%, +3.64%]
✔️100_tpch_22: 175ms -> 174ms [-5.35%, +4.49%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions · 2024-11-10T12:02:37Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if beb517b is merged into main:

✔️001_tpch_01: 24.2ms -> 24.5ms [-2.72%, +4.94%]
✔️001_tpch_02: 130ms -> 129ms [-1.74%, +0.34%]
🚀001_tpch_03: 69.1ms -> 67.7ms [-3.61%, -0.25%]
✔️001_tpch_04: 24.5ms -> 23.9ms [-4.46%, +0.24%]
✔️001_tpch_05: 123ms -> 122ms [-2.21%, +1.23%]
🚀001_tpch_06: 14.9ms -> 14.3ms [-5.44%, -2.06%]
✔️001_tpch_07: 136ms -> 134ms [-3.37%, +0.25%]
🚀001_tpch_08: 154ms -> 152ms [-1.89%, -0.02%]
✔️001_tpch_09: 128ms -> 128ms [-1.11%, +1.38%]
✔️001_tpch_10: 73.8ms -> 72.8ms [-2.8%, +0.1%]
✔️001_tpch_11: 61ms -> 60.9ms [-1.6%, +1.31%]
❗🐌001_tpch_12: 28.5ms -> 55.8ms [+93.48%, +97.51%]
✔️001_tpch_13: 19.8ms -> 19.9ms [-0.96%, +1.63%]
✔️001_tpch_14: 21.4ms -> 21.1ms [-3.04%, +0.29%]
✔️001_tpch_15: 61.7ms -> 60.8ms [-3.37%, +0.15%]
✔️001_tpch_16: 64.2ms -> 64.1ms [-1.8%, +1.42%]
🚀001_tpch_17: 27.2ms -> 26.8ms [-2.45%, -0.19%]
✔️001_tpch_18: 23.3ms -> 23.2ms [-2.41%, +1.51%]
🚀001_tpch_19: 131ms -> 129ms [-2.91%, -0.35%]
✔️001_tpch_20: 76.7ms -> 76ms [-2.33%, +0.46%]
✔️001_tpch_21: 171ms -> 172ms [-1.61%, +2.23%]
✔️001_tpch_22: 123ms -> 123ms [-1.52%, +1.76%]
✔️010_tpch_01: 82.7ms -> 83.3ms [-3.68%, +5%]
✔️010_tpch_02: 71.1ms -> 71.4ms [-1.71%, +2.58%]
✔️010_tpch_03: 65.2ms -> 63.8ms [-6.4%, +2.09%]
🚀010_tpch_04: 48.1ms -> 45.4ms [-8.2%, -2.85%]
✔️010_tpch_05: 93.6ms -> 93.6ms [-3.48%, +3.5%]
🚀010_tpch_06: 34.8ms -> 33.2ms [-7.83%, -1.22%]
✔️010_tpch_07: 113ms -> 111ms [-6.51%, +2.09%]
🚀010_tpch_08: 132ms -> 128ms [-3.61%, -2.06%]
✔️010_tpch_09: 116ms -> 118ms [-2.36%, +5.85%]
🚀010_tpch_10: 83.1ms -> 77.6ms [-8.34%, -5.02%]
✔️010_tpch_11: 38.7ms -> 37.6ms [-6.71%, +0.94%]
🚀010_tpch_12: 62.2ms -> 59.4ms [-6.16%, -2.85%]
✔️010_tpch_13: 51.8ms -> 52.6ms [-1.07%, +4.06%]
🚀010_tpch_14: 42.1ms -> 38.9ms [-12.65%, -2.48%]
🚀010_tpch_15: 60.3ms -> 54.2ms [-16.31%, -3.85%]
✔️010_tpch_16: 45.6ms -> 45.1ms [-4.27%, +2.18%]
✔️010_tpch_17: 55.8ms -> 54.3ms [-5.29%, +0.16%]
✔️010_tpch_18: 54.3ms -> 57.8ms [-0.39%, +13.39%]
✔️010_tpch_19: 123ms -> 120ms [-7.43%, +3.58%]
🚀010_tpch_20: 74.7ms -> 68.6ms [-10.87%, -5.37%]
🚀010_tpch_21: 421ms -> 380ms [-11.47%, -7.92%]
✔️010_tpch_22: 77.3ms -> 78.1ms [-0.98%, +3%]
✔️100_tpch_01: 1.2s -> 1.16s [-6.6%, +0.89%]
✔️100_tpch_02: 120ms -> 120ms [-2.18%, +1.9%]
✔️100_tpch_03: 1.07s -> 1.05s [-5.59%, +1.13%]
✔️100_tpch_04: 1.04s -> 1.02s [-5.78%, +2.63%]
✔️100_tpch_05: 1.15s -> 1.13s [-3.78%, +1.06%]
🚀100_tpch_06: 999ms -> 973ms [-4.69%, -0.62%]
✔️100_tpch_07: 1.11s -> 1.1s [-3.95%, +1.72%]
✔️100_tpch_08: 1.13s -> 1.12s [-3.85%, +2.35%]
✔️100_tpch_09: 1.18s -> 1.2s [-1.33%, +3.91%]
✔️100_tpch_10: 1.11s -> 1.08s [-5.06%, +0.63%]
✔️100_tpch_11: 81.5ms -> 80.7ms [-5.09%, +2.99%]
✔️100_tpch_12: 1.08s -> 1.06s [-4.12%, +0.15%]
✔️100_tpch_13: 307ms -> 305ms [-5.43%, +4.01%]
🚀100_tpch_14: 992ms -> 974ms [-3.23%, -0.41%]
🚀100_tpch_15: 1.11s -> 1.06s [-8.19%, -0.81%]
✔️100_tpch_16: 122ms -> 125ms [-7.12%, +13.21%]
✔️100_tpch_17: 1.04s -> 1.04s [-3.53%, +2.11%]
✔️100_tpch_18: 1.07s -> 1.07s [-2.42%, +1.99%]
✔️100_tpch_19: 1.17s -> 1.16s [-5.19%, +2.48%]
✔️100_tpch_20: 1.05s -> 1.04s [-3.84%, +0.84%]
✔️100_tpch_21: 2.17s -> 2.21s [-3.09%, +6.66%]
✔️100_tpch_22: 173ms -> 171ms [-5.84%, +3.83%]

Further explanation regarding interpretation and methodology can be found in the documentation.

github-actions · 2024-11-12T20:28:25Z

This is how benchmark results would change (along with a 95% confidence interval in relative change) if 6ae7c03 is merged into main:

✔️001_tpch_01: 25.5ms -> 25ms [-6.78%, +2.98%]
✔️001_tpch_02: 132ms -> 131ms [-2.28%, +0.8%]
✔️001_tpch_03: 68.1ms -> 68.1ms [-1.71%, +1.74%]
✔️001_tpch_04: 24.9ms -> 24.3ms [-4.82%, +0.01%]
✔️001_tpch_05: 126ms -> 126ms [-1.36%, +1.06%]
✔️001_tpch_06: 16.6ms -> 16.2ms [-5.63%, +1.3%]
✔️001_tpch_07: 146ms -> 146ms [-2.56%, +1.65%]
✔️001_tpch_08: 166ms -> 165ms [-2.31%, +0.29%]
✔️001_tpch_09: 140ms -> 139ms [-2.43%, +0.73%]
✔️001_tpch_10: 81.9ms -> 80.6ms [-3.68%, +0.42%]
✔️001_tpch_11: 68.8ms -> 68.1ms [-3.3%, +1.12%]
❗🐌001_tpch_12: 30.3ms -> 61.6ms [+100.63%, +106.35%]
✔️001_tpch_13: 21.3ms -> 21.2ms [-3.71%, +2.63%]
✔️001_tpch_14: 22.5ms -> 22.4ms [-2.99%, +2.14%]
✔️001_tpch_15: 66.7ms -> 66.4ms [-2.82%, +1.86%]
✔️001_tpch_16: 70.1ms -> 69.8ms [-2.73%, +1.89%]
🚀001_tpch_17: 28.9ms -> 28.2ms [-3.81%, -0.92%]
✔️001_tpch_18: 24.2ms -> 23.9ms [-3.23%, +0.64%]
✔️001_tpch_19: 133ms -> 133ms [-1.14%, +1.35%]
✔️001_tpch_20: 82.5ms -> 82.1ms [-2.69%, +1.76%]
✔️001_tpch_21: 179ms -> 178ms [-1.76%, +1.14%]
✔️001_tpch_22: 128ms -> 128ms [-1.52%, +0.93%]
✔️010_tpch_01: 83.5ms -> 85.7ms [-6.02%, +11.41%]
✔️010_tpch_02: 73.3ms -> 73.5ms [-2.72%, +3.36%]
🚀010_tpch_03: 68.2ms -> 63.2ms [-12.35%, -2.49%]
🚀010_tpch_04: 49.9ms -> 45.7ms [-9.78%, -7.1%]
✔️010_tpch_05: 95.9ms -> 95ms [-2.84%, +0.81%]
✔️010_tpch_06: 36ms -> 35.8ms [-7.39%, +6.37%]
✔️010_tpch_07: 118ms -> 115ms [-5.77%, +1.06%]
🚀010_tpch_08: 135ms -> 134ms [-2.16%, -0.04%]
✔️010_tpch_09: 120ms -> 119ms [-2.43%, +1.68%]
🚀010_tpch_10: 88.7ms -> 79.7ms [-14.37%, -5.99%]
✔️010_tpch_11: 40.3ms -> 41.1ms [-2.49%, +6.4%]
🚀010_tpch_12: 66.2ms -> 60.9ms [-11.88%, -4.07%]
✔️010_tpch_13: 56.1ms -> 53.5ms [-12.7%, +3.38%]
🚀010_tpch_14: 43.1ms -> 40.7ms [-7.78%, -3.59%]
✔️010_tpch_15: 60.5ms -> 57.6ms [-10.66%, +1.15%]
✔️010_tpch_16: 45.4ms -> 45.5ms [-2.26%, +2.72%]
✔️010_tpch_17: 57.9ms -> 55.7ms [-7.95%, +0.09%]
✔️010_tpch_18: 54.6ms -> 55.2ms [-4.9%, +7.16%]
🚀010_tpch_19: 124ms -> 119ms [-5.95%, -1.57%]
🚀010_tpch_20: 76.2ms -> 70.4ms [-11.77%, -3.55%]
🚀010_tpch_21: 430ms -> 390ms [-11.69%, -6.95%]
✔️010_tpch_22: 78.7ms -> 79.2ms [-1.78%, +2.95%]
✔️100_tpch_01: 1.23s -> 1.24s [-2.57%, +5.28%]
✔️100_tpch_02: 129ms -> 132ms [-12.86%, +18.54%]
🚀100_tpch_03: 1.14s -> 1.1s [-5.75%, -1.33%]
✔️100_tpch_04: 1.11s -> 1.08s [-5.39%, +0.88%]
✔️100_tpch_05: 1.21s -> 1.2s [-3.65%, +2.54%]
✔️100_tpch_06: 1.01s -> 1.01s [-5.03%, +3.83%]
✔️100_tpch_07: 1.17s -> 1.17s [-5.5%, +4.99%]
✔️100_tpch_08: 1.18s -> 1.19s [-0.59%, +2.24%]
✔️100_tpch_09: 1.28s -> 1.27s [-4.45%, +4.06%]
✔️100_tpch_10: 1.17s -> 1.14s [-5.41%, +1.34%]
✔️100_tpch_11: 90.4ms -> 81.6ms [-26.98%, +7.62%]
✔️100_tpch_12: 1.13s -> 1.12s [-6.06%, +3.06%]
✔️100_tpch_13: 329ms -> 342ms [-6.1%, +14.5%]
✔️100_tpch_14: 1.03s -> 1.02s [-1.88%, 0%]
✔️100_tpch_15: 1.12s -> 1.12s [-5.91%, +4.54%]
✔️100_tpch_16: 124ms -> 126ms [-0.03%, +3.52%]
✔️100_tpch_17: 1.08s -> 1.07s [-3.29%, +0.27%]
✔️100_tpch_18: 1.11s -> 1.12s [-1.22%, +3.47%]
✔️100_tpch_19: 1.19s -> 1.18s [-3.37%, +1.62%]
✔️100_tpch_20: 1.12s -> 1.12s [-4.16%, +4.23%]
✔️100_tpch_21: 2.24s -> 2.22s [-5.68%, +3.61%]
✔️100_tpch_22: 172ms -> 172ms [-3.9%, +4.47%]

Further explanation regarding interpretation and methodology can be found in the documentation.

krlmlr mentioned this pull request Oct 19, 2024

Review TPCH runs #284

Open

toppyy force-pushed the comparison_expr branch from ec5b2af to 7799d7c Compare November 2, 2024 08:50

toppyy marked this pull request as ready for review November 2, 2024 08:53

krlmlr force-pushed the comparison_expr branch from 9c74416 to bfd1055 Compare November 6, 2024 11:46

krlmlr force-pushed the comparison_expr branch from bfd1055 to e8a5627 Compare November 6, 2024 17:23

toppyy added 13 commits November 10, 2024 05:22

introduce comparison expressions

7617408

handle non-existant/NULL argument 'data'

c47f3a9

use classes for type comparisons; prefer map over sapply

328e1fd

test for translation of comparison exprs

1a65314

apply styler

c39794a

use missing() instead of hasArg()

02c9f2a

fix merge resolution gone awry

3dcbfdc

integer and numeric classes are comparable

c9625da

fix reference to tmp_expr

34fe6a9

infer class only returns first class

18d6c56

use only the first class of column

076edba

eq operator is '=='

d3881e4

update snap for comparison exprs

f0ba24c

krlmlr force-pushed the comparison_expr branch from 6afb054 to f0ba24c Compare November 10, 2024 04:23

krlmlr added 5 commits November 10, 2024 05:39

Sync

8a78eb0

Test column-column comparison

61d3337

Flip order

2c478e7

Flip order upstream

1bb001c

Sync

7509e9f

krlmlr force-pushed the comparison_expr branch from 5c3d091 to 7509e9f Compare November 10, 2024 09:58

krlmlr mentioned this pull request Nov 10, 2024

introduce comparison expressions #330

Draft

krlmlr added 2 commits November 10, 2024 11:37

Compat

8d1f6c7

Sync

beb517b

krlmlr mentioned this pull request Nov 10, 2024

feat: mutate() constructs intermediate data frames for each new variable #332

Merged

don't call expr() on comp exprs with alias

6ae7c03
