Shuffle gpu serde #28

Open
wants to merge 25 commits into base: 0625

Commits on May 14, 2024

  1. Support serializing packed tables directly for shuffle write

    ---------
    
    Signed-off-by: Firestarman <[email protected]>
    firestarman committed May 14, 2024
    3a984f2

Commits on May 16, 2024

  1. Disable GPU serde for the AQE tests

    Signed-off-by: Firestarman <[email protected]>
    firestarman committed May 16, 2024
    baadb4b
  2. Disable by default

    Signed-off-by: Firestarman <[email protected]>
    firestarman committed May 16, 2024
    11e933d

Commits on May 17, 2024

  1. Fix a build error

    Signed-off-by: Firestarman <[email protected]>
    firestarman committed May 17, 2024
    6e8bb5c

Commits on May 20, 2024

  1. Address comments

    Signed-off-by: Firestarman <[email protected]>
    firestarman committed May 20, 2024
    d6082ae
  2. Merge branch 'branch-24.06' of github.com:NVIDIA/spark-rapids into shuffle-gpu-serde
    
    Signed-off-by: Firestarman <[email protected]>
    firestarman committed May 20, 2024
    0419224

Commits on May 27, 2024

  1. Support buffering small tables for Shuffle read

    Signed-off-by: Firestarman <[email protected]>
    firestarman committed May 27, 2024
    99820e1

Commits on May 28, 2024

  1. 9727161

Commits on May 29, 2024

  1. Moving split batches to host by a single copying

    Signed-off-by: Firestarman <[email protected]>
    firestarman committed May 29, 2024
    1bb4cfc

Commits on Jun 25, 2024

  1. Add GpuBucketingUtils shim to Spark 4.0.0 (NVIDIA#11092)

    * Add GpuBucketingUtils shim to Spark 4.0.0
    
    * Signing off
    
    Signed-off-by: Raza Jafri <[email protected]>
    
    ---------
    
    Signed-off-by: Raza Jafri <[email protected]>
    razajafri authored Jun 25, 2024
    b3b5b5e
  2. Improve the diagnostics for 'conv' fallback explain (NVIDIA#11076)

    * Improve the diagnostics for 'conv' fallback explain
    
    Signed-off-by: Jihoon Son <[email protected]>
    
    * don't use nil
    
    Signed-off-by: Jihoon Son <[email protected]>
    
    * the bases should not be an empty string in the error message when the user input is not
    
    Signed-off-by: Jihoon Son <[email protected]>
    
    * more user-friendly message
    
    * Update sql-plugin/src/main/scala/org/apache/spark/sql/rapids/stringFunctions.scala
    
    Co-authored-by: Gera Shegalov <[email protected]>
    
    ---------
    
    Signed-off-by: Jihoon Son <[email protected]>
    Co-authored-by: Gera Shegalov <[email protected]>
    jihoonson and gerashegalov authored Jun 25, 2024
    6455396

Commits on Jun 26, 2024

  1. Disable ANSI mode for window function tests [databricks] (NVIDIA#11073)

    * Disable ANSI mode for window function tests.
    
    Fixes NVIDIA#11019.
    
    Window function tests fail on Spark 4.0 because of NVIDIA#5114 (and NVIDIA#5120 broadly),
    because spark-rapids does not support SUM, COUNT, and certain other aggregations
    in ANSI mode.
    
    This commit disables ANSI mode tests for the failing window function tests. These may be
    revisited, once error/overflow checking is available for ANSI mode in spark-rapids.
    
    Signed-off-by: MithunR <[email protected]>
    
    * Switch from @ansi_mode_disabled to @disable_ansi_mode.
    
    ---------
    
    Signed-off-by: MithunR <[email protected]>
    mythrocks authored Jun 26, 2024
    34e6bc8

Commits on Jun 27, 2024

  1. Fix some test issues in Spark UT and keep RapidsTestSettings update-to-date (NVIDIA#10997)
    
    * wip
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * fix json suite
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * wip
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * update
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * Remove all utc config and clean up
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * hardcode timezone to LA in ci
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * remove concat
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * remove spark timezone settings and only keep java timezone settings
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * remove unintentional comment
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * delete a comment
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * set timezone to utc for two suites to avoid fallback
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * style
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * after all
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * add cast string to timestamp back to exclude after upmerge
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    ---------
    
    Signed-off-by: Haoyang Li <[email protected]>
    thirtiseven authored Jun 27, 2024
    3cb54c4
  2. exclude a case based on JDK version (NVIDIA#11083)

    Signed-off-by: Haoyang Li <[email protected]>
    thirtiseven authored Jun 27, 2024
    9dafc54

Commits on Jun 28, 2024

  1. Replaced spark3xx-common references to spark-shared [databricks] (NVIDIA#11066)
    
    * Replaced spark3xx-common references to spark-shared
    
    * Signing off
    
    Signed-off-by: Raza Jafri <[email protected]>
    
    * addressed review comments
    
    * addressed review comments
    
    * removed todo as per review comment
    
    * Moving dependency to the related module because it was causing an error while running code coverage
    
    * Addressed review comments
    
    * Regenerated 2.13 poms
    
    ---------
    
    Signed-off-by: Raza Jafri <[email protected]>
    razajafri authored Jun 28, 2024
    3b6c5cd
  2. Fixed some cast_tests (NVIDIA#11049)

    Signed-off-by: Raza Jafri <[email protected]>
    razajafri authored Jun 28, 2024
    7dc52bc
  3. Fixed array_tests for Spark 4.0.0 [databricks] (NVIDIA#11048)

    * Fixed array_tests
    
    * Signing off
    
    Signed-off-by: Raza Jafri <[email protected]>
    
    * Disable ANSI for failing tests
    
    ---------
    
    Signed-off-by: Raza Jafri <[email protected]>
    razajafri authored Jun 28, 2024
    dd62000

Commits on Jun 29, 2024

  1. Add a heuristic to skip second or third agg pass (NVIDIA#10950)

    * add a heuristic to skip agg pass
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * commit doc change
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * refine naming
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * fix only reduction case
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * fix compile
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * fix
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * clean
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * fix doc
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * reduce premergeci2
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * reduce premergeci2, 2
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * use test_parallel to workaround flaky array test
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * address review comment
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * remove comma
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * workaround for  ci_scala213
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * disable agg ratio heuristic by default
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    * fix doc
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    
    ---------
    
    Signed-off-by: Hongbin Ma (Mahone) <[email protected]>
    binmahone authored Jun 29, 2024
    f954026
  2. Support regex patterns with brackets when rewriting to PrefixRange pattern in rlike. (NVIDIA#11088)
    
    * Remove bracket when necessary in PrefixRange pattern in Regex rewrite
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * add pytest cases
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    * fix scala 2.13 build
    
    Signed-off-by: Haoyang Li <[email protected]>
    
    ---------
    
    Signed-off-by: Haoyang Li <[email protected]>
    thirtiseven authored Jun 29, 2024
    2498204

Commits on Jul 1, 2024

  1. f56fe2c
  2. Spark 4: Handle ANSI mode in sort_test.py (NVIDIA#11099)

    * Spark 4: Handle ANSI mode in sort_test.py
    
    Fixes NVIDIA#11027.
    
    With ANSI mode enabled (like the default in Spark 4), one sees that some
    tests in `sort_test.py` fail, because they expect ANSI mode to be off.
    
    This commit disables running those tests with ANSI enabled, and adds a
    separate test for ANSI on/off.
    
    Signed-off-by: MithunR <[email protected]>
    
    * Refactored not to use disable_ansi_mode.
    
    These tests need not be revisited.  They test all combinations of ANSI mode,
    including overflow failures.
    
    Signed-off-by: MithunR <[email protected]>
    
    ---------
    
    Signed-off-by: MithunR <[email protected]>
    mythrocks authored Jul 1, 2024
    850365c

Commits on Jul 2, 2024

  1. Introduce LORE framework. (NVIDIA#11084)

    * Introduce lore id
    
    * Introduce lore id
    
    * Fix type
    
    * Fix type
    
    * Conf
    
    * style
    
    * part
    
    * Dump
    
    * Introduce lore framework
    
    * Add tests.
    
    * Rename test case
    
    Signed-off-by: liurenjie1024 <[email protected]>
    
    * Fix AQE test
    
    * Fix style
    
    * Use args to display lore info.
    
    * Fix build break
    
    * Fix path in loreinfo
    
    * Remove path
    
    * Fix comments
    
    * Update configs
    
    * Fix comments
    
    * Fix config
    
    ---------
    
    Signed-off-by: liurenjie1024 <[email protected]>
    liurenjie1024 authored Jul 2, 2024
    9bb295a
  2. b52038e

Commits on Jul 4, 2024

  1. d1

    Signed-off-by: Firestarman <[email protected]>
    firestarman committed Jul 4, 2024
    b4ea48f

Commits on Jul 15, 2024

  1. retry when copying data to host for merged buffers

    Signed-off-by: Firestarman <[email protected]>
    firestarman committed Jul 15, 2024
    8c4b318