
Drop spark31x shims [databricks] #11159

Merged 9 commits on Jul 12, 2024

Conversation

@NvTimLiu (Collaborator) commented Jul 9, 2024:

1. Remove spark31x json lines from the source code
2. Remove the files that are only for the spark31x shims
3. Move the files shared by the spark31x and spark32x+ shims into the sql-plugin/src/main/spark320 folder
4. Drop the spark31x shims in the build scripts and pom files

1. Remove spark31x json lines from the source code
2. Remove the files that are only for the spark31x shims
3. Move the files shared by the spark31x and spark32x+ shims into the sql-plugin/src/main/spark320 folder

Signed-off-by: Tim Liu <[email protected]>
    tests/src/test/spark311/scala/com/nvidia/spark/rapids/shims/OrcStatisticShim.scala
     -->
    tests/src/test/spark321cdh/scala/com/nvidia/spark/rapids/shims/OrcStatisticShim.scala

Check if we can merge this file into:

    tests/src/test/spark320/scala/com/nvidia/spark/rapids/shims/OrcStatisticShim.scala
Signed-off-by: Tim Liu <[email protected]>
@jlowe (Member) left a comment:

scala2.13 pom files need to be regenerated to pick up the latest diffs.

There are still references to spark31x or 3.1.x in the following files that should be easy to update:

  • CONTRIBUTING.md
  • api_validation/README.md
  • api_validation/src/main/scala/com/nvidia/spark/rapids/api/ApiValidation.scala
  • dist/README.md
  • docs/dev/shims.md
  • jenkins/hadoop-def.sh
  • jenkins/spark-nightly-build.sh
  • jenkins/version-def.sh
  • sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala

There are a number of things that should be done as followups to this, and I'll file issues to track:

  • Some files no longer need to be treated as a shim after the removal of the spark31x shims (i.e.: code is now the same for all supported Spark versions).
  • Tests need to be updated to remove tests specific to Spark 3.1.x

@@ -44,7 +44,7 @@ import org.apache.spark.util.MutableURLClassLoader
3. a smaller fraction of classes that differ under one of the supported Spark versions
com/nvidia/spark/SQLPlugin.class
spark-shared/com/nvidia/spark/rapids/CastExprMeta.class
-spark311/org/apache/spark/sql/rapids/GpuUnaryMinus.class
+spark320/org/apache/spark/sql/rapids/GpuUnaryMinus.class
Member:

This change doesn't make sense given the context. This line is now duplicated with the one below. Probably need to pick a different example class to use here (can examine the contents of a dist jar after a buildall for a class that has a spark320 version and one other spark version)
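
As a rough illustration of that check, a small Scala helper along the following lines could list classes that ship under more than one shim directory in the dist jar. The jar path and object name are hypothetical, not project tooling; it is a sketch of the "examine the dist jar after a buildall" suggestion above:

```scala
// Hypothetical helper: list class paths that appear under more than one
// sparkXYZ "parallel world" directory inside the dist jar.
import java.util.jar.JarFile
import scala.collection.JavaConverters._

object FindShimmedClasses {
  def main(args: Array[String]): Unit = {
    // Jar path is illustrative; point it at whatever artifact a buildall produced.
    val jarPath = args.headOption.getOrElse("dist/target/rapids-4-spark_2.12-SNAPSHOT.jar")
    new JarFile(jarPath).entries().asScala
      .map(_.getName)
      .filter(_.matches("spark\\d{3}/.+\\.class"))            // only shimmed classes
      .toList
      .groupBy(_.replaceFirst("^spark\\d{3}/", ""))           // key: path without shim dir
      .filter { case (_, shims) => shims.size > 1 }           // present in 2+ shim dirs
      .keys.toList.sorted
      .foreach(println)
  }
}
```

Running something like this after a buildall should surface candidates such as GpuLast or GpuUnaryMinus that have a spark320 copy plus at least one other version.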

@NvTimLiu (Collaborator, Author) replied Jul 10, 2024:

Sounds good to me; GpuLast.class should be a good replacement for the current class here:

Binary files ./spark331/org/apache/spark/sql/rapids/aggregate/GpuLast.class and ./spark320/org/apache/spark/sql/rapids/aggregate/GpuLast.class differ

Fixed

@gerashegalov (Collaborator) commented:

Were you able to use shimplify.remove.shim per shims.md?

mvn generate-sources -Dshimplify=true -Dshimplify.move=true -Dshimplify.remove.shim=313 -pl tests,datagen -am -s jenkins/settings.xml

@NvTimLiu (Collaborator, Author) commented Jul 9, 2024:

> Were you able to use shimplify.remove.shim per shims.md?
>
> mvn generate-sources -Dshimplify=true -Dshimplify.move=true -Dshimplify.remove.shim=313 -pl tests,datagen -am -s jenkins/settings.xml

Removing the two shims (e.g. 312 and 313) worked fine with the CLI, but removing the last shim (311) failed:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-antrun-plugin:3.1.0:run (shimplify-shim-sources) on project rapids-4-spark-sql_2.12: An Ant BuildException has occured: Traceback (most recent call last):
[ERROR]   File "<script>", line 579, in <module>
[ERROR]   File "<script>", line 350, in task_impl
[ERROR]   File "<script>", line 478, in __shimplify_layout
[ERROR]   File "<script>", line 289, in __shell_exec
[ERROR]   File "<script>", line 111, in __fail
[ERROR]         at org.apache.tools.ant.taskdefs.optional.script.ScriptDefBase.fail(ScriptDefBase.java:129)
[ERROR]         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
[ERROR]         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
[ERROR]         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[ERROR]         at java.lang.reflect.Method.invoke(Method.java:498)
[ERROR] org.apache.tools.ant.BuildException: failed to execute ['git', 'rm', u'/home/timl/work.d/spark-rapids/sql-plugin/src/main/spark311/scala/org/apache/spark/sql/rapids/execution/ShimTrampolineUtil.scala']
[ERROR] 
[ERROR] around Ant part ...<shimplify if="shimplify" />... @ 7:33 in /home/timl/work.d/spark-rapids/sql-plugin/target/spark311/antrun/build-main.xml
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <args> -rf :rapids-4-spark-sql_2.12

@pxLi added the 'documentation' (Improvements or additions to documentation) and 'build' (Related to CI / CD or cleanly building) labels on Jul 10, 2024
@pxLi changed the title from 'Drop spark31x shims' to 'Drop spark31x shims [databricks]' on Jul 10, 2024
@NvTimLiu (Collaborator, Author) commented Jul 10, 2024:

> scala2.13 pom files need to be regenerated to pick up the latest diffs.
>
> There are still references to spark31x or 3.1.x in the following files that should be easy to update:
>
>   • CONTRIBUTING.md
>   • api_validation/README.md
>   • api_validation/src/main/scala/com/nvidia/spark/rapids/api/ApiValidation.scala
>   • dist/README.md
>   • docs/dev/shims.md
>   • jenkins/hadoop-def.sh
>   • jenkins/spark-nightly-build.sh
>   • jenkins/version-def.sh
>   • sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala
>
> There are a number of things that should be done as followups to this, and I'll file issues to track:
>
>   • Some files no longer need to be treated as a shim after the removal of the spark31x shims (i.e.: code is now the same for all supported Spark versions).
>   • Tests need to be updated to remove tests specific to Spark 3.1.x

Yes, I'll push a commit to update all the spark31x/3.1.x references --> Fixed

@gerashegalov (Collaborator) commented:

> Removing the two shims (e.g. 312 and 313) worked fine with the CLI, but removing the last shim (311) failed

Commit changes after every run; shimplify invokes `git rm`, which can fail while uncommitted changes from a previous run are still in the tree.

@pxLi (Collaborator) commented Jul 10, 2024:

@razajafri please discuss with Tim about #10994; one of your PRs will require some refactoring work after the other gets merged, thanks.

    Change the default shim from spark311 to spark320
    in docs, source code, and build scripts

Signed-off-by: Tim Liu <[email protected]>
@@ -8,7 +8,7 @@ parent: Developer Overview
# Shim Development

RAPIDS Accelerator For Apache Spark supports multiple feature version lines of
-Apache Spark such as 3.1.x, 3.2.x, 3.3.0 and a number of vendor releases that contain
+Apache Spark such as 3.2.x, 3.3.x, 3.4.x, 3.5.x and a number of vendor releases that contain
@NvTimLiu (Collaborator, Author):

I'm not familiar with the upstream apache/spark base classes we derive from that need fixing for the incompatible shim builds.

@gerashegalov @jlowe @razajafri @revans2 @tgravescs Can you help update this part of docs/dev/shims.md? Thanks a lot!

https://github.com/NVIDIA/spark-rapids/blob/878af8552fb2121ea716bd9522eaa3a677432b11/docs/dev/shims.md#compile-time-issues

### Compile-time issues
Upstream base classes we derive from might be incompatible in the sense that one version
requires us to implement/override the method `M` whereas the other prohibits it by marking
the base implementation `final`, E.g. `org.apache.spark.sql.catalyst.trees.TreeNode` changes
between Spark 3.1.x and Spark 3.2.x. So instead of deriving from such classes directly we
inject an intermediate trait e.g. `com.nvidia.spark.rapids.shims.ShimExpression` that
has a varying source code depending on the Spark version we compile against to overcome this
issue as you can see e.g., comparing TreeNode:
1. [ShimExpression For 3.1.x](https://github.com/NVIDIA/spark-rapids/blob/6a82213a798a81a5f32f8cf8b4c630e38d112f65/sql-plugin/src/main/spark311/scala/com/nvidia/spark/rapids/shims/TreeNode.scala#L28)
2. [ShimExpression For 3.2.x](https://github.com/NVIDIA/spark-rapids/blob/6a82213a798a81a5f32f8cf8b4c630e38d112f65/sql-plugin/src/main/spark320/scala/com/nvidia/spark/rapids/shims/TreeNode.scala#L37)
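
For illustration, here is a minimal self-contained sketch of the pattern this passage describes, using hypothetical base classes in place of Spark's TreeNode. In the real layout both variants share one trait name under different per-version source roots; distinct names are used here only so the sketch compiles as a single file:

```scala
// Hypothetical stand-ins for an upstream base class that changed across versions.
// "Version A" requires subclasses to implement the method...
abstract class BaseNodeA {
  def verboseString(depth: Int): String
}
// ...while "version B" prohibits overriding it by marking it final.
abstract class BaseNodeB {
  final def verboseString(depth: Int): String = s"node(depth=$depth)"
}

// The plugin compiles against exactly one base per build, so each per-version
// source root supplies an intermediate shim trait that absorbs the difference:
trait ShimNodeA extends BaseNodeA { // would live under src/main/versionA/
  def shimVerboseString(depth: Int): String
  override def verboseString(depth: Int): String = shimVerboseString(depth)
}
trait ShimNodeB extends BaseNodeB { // would live under src/main/versionB/
  def shimVerboseString(depth: Int): String // callers use this; the final method stays untouched
}

// Version-agnostic plugin code derives from the shim trait, not the base class:
class MyExpr extends ShimNodeA {
  override def shimVerboseString(depth: Int): String = s"MyExpr at depth $depth"
}
```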

Collaborator:

I would simply leave it as is. I don't want to necessarily search for new examples of this problem. And the links are permalinks that will work even after this PR is merged.

Member:

Since it's just an example demonstrating how the shims work, I'm fine leaving as-is.

@NvTimLiu (Collaborator, Author):

Sounds good to me

@NvTimLiu (Collaborator, Author) commented Jul 11, 2024:

> Removing the two shims (e.g. 312 and 313) worked fine with the CLI, but removing the last shim (311) failed

> Commit changes after every run

@gerashegalov committing changes after every shimplify run works, thanks for your guidance!

BTW, the results of the shimplify runs and my manual updates are the same in this PR.

jlowe previously approved these changes Jul 11, 2024
@jlowe (Member) left a comment:


What's here looks good to me. In addition to the test cleanup (tracked by #11160) and the unshimming of classes that no longer need to be shims (tracked by #11161), there are generated tool files under tools/generated/31* that should also be removed. I'm OK with this being a followup.

cc: @amahussein @tgravescs for awareness, as the main tools files are now showing Spark 3.2.0 support and the tools/generated/31* directories are going to be deleted soon.

@NvTimLiu (Collaborator, Author) commented:

build

@NvTimLiu (Collaborator, Author) commented:

build

To fix: NVIDIA#11175

Clean up the unused and duplicated 'org/roaringbitmap' classes in the spark320 shim folder, to work around the JaCoCo error 'different class with same name' after we drop the 31x shims and change the default shim to spark320.

Signed-off-by: Tim Liu <[email protected]>
@GaryShen2008 (Collaborator) left a comment:


LGTM
