[VL] Component not found when mvn test after mvn install -P iceberg #8303

Open
zhztheplayer opened this issue Dec 23, 2024 · 8 comments · May be fixed by #8310
Labels
bug (Something isn't working), triage

Comments

@zhztheplayer
Member

zhztheplayer commented Dec 23, 2024

Reported by @Yohahaha.

Error:

24/12/23 11:54:16 WARN SparkSession: Cannot use org.apache.gluten.extension.GlutenSessionExtensions to configure session extensions.
java.lang.NoClassDefFoundError: org/apache/gluten/execution/OffloadIcebergScan$                                                
        at org.apache.gluten.component.VeloxIcebergComponent.injectRules(VeloxIcebergComponent.scala:29)                       
        at org.apache.gluten.extension.GlutenSessionExtensions.$anonfun$apply$6(GlutenSessionExtensions.scala:51)              
        at org.apache.gluten.extension.GlutenSessionExtensions.$anonfun$apply$6$adapted(GlutenSessionExtensions.scala:51)
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)                                            
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)                                           
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)                                                  
        at org.apache.gluten.extension.GlutenSessionExtensions.apply(GlutenSessionExtensions.scala:51)                         
        at org.apache.gluten.extension.GlutenSessionExtensions.apply(GlutenSessionExtensions.scala:26)                         
        at org.apache.spark.sql.SparkSession$.$anonfun$applyExtensions$1(SparkSession.scala:1224)                              
        at org.apache.spark.sql.SparkSession$.$anonfun$applyExtensions$1$adapted(SparkSession.scala:1219)                      
        at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)                                            
        at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)                                           
        at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)                                                  
        at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$applyExtensions(SparkSession.scala:1219)
        at org.apache.spark.sql.SparkSession.<init>(SparkSession.scala:106)                                                    
        at org.apache.spark.sql.SparkSession.<init>(SparkSession.scala:109)                                                    
        at org.apache.spark.sql.test.TestSparkSession.<init>(TestSQLContext.scala:27)                                          
        at org.apache.spark.sql.test.TestSparkSession.<init>(TestSQLContext.scala:30)                                          
        at org.apache.spark.sql.test.SharedSparkSessionBase.createSparkSession(SharedSparkSession.scala:102)                   
        at org.apache.spark.sql.test.SharedSparkSessionBase.createSparkSession$(SharedSparkSession.scala:100)                  
        at org.apache.gluten.execution.WholeStageTransformerSuite.createSparkSession(WholeStageTransformerSuite.scala:37)
        at org.apache.spark.sql.test.SharedSparkSessionBase.initializeSession(SharedSparkSession.scala:116)                    
        at org.apache.spark.sql.test.SharedSparkSessionBase.initializeSession$(SharedSparkSession.scala:114)                   
        at org.apache.gluten.execution.WholeStageTransformerSuite.initializeSession(WholeStageTransformerSuite.scala:37)
        at org.apache.spark.sql.test.SharedSparkSessionBase.beforeAll(SharedSparkSession.scala:124)                            
        at org.apache.spark.sql.test.SharedSparkSessionBase.beforeAll$(SharedSparkSession.scala:123)                           
        at org.apache.gluten.execution.WholeStageTransformerSuite.org$apache$spark$sql$test$SharedSparkSession$$super$beforeAll(WholeStageTransformerSuite.scala:37)
        at org.apache.spark.sql.test.SharedSparkSession.beforeAll(SharedSparkSession.scala:46)                                 
        at org.apache.spark.sql.test.SharedSparkSession.beforeAll$(SharedSparkSession.scala:44)                                
        at org.apache.gluten.execution.WholeStageTransformerSuite.beforeAll(WholeStageTransformerSuite.scala:67)               
        at org.apache.gluten.execution.VeloxTPCHTableSupport.beforeAll(VeloxTPCHSuite.scala:55)                                
        at org.scalatest.BeforeAndAfterAll.liftedTree1$1(BeforeAndAfterAll.scala:212)                                          
        at org.scalatest.BeforeAndAfterAll.run(BeforeAndAfterAll.scala:210)                                                    
        at org.scalatest.BeforeAndAfterAll.run$(BeforeAndAfterAll.scala:208)                                                   
        at org.apache.spark.SparkFunSuite.run(SparkFunSuite.scala:64)                                                          
        at org.scalatest.Suite.callExecuteOnSuite$1(Suite.scala:1178)                                                          
        at org.scalatest.Suite.$anonfun$runNestedSuites$1(Suite.scala:1225)                                                    
        at scala.collection.IndexedSeqOptimized.foreach(IndexedSeqOptimized.scala:36)                                          
        at scala.collection.IndexedSeqOptimized.foreach$(IndexedSeqOptimized.scala:33)                                         
        at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:198)                                                 
        at org.scalatest.Suite.runNestedSuites(Suite.scala:1223)                                                               
        at org.scalatest.Suite.runNestedSuites$(Suite.scala:1156)                                                              
        at org.scalatest.tools.DiscoverySuite.runNestedSuites(DiscoverySuite.scala:30)                                         
        at org.scalatest.Suite.run(Suite.scala:1111)           
        at org.scalatest.Suite.run$(Suite.scala:1096)          
        at org.scalatest.tools.DiscoverySuite.run(DiscoverySuite.scala:30)                                                     
        at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:47)                                                           
        at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13(Runner.scala:1321)                                    
        at org.scalatest.tools.Runner$.$anonfun$doRunRunRunDaDoRunRun$13$adapted(Runner.scala:1315)                            
        at scala.collection.immutable.List.foreach(List.scala:431)                                                             
        at org.scalatest.tools.Runner$.doRunRunRunDaDoRunRun(Runner.scala:1315)                                                
        at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24(Runner.scala:992)                         
        at org.scalatest.tools.Runner$.$anonfun$runOptionallyWithPassFailReporter$24$adapted(Runner.scala:970)                 
        at org.scalatest.tools.Runner$.withClassLoaderAndDispatchReporter(Runner.scala:1481)                                   
        at org.scalatest.tools.Runner$.runOptionallyWithPassFailReporter(Runner.scala:970)                                     
        at org.scalatest.tools.Runner$.main(Runner.scala:775)                                                                  
        at org.scalatest.tools.Runner.main(Runner.scala)       
Caused by: java.lang.ClassNotFoundException: org.apache.gluten.execution.OffloadIcebergScan$                                   
        at java.net.URLClassLoader.findClass(URLClassLoader.java:387)                                                          
        at java.lang.ClassLoader.loadClass(ClassLoader.java:418)                                                               
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)                                                       
        at java.lang.ClassLoader.loadClass(ClassLoader.java:351)                                                               
        ... 57 more

To reproduce:

mvn -P backends-velox,spark-3.3,scala-2.12,delta,iceberg,uniffle,celeborn clean install -Dscalastyle.skip=true -Dcheckstyle.skip=true -Dspotless.check.skip=true -DskipTests
mvn -P backends-velox,spark-3.3,scala-2.12 test

This is a regression due to #8192.

@zhztheplayer added the bug and triage labels on Dec 23, 2024
@Yohahaha
Contributor

mvn clean package -P backends-velox,spark-3.4,hadoop-3.2,delta,iceberg,hudi -DskipTests
mvn test -P backends-velox,spark-3.4,hadoop-3.2,spark-ut

@zhztheplayer
Member Author

@Yohahaha I have confirmed, by running some integration tests locally, that the compiled uber jar gluten-package-<version>.jar doesn't have this issue. It should only be a problem when running mvn test. Let me know if you encountered similar issues with the uber jar.
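
For anyone who wants to check a jar directly, here is a minimal sketch that tests whether a jar entry listing contains the class named in the stack trace. The helper name has_class is hypothetical and the jar path is left as a placeholder; it only assumes that `jar tf` prints one entry path per line.

```shell
# Hypothetical helper: succeeds if the jar entry listing on stdin
# contains the given fully qualified class name.
has_class() {
  # Convert a.b.C$ to a/b/C$.class, the entry name used inside a jar.
  entry="$(printf '%s' "$1" | tr '.' '/').class"
  # -F: fixed string (so '$' stays literal), -x: whole-line match, -q: quiet.
  grep -Fxq -- "$entry"
}

# Usage (not executed here; replace the placeholder with the real jar path):
#   jar tf <path-to-uber-jar> | has_class 'org.apache.gluten.execution.OffloadIcebergScan$' \
#     && echo "class is bundled" || echo "class is missing"
```

The class name is single-quoted so the shell does not try to expand the trailing `$`.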

@Yohahaha
Contributor

Let me know if you encountered similar issues with the uber jar.

How do I run tests with the uber jar?

@zhztheplayer
Member Author

zhztheplayer commented Dec 23, 2024

gluten-it uses the uber jar as a dependency, though one should mvn install the uber jar into the local Maven repo in advance:

cd gluten/
mvn clean package -P backends-velox,spark-3.4,hadoop-3.2,delta,iceberg,hudi -DskipTests
cd tools/gluten-it/
mvn clean install -P spark-3.4
sbin/gluten-it.sh queries --local --data-gen=once --queries=q1

@Yohahaha
Contributor

Thank you, I see what you mean. I will test it this way later.

@zhztheplayer
Member Author

zhztheplayer commented Dec 23, 2024

Hi @Yohahaha,

Thanks! As for the mvn test issue, since it only applies to the Maven test phase, do you think it would be feasible for you to pass a Maven property that disables the corresponding components, as a quick solution?

Something like:

mvn clean package -P backends-velox,spark-3.4,hadoop-3.2,delta,iceberg,hudi -DskipTests
mvn test -P backends-velox,spark-3.4,hadoop-3.2,spark-ut -DargLine="-Dspark.gluten.component.exclusions=velox-iceberg,velox-hudi,velox-delta"

I can raise a PR quickly if that sounds OK.
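
To make the idea concrete, here is a small sketch that assembles the -DargLine argument from a list of component names. The function name exclusion_argline is hypothetical, and spark.gluten.component.exclusions is the property proposed in this thread, not a confirmed setting.

```shell
# Hypothetical helper: build the Maven argument that would pass the
# exclusion property (as proposed in this thread) to the test JVM.
exclusion_argline() {
  # Join all arguments with commas; IFS is changed only inside the subshell.
  joined="$(IFS=,; printf '%s' "$*")"
  printf '%s\n' "-DargLine=-Dspark.gluten.component.exclusions=${joined}"
}

# Usage (not executed here):
#   mvn test -P backends-velox,spark-3.4,hadoop-3.2,spark-ut \
#     "$(exclusion_argline velox-iceberg velox-hudi velox-delta)"
```

Whether a command-line argLine actually reaches the test JVM depends on how the test plugin is configured in the pom, so that part would need checking.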

@zhztheplayer
Member Author

zhztheplayer commented Dec 23, 2024

Also, spark.gluten.component.exclusions could become a formal toggle to disable the other plugged-in features at runtime, e.g., hudi, delta, etc.

@zhztheplayer
Member Author

Another solution is to pass -P iceberg to mvn test as well. However, I think this approach doesn't provide the flexibility to build all modules and test only some of them.
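
As a rough illustration of the mismatch behind this issue, the sketch below lists the profiles that were passed at install time but not at test time. The function name missing_profiles is hypothetical, and both profile lists are taken from the reproduction commands in the issue description.

```shell
# Hypothetical helper: print each profile from the first comma-separated
# list that is absent from the second list, one per line.
missing_profiles() {
  printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r profile; do
    case ",$2," in
      *",$profile,"*) ;;                 # passed to both commands
      *) printf '%s\n' "$profile" ;;     # passed at install time only
    esac
  done
}

# Usage with the profile lists from the reproduction steps:
#   missing_profiles 'backends-velox,spark-3.3,scala-2.12,delta,iceberg,uniffle,celeborn' \
#                    'backends-velox,spark-3.3,scala-2.12'
```

Any profile printed here would need either to be added to the mvn test command or to have its component excluded.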
