Databricks Connect PyCharm NoClassDefFoundError #165

Open
giusbi opened this issue Apr 24, 2019 · 0 comments

After following the documentation to connect Databricks to PyCharm, I am not able to run the sample example in https://docs.azuredatabricks.net/user-guide/dev-tools/db-connect.html#run-examples-from-your-ide because I get an error. Note that the connection itself seems to work, because at the beginning it checks the cluster status and waits for it to start running; after that, the error occurs when the Spark command executes.
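
For reference, the script being run is essentially the count example from the linked docs page. Below is a minimal sketch, reconstructed from the traceback and that page, so the original main.py may differ slightly:

from pyspark.sql import SparkSession

# Databricks Connect routes this local SparkSession to the remote cluster
# configured earlier with `databricks-connect configure`.
spark = SparkSession.builder.getOrCreate()

print("Testing simple count")
# This is the call that fails below (main.py, line 7 in the traceback).
print(spark.range(100).count())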

19/04/24 15:07:06 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
19/04/24 15:07:08 WARN MetricsSystem: Using default name SparkStatusTracker for source because neither spark.metrics.namespace nor spark.app.id is set.
Testing simple count
19/04/24 15:07:10 WARN HTTPClient: Setting proxy configuration for HTTP client based on env var HTTPS_PROXY=https://proxy_name
19/04/24 15:07:13 WARN SparkClientManager: Cluster 1108-095209-xxx in state PENDING, waiting for it to start running...
19/04/24 15:07:24 WARN SparkClientManager: Cluster 1108-095209-xxx in state PENDING, waiting for it to start running...
19/04/24 15:07:34 WARN SparkClientManager: Cluster 1108-095209-xxx in state PENDING, waiting for it to start running...
Traceback (most recent call last):
  File "C:/Users/my_name/PycharmProjects/Databricks/main.py", line 7, in <module>
    print(spark.range(100).count())
  File "C:\Users\my_name\AppData\Local\Continuum\anaconda3\envs\dbconnect\lib\site-packages\pyspark\sql\session.py", line 337, in range
    jdf = self._jsparkSession.range(0, int(start), int(step), int(numPartitions))
  File "C:\Users\my_name\AppData\Local\Continuum\anaconda3\envs\dbconnect\lib\site-packages\py4j\java_gateway.py", line 1257, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "C:\Users\my_name\AppData\Local\Continuum\anaconda3\envs\dbconnect\lib\site-packages\pyspark\sql\utils.py", line 63, in deco
    return f(*a, **kw)
  File "C:\Users\my_name\AppData\Local\Continuum\anaconda3\envs\dbconnect\lib\site-packages\py4j\protocol.py", line 328, in get_return_value
    format(target_id, ".", name), value)
py4j.protocol.Py4JJavaError: An error occurred while calling o20.range.
: java.lang.NoClassDefFoundError: com/trueaccord/scalapb/GeneratedMessage
	at java.lang.ClassLoader.defineClass1(Native Method)
	at java.lang.ClassLoader.defineClass(Unknown Source)
	at java.security.SecureClassLoader.defineClass(Unknown Source)
	at java.net.URLClassLoader.defineClass(Unknown Source)
	at java.net.URLClassLoader.access$100(Unknown Source)
	at java.net.URLClassLoader$1.run(Unknown Source)
	at java.net.URLClassLoader$1.run(Unknown Source)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	at com.databricks.service.SparkServiceRPCClientStub.com$databricks$service$SparkServiceRPCClientStub$$buildRpc(SparkServiceRPCClientStub.scala:352)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$pollStatuses$1.apply(SparkServiceRPCClientStub.scala:458)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$pollStatuses$1.apply(SparkServiceRPCClientStub.scala:457)
	at com.databricks.spark.util.Log4jUsageLogger.recordOperation(UsageLogger.scala:161)
	at com.databricks.spark.util.UsageLogging$class.recordOperation(UsageLogger.scala:286)
	at com.databricks.service.SparkServiceRPCClientStub.recordOperation(SparkServiceRPCClientStub.scala:48)
	at com.databricks.service.SparkServiceRPCClientStub.pollStatuses(SparkServiceRPCClientStub.scala:457)
	at com.databricks.service.SparkServiceRPCClientStub.com$databricks$service$SparkServiceRPCClientStub$$pollAndUpdateStatuses0(SparkServiceRPCClientStub.scala:428)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$pollAndUpdateStatuses$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SparkServiceRPCClientStub.scala:409)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$pollAndUpdateStatuses$1$$anonfun$apply$mcV$sp$1.apply(SparkServiceRPCClientStub.scala:407)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$pollAndUpdateStatuses$1$$anonfun$apply$mcV$sp$1.apply(SparkServiceRPCClientStub.scala:407)
	at com.databricks.service.SparkServiceRPCClientStub.com$databricks$service$SparkServiceRPCClientStub$$withPollLock(SparkServiceRPCClientStub.scala:419)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$pollAndUpdateStatuses$1.apply$mcV$sp(SparkServiceRPCClientStub.scala:406)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$pollAndUpdateStatuses$1.apply(SparkServiceRPCClientStub.scala:404)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$pollAndUpdateStatuses$1.apply(SparkServiceRPCClientStub.scala:404)
	at com.databricks.spark.util.Log4jUsageLogger.recordOperation(UsageLogger.scala:161)
	at com.databricks.spark.util.UsageLogging$class.recordOperation(UsageLogger.scala:286)
	at com.databricks.service.SparkServiceRPCClientStub.recordOperation(SparkServiceRPCClientStub.scala:48)
	at com.databricks.service.SparkServiceRPCClientStub.pollAndUpdateStatuses(SparkServiceRPCClientStub.scala:404)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$getServerHadoopConf$1.apply(SparkServiceRPCClientStub.scala:382)
	at com.databricks.service.SparkServiceRPCClientStub$$anonfun$getServerHadoopConf$1.apply(SparkServiceRPCClientStub.scala:381)
	at com.databricks.service.SparkServiceRPCClientStub.com$databricks$service$SparkServiceRPCClientStub$$withPollLock(SparkServiceRPCClientStub.scala:419)
	at com.databricks.service.SparkServiceRPCClientStub.getServerHadoopConf(SparkServiceRPCClientStub.scala:381)
	at com.databricks.service.SparkClient$.getServerHadoopConf(SparkClient.scala:211)
	at com.databricks.spark.util.SparkClientContext$.getServerHadoopConf(SparkClientContext.scala:217)
	at org.apache.spark.SparkContext$$anonfun$hadoopConfiguration$1.apply(SparkContext.scala:316)
	at org.apache.spark.SparkContext$$anonfun$hadoopConfiguration$1.apply(SparkContext.scala:311)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
	at org.apache.spark.SparkContext.hadoopConfiguration(SparkContext.scala:310)
	at org.apache.spark.sql.internal.SharedState.<init>(SharedState.scala:66)
	at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:145)
	at org.apache.spark.sql.SparkSession$$anonfun$sharedState$1.apply(SparkSession.scala:145)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession.sharedState$lzycompute(SparkSession.scala:145)
	at org.apache.spark.sql.SparkSession.sharedState(SparkSession.scala:144)
	at org.apache.spark.sql.internal.BaseSessionStateBuilder.build(BaseSessionStateBuilder.scala:291)
	at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1175)
	at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:170)
	at org.apache.spark.sql.SparkSession$$anonfun$sessionState$2.apply(SparkSession.scala:169)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.SparkSession.sessionState$lzycompute(SparkSession.scala:169)
	at org.apache.spark.sql.SparkSession.sessionState(SparkSession.scala:166)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:193)
	at org.apache.spark.sql.SparkSession.range(SparkSession.scala:609)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
	at java.lang.reflect.Method.invoke(Unknown Source)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:380)
	at py4j.Gateway.invoke(Gateway.java:295)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:251)
	at java.lang.Thread.run(Unknown Source)
Caused by: java.lang.ClassNotFoundException: com.trueaccord.scalapb.GeneratedMessage
	at java.net.URLClassLoader.findClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	at sun.misc.Launcher$AppClassLoader.loadClass(Unknown Source)
	at java.lang.ClassLoader.loadClass(Unknown Source)
	... 67 more


Process finished with exit code 1
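
Note: the missing class com.trueaccord.scalapb.GeneratedMessage is being loaded by the Databricks Connect client RPC layer (SparkServiceRPCClientStub in the trace), so a stale or conflicting client install (for example, a plain pyspark package sitting in the same environment as databricks-connect) is a plausible cause; that diagnosis is an inference from the trace, not something confirmed in this issue. The linked docs page recommends verifying the client setup with its built-in smoke test:

databricks-connect test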