Datadog admission labels cause failed submission #2367

Open
Marcus-Rosti opened this issue Dec 19, 2024 · 1 comment
Labels
kind/bug Something isn't working

Comments

@Marcus-Rosti

What happened?

Adding

    spark.kubernetes.driver.label.admission.datadoghq.com/enabled: "true"
    spark.kubernetes.driver.annotation.admission.datadoghq.com/java-lib.version: "latest"
    spark.kubernetes.executor.label.admission.datadoghq.com/enabled: "true"
    spark.kubernetes.executor.annotation.admission.datadoghq.com/java-lib.version: "latest"

to the Spark submission results in this error:

24/12/18 23:51:09 INFO SparkKubernetesClientFactory: Auto-configuring K8S client using current context from users K8S config file
24/12/18 23:51:09 INFO KerberosConfDriverFeatureStep: You have not specified a krb5.conf file locally or via a ConfigMap. Make sure that you have the krb5.conf locally on the driver image.
24/12/18 23:51:50 ERROR Client: Please check "kubectl auth can-i create pod" first. It should be yes.
Exception in thread "main" io.fabric8.kubernetes.client.KubernetesClientException: An error has occurred.
  at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:129)
  at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:122)
  at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:44)
  at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:1108)
  at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:92)
  at org.apache.spark.deploy.k8s.submit.Client.run(KubernetesClientApplication.scala:153)
  at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6(KubernetesClientApplication.scala:256)
  at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.$anonfun$run$6$adapted(KubernetesClientApplication.scala:250)
  at org.apache.spark.util.SparkErrorUtils.tryWithResource(SparkErrorUtils.scala:48)
  at org.apache.spark.util.SparkErrorUtils.tryWithResource$(SparkErrorUtils.scala:46)
  at org.apache.spark.util.Utils$.tryWithResource(Utils.scala:94)
  at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.run(KubernetesClientApplication.scala:250)
  at org.apache.spark.deploy.k8s.submit.KubernetesClientApplication.start(KubernetesClientApplication.scala:223)
  at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)
  at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
  at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
  at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
  at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.IOException: Canceled
  at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:515)
  at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:535)
  at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:340)
  at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:703)
  at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:92)
  at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:42)
  ... 17 more
Caused by: java.io.IOException: Canceled
  at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:121)
  at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147)
  at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121)
  at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:257)
  at okhttp3.RealCall$AsyncCall.execute(RealCall.java:201)
  at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
  at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
  at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
  at java.base/java.lang.Thread.run(Unknown Source)
24/12/18 23:51:50 INFO ShutdownHookManager: Shutdown hook called
24/12/18 23:51:50 INFO ShutdownHookManager: Deleting directory /tmp/spark-a00e6316-9de2-4bd7-b43a-02dbbf4527c9

Reproduction Code

No response

Expected behavior

No response

Actual behavior

No response

Environment & Versions

  • Kubernetes Version: v1.30.5-gke.1443001
  • Spark Operator Version: 2.1.0
  • Apache Spark Version: 3.5.3

Additional context

I'm following this tutorial: https://docs.datadoghq.com/data_jobs/kubernetes/?tab=datadogoperator

Impacted by this bug?

Give it a 👍 We prioritize the issues with most 👍

@Marcus-Rosti Marcus-Rosti added the kind/bug Something isn't working label Dec 19, 2024
@jacobsalway
Member

Are you able to share your full SparkApplication spec? I can't reproduce this issue using an example app with the label/annotation sparkConf you've provided. From experience, I suspect other options are being provided that produce an invalid pod specification, which causes the pod creation request inside spark-submit to fail.
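
One way to narrow this down before sharing the full spec is to check the label and annotation entries in sparkConf against the Kubernetes naming rules locally. The sketch below is only a syntax check under stated assumptions: `check_spark_conf` and its helpers are hypothetical names written for this illustration, not part of Spark or the operator, and the regular expressions approximate the documented rules (optional DNS-subdomain prefix of at most 253 characters, name and label value of at most 63 characters). The API server remains the authority on what it accepts, and this does not cover admission webhooks or any other part of the pod spec.

```python
import re

# Name segment of a key, and non-empty label values: alphanumeric at both
# ends, with '-', '_' and '.' allowed in between.
_NAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9_.-]*[A-Za-z0-9])?")

# Optional key prefix: a lowercase DNS subdomain such as "admission.datadoghq.com".
_DNS_LABEL = r"[a-z0-9]([a-z0-9-]*[a-z0-9])?"
_PREFIX = re.compile(rf"{_DNS_LABEL}(\.{_DNS_LABEL})*")

# Spark properties of the form spark.kubernetes.<role>.<label|annotation>.<key>
_CONF_KEY = re.compile(
    r"spark\.kubernetes\.(driver|executor)\.(label|annotation)\.(.+)"
)


def valid_key(key: str) -> bool:
    """Check a label or annotation key of the form [prefix/]name."""
    prefix, slash, name = key.rpartition("/")
    if slash:
        if not prefix or len(prefix) > 253 or _PREFIX.fullmatch(prefix) is None:
            return False
    return 0 < len(name) <= 63 and _NAME.fullmatch(name) is not None


def valid_label_value(value: str) -> bool:
    """Label values may be empty; otherwise they follow the name rules."""
    return len(value) <= 63 and (value == "" or _NAME.fullmatch(value) is not None)


def check_spark_conf(conf: dict[str, str]) -> list[str]:
    """Return a list of problems found in label/annotation settings."""
    problems = []
    for conf_key, value in conf.items():
        match = _CONF_KEY.fullmatch(conf_key)
        if match is None:
            continue  # not a label/annotation property; out of scope here
        role, kind, meta_key = match.groups()
        if not valid_key(meta_key):
            problems.append(f"{role} {kind} key {meta_key!r} is not a valid key")
        # Annotation values are free-form strings, so only label values are checked.
        if kind == "label" and not valid_label_value(value):
            problems.append(f"{role} label {meta_key!r} has invalid value {value!r}")
    return problems


if __name__ == "__main__":
    conf = {
        "spark.kubernetes.driver.label.admission.datadoghq.com/enabled": "true",
        "spark.kubernetes.driver.annotation.admission.datadoghq.com/java-lib.version": "latest",
        "spark.kubernetes.executor.label.admission.datadoghq.com/enabled": "true",
        "spark.kubernetes.executor.annotation.admission.datadoghq.com/java-lib.version": "latest",
    }
    print(check_spark_conf(conf))
```

For the four settings quoted in the issue, this check reports no problems, which fits with the failure not being reproducible from those settings alone. Running the same check over the rest of the sparkConf may help rule out naming errors, though it says nothing about other causes.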
