@anhpnv I see the image I've declared in my YAML, but the operator seems to kind of "ignore" it.
I was also wondering whether I might be having problems with my spark-operator image. I was trying to change my Spark image because I'm facing this error:
```
:: retrieving :: org.apache.spark#spark-submit-parent-105e130a-1f7d-4b90-afab-4f40b8c8044f
	confs: [default]
	98 artifacts copied, 0 already retrieved (318927kB/254ms)
24/12/09 20:58:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
	at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(Unknown Source)
	at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(Unknown Source)
	at java.base/javax.security.auth.login.LoginContext.invoke(Unknown Source)
	at java.base/javax.security.auth.login.LoginContext$4.run(Unknown Source)
	at java.base/javax.security.auth.login.LoginContext$4.run(Unknown Source)
	at java.base/java.security.AccessController.doPrivileged(Native Method)
```
But I'm wondering if that could be happening during spark-submit inside the spark-operator image.
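For context on the stack trace above: `UnixLoginModule` typically throws `invalid null input: name` when the JVM runs as a UID that has no entry in the container's /etc/passwd, so Hadoop's `UserGroupInformation` cannot resolve an OS user name. A minimal sketch of one common workaround, assuming that is the cause here (the user name `spark` is just an example, any non-empty value works), is to set `HADOOP_USER_NAME` on both driver and executor pods:

```yaml
# Sketch only: give Hadoop an explicit user name so UserGroupInformation
# does not fall back to the (missing) OS user lookup in /etc/passwd.
spec:
  driver:
    env:
      - name: HADOOP_USER_NAME  # example value, not from the original manifest
        value: spark
  executor:
    env:
      - name: HADOOP_USER_NAME
        value: spark
```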
What question do you want to ask?
When I submit a SparkApplication, even after I change the image and with imagePullPolicy set to Always, the image doesn't seem to change.
I know the image isn't changing because I set an image that does not exist and the job still ran with the old image.
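One quick way to confirm which image the driver pod actually received, independent of what the operator reports, is to inspect the running pod directly, e.g. `kubectl get pod <driver-pod-name> -o jsonpath='{.spec.containers[*].image}'` (pod name here is a placeholder).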
Additional context
Here is my .yaml:
```yaml
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: pyspark
  namespace: default
spec:
  type: Python
  mode: cluster
  image: spark-nonexisting:3.5.3
  imagePullPolicy: Always
  mainApplicationFile: "s3://latrain-artifactory-us-east-1-/glue/scripts/l2/latrain-glue-job-universaldimensions-l2/l2-script.py"
  pythonVersion: "3"
  sparkVersion: "3.5.1"
  hadoopConf:
    "fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem"
    "fs.AbstractFileSystem.s3a.impl": "org.apache.hadoop.fs.s3a.S3A"
    "fs.s3a.aws.credentials.provider": "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
    "fs.s3a.access.key": ""
    "fs.s3a.secret.key": ""
  driver:
    cores: 1
    memory: "2g"
    labels:
      version: driver-v1
    serviceAccount: spark
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - ip-100-64-27-26.ec2.internal
  executor:
    cores: 1
    memory: "2g"
    instances: 1
    labels:
      version: executor-v1
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - ip-100-64-27-26.ec2.internal
  sparkConf:
    "spark.sql.parquet.datetimeRebaseModeInWrite": "legacy"
    "spark.sql.storeAssignmentPolicy": "legacy"
    "spark.sql.parquet.int96RebaseModeInWrite": "legacy"
    "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension"
    "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog"
    "spark.shuffle.glue.s3ShuffleBucket": "s3://latrain-temporary-us-east-1-647076108795-dev/glue/temporary/l2/latrain-glue-job-universaldimensions-l2/shuffle/"
    "spark.kubernetes.authenticate.driver.serviceAccountName": "spark-service-account"
    "spark.kubernetes.authenticate.executor.serviceAccountName": "spark-service-account"
    "spark.jars.packages": "org.apache.hadoop:hadoop-aws:3.3.4,org.apache.hadoop:hadoop-common:3.3.4,com.amazonaws:aws-java-sdk-bundle:1.12.262,io.delta:delta-spark_2.12:3.0.0"
    "spark.jars.ivy": "/tmp/.ivy"
  deps:
    jars:
      - s3://latrain-artifactory-us-east-1-647076108795-dev/glue/library/l2/latrain-glue-job-universaldimensions-l2/openlineage-spark_2.12-1.13.1.jar
    pyFiles:
      - s3://latrain-artifactory-us-east-1/lakelib/dist/lakelib_core-0.4.0-py3-none-any.whl
      - s3://latrain-artifactory-us-east-1/lakelib/dist/lakelib_glue-0.4.0-py3-none-any.whl
  arguments:
    - "5000"
    - --data_lineage_url
    - http://............
    - --s3_bucket_temporary_name
    - latrain-temporary-us-east-1
    - --s3_bucket_artifactory_name
    - latrain-artifactory-us-east-1
    - --s3_bucket_l2_name
    - latrain-l2-us-east-1
    - --glue_database_l2_name
    - latrain_l2
    - --data_lineage
```
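As far as I understand, the operator translates `spec.image` into the `spark.kubernetes.container.image` conf it passes to spark-submit. As a sketch to rule out that translation (or a mutating webhook) as the culprit, the image can also be pinned explicitly in `sparkConf`; the tag below just mirrors the one above:

```yaml
# Sketch only: pin the image directly in sparkConf to bypass the
# operator's spec.image -> spark.kubernetes.container.image translation.
sparkConf:
  "spark.kubernetes.container.image": "spark-nonexisting:3.5.3"
  "spark.kubernetes.container.image.pullPolicy": "Always"
```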