ImagePull does not work #2349

Open

paulorangeljr opened this issue Dec 6, 2024 · 2 comments

paulorangeljr commented Dec 6, 2024

What question do you want to ask?

  • ✋ I have searched the open/closed issues and my issue is not listed.

When I submit a SparkApplication, even after changing the image and with imagePullPolicy set to Always, the image does not seem to change.

I know the image isn't changing because I set an image that does not exist and the job still ran with the old image.

Additional context

Here is my .yaml:

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: pyspark
  namespace: default
spec:
  type: Python
  mode: cluster
  image: spark-nonexisting:3.5.3
  imagePullPolicy: Always
  mainApplicationFile: "s3://latrain-artifactory-us-east-1-/glue/scripts/l2/latrain-glue-job-universaldimensions-l2/l2-script.py"
  pythonVersion: "3"
  sparkVersion: "3.5.1"
  hadoopConf:
    "fs.s3.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem"
    "fs.AbstractFileSystem.s3a.impl": "org.apache.hadoop.fs.s3a.S3A"
    "fs.s3a.aws.credentials.provider": "org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider"
    "fs.s3a.access.key": ""
    "fs.s3a.secret.key": ""
  driver:
    cores: 1
    memory: "2g"
    labels:
      version: driver-v1
    serviceAccount: spark
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - ip-100-64-27-26.ec2.internal
  executor:
    cores: 1
    memory: "2g"
    instances: 1
    labels:
      version: executor-v1
    affinity:
      nodeAffinity:
        requiredDuringSchedulingIgnoredDuringExecution:
          nodeSelectorTerms:
            - matchExpressions:
                - key: kubernetes.io/hostname
                  operator: In
                  values:
                    - ip-100-64-27-26.ec2.internal
  sparkConf:
    "spark.sql.parquet.datetimeRebaseModeInWrite": "legacy"
    "spark.sql.storeAssignmentPolicy": "legacy"
    "spark.sql.parquet.int96RebaseModeInWrite": "legacy"
    "spark.sql.extensions": "io.delta.sql.DeltaSparkSessionExtension"
    "spark.sql.catalog.spark_catalog": "org.apache.spark.sql.delta.catalog.DeltaCatalog"
    "spark.shuffle.glue.s3ShuffleBucket": "s3://latrain-temporary-us-east-1-647076108795-dev/glue/temporary/l2/latrain-glue-job-universaldimensions-l2/shuffle/"
    "spark.kubernetes.authenticate.driver.serviceAccountName": "spark-service-account"
    "spark.kubernetes.authenticate.executor.serviceAccountName": "spark-service-account"
    "spark.jars.packages": "org.apache.hadoop:hadoop-aws:3.3.4,org.apache.hadoop:hadoop-common:3.3.4,com.amazonaws:aws-java-sdk-bundle:1.12.262,io.delta:delta-spark_2.12:3.0.0"
    "spark.jars.ivy": "/tmp/.ivy"
  deps:
    jars:
      - s3://latrain-artifactory-us-east-1-647076108795-dev/glue/library/l2/latrain-glue-job-universaldimensions-l2/openlineage-spark_2.12-1.13.1.jar
    pyFiles:
      - s3://latrain-artifactory-us-east-1/lakelib/dist/lakelib_core-0.4.0-py3-none-any.whl
      - s3://latrain-artifactory-us-east-1/lakelib/dist/lakelib_glue-0.4.0-py3-none-any.whl
  arguments:
    - "5000"
    - --data_lineage_url
    - http://............
    - --s3_bucket_temporary_name
    - latrain-temporary-us-east-1
    - --s3_bucket_artifactory_name
    - latrain-artifactory-us-east-1
    - --s3_bucket_l2_name
    - latrain-l2-us-east-1
    - --glue_database_l2_name
    - latrain_l2
    - --data_lineage
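
For reference, a quick way to confirm which image actually reaches the cluster is to compare the image in the SparkApplication spec with the image the driver pod is running. This is a sketch: the pyspark-driver pod name assumes the operator's default <app-name>-driver naming; adjust names and namespace to your cluster.

# Image declared in the SparkApplication spec
kubectl get sparkapplication pyspark -n default -o jsonpath='{.spec.image}'

# Image the driver pod is actually running
kubectl get pod pyspark-driver -n default -o jsonpath='{.spec.containers[0].image}'

If the two differ, the operator is not propagating the spec change; if they match but the old code still runs, the registry tag itself may be stale.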


anhpnv commented Dec 9, 2024

Did you check with the command below? Which image did you see there?

kubectl describe sparkapplication pyspark
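
A complementary check (a sketch; pyspark-driver again assumes the default driver pod name): if the image truly does not exist and imagePullPolicy is Always, the kubelet should report ErrImagePull/ImagePullBackOff events on the driver pod.

# Inspect the driver pod, including its image and pull events
kubectl describe pod pyspark-driver -n default

# Or filter cluster events down to the driver pod
kubectl get events -n default --field-selector involvedObject.name=pyspark-driver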

paulorangeljr (Author) commented

@anhpnv I see the image that I declared in my YAML, but it seems the application somehow "ignores" it.

I was also wondering whether the problem could be with my spark-operator image. I was trying to change my Spark image in the first place because I'm facing this error:

:: retrieving :: org.apache.spark#spark-submit-parent-105e130a-1f7d-4b90-afab-4f40b8c8044f
confs: [default]
98 artifacts copied, 0 already retrieved (318927kB/254ms)
24/12/09 20:58:08 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.hadoop.security.KerberosAuthException: failure to login: javax.security.auth.login.LoginException: java.lang.NullPointerException: invalid null input: name
    at jdk.security.auth/com.sun.security.auth.UnixPrincipal.<init>(Unknown Source)
    at jdk.security.auth/com.sun.security.auth.module.UnixLoginModule.login(Unknown Source)
    at java.base/javax.security.auth.login.LoginContext.invoke(Unknown Source)
    at java.base/javax.security.auth.login.LoginContext$4.run(Unknown Source)
    at java.base/javax.security.auth.login.LoginContext$4.run(Unknown Source)
    at java.base/java.security.AccessController.doPrivileged(Native Method)
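
For context: this LoginException typically appears when the JVM runs as a UID that has no entry in the container's /etc/passwd, so the Unix login module cannot resolve a user name. A common workaround, sketched below as an assumption rather than a confirmed fix for this setup, is to run the driver and executor as a UID that exists in the image; whether the securityContext field is honored depends on the operator version.

# Hypothetical excerpt of the SparkApplication above. runAsUser: 185 is the
# "spark" user baked into the official Apache Spark images -- adjust if your
# custom image uses a different UID.
spec:
  driver:
    securityContext:
      runAsUser: 185
  executor:
    securityContext:
      runAsUser: 185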

But I'm wondering if that could be happening during spark-submit inside the spark-operator image.
