zdt-upgrade - deployment should not have initContainer on upgrade job #641

Open
charoensri opened this issue Sep 29, 2023 · 1 comment
Labels
enhancement New feature or request

Comments


charoensri commented Sep 29, 2023

Describe the bug
Should the Deployment YAML still depend on the upgrade job, through an initContainer, at the end of the upgrade-deploy action?

I tried upgradeType: "zero-downtime" on Docker Desktop, and Docker Desktop crashed before the post-upgrade job.
I noticed that the Deployment's spec.template generated by the upgrade-deploy action contains this initContainer:
initContainers:
- name: wait-for-pegaupgrade
  image: pegasystems/k8s-wait-for
  imagePullPolicy: IfNotPresent
  args: [ 'job', 'pega-zdt-upgrade']

To Reproduce

Expected behavior

Source: pega/templates/pega-tier-deployment.yaml

Should the initContainers entry for the zdt upgrade job remain in the Deployment spec after the upgrade? Or is it still there only because my Docker Desktop crashed just before the post-upgrade job started?
NOTE: Both the pre-upgrade and zdt upgrade jobs completed. The DB was upgraded successfully, and the new ReplicaSet recycled the pods without issues, so everything is OK from the application and upgrade perspective. However, the Deployment spec (and therefore the pod spec) now has the initContainer in it. Once I deleted the upgrade job, any new pod failed to start because the initContainer failed.
I fixed this by running another helm upgrade using the deploy action only, with the new rules schema.

Chart version
I cloned this chart and use it locally:
https://github.com/pegasystems/pega-helm-charts/blob/master/charts/pega/Chart.yaml

apiVersion: v1
name: pega
version: "1.2.0"
description: Pega installation on kubernetes
keywords:

Server (if applicable, please complete the following information):
postgreSQL, docker desktop

Additional context

# Source: pega/templates/pega-tier-deployment.yaml
kind: Deployment
apiVersion: apps/v1
metadata:
  annotations:
  name: pega-dockerdesktop-web
  namespace: pega883
  labels:
    app: pega-dockerdesktop-web
    component: Pega
spec:
  # Replicas specify the number of copies for pega-dockerdesktop-web
  replicas: 1
  progressDeadlineSeconds: 2147483647
  selector:
    matchLabels:
      app: pega-dockerdesktop-web
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: pega-dockerdesktop-web
      annotations:
        config-check: 41181778004bd56b9c2cf77c7d9e9bdec0eb73e9e5980e0d95159dc69621efac
        config-tier-check: 7060cc4a89b2696a22ccca2b06eb060f204cb55ff63a90469297eeffec62c403
        certificate-check: 2cb1f675c5f532bd68c3851872bf42719f0516208049d403a84068dac54c695c

    spec:

      volumes:
      # Volume used to mount config files.
      - name: pega-volume-config
        configMap:
          # This name will be referred in the volume mounts kind.
          name: pega-dockerdesktop-web
          # Used to specify permissions on files within the volume.
          defaultMode: 420
      - name: pega-volume-credentials
        projected:
          defaultMode: 420
          sources:
          - secret:
              name: seri-pega-secrets
          - secret:
              name: pega-hz-secret
          - secret:
              name: pega-stream-secret
          - secret:
              name: pega-dds-secret

          - secret:
              name: pega-diagnostic-secret


      initContainers:
      - name: wait-for-pegaupgrade
        image: pegasystems/k8s-wait-for
        imagePullPolicy: IfNotPresent
        args: [ 'job', 'pega-zdt-upgrade']
        env:
        - name: WAIT_TIME
          value: "2"
        - name: MAX_RETRIES
          value: "1"
        resources:
          # Resources requests/limits for initContainers
          requests:
            cpu: 50m
            memory: 64Mi
          limits:
            cpu: 50m
            memory: 64Mi
      securityContext:
        runAsUser: 9001
        fsGroup: 0
      containers:
      # Name of the container
      - name: pega-web-tomcat
        # The pega image, you may use the official pega distribution or you may extend
        # and host it yourself.  See the image documentation for more information.
        image: charoensri1seri1/pega:8.8.3
        # Pod (app instance) listens on this port
        ports:
        - containerPort: 8080
          name: pega-web-port
        - containerPort: 8443
          name: pega-tls-port
        # Specify any of the container environment variables here
        env:
        # Node type of the Pega nodes for pega-dockerdesktop-web
        - name: NODE_TYPE
          value: WebUser
        - name: PEGA_APP_CONTEXT_PATH
          value: prweb
        - name: REQUESTOR_PASSIVATION_TIMEOUT
          value: "900"
        # Additional JVM arguments
        - name: JAVA_OPTS
          value: ""
        # Additional CATALINA arguments
        - name: CATALINA_OPTS
          value: "-XX:InitialCodeCacheSize=256M -XX:ReservedCodeCacheSize=512M -XX:MetaspaceSize=784m -XX:MaxMetaspaceSize=1G -XX:+ParallelRefProcEnabled -XX:+UseStringDeduplication -XX:InitiatingHeapOccupancyPercent=75 -XX:MaxGCPauseMillis=300 -XX:+HeapDumpOnOutOfMemoryError -Xlog:gc*,gc+ref=debug,gc+heap=debug,gc+age=trace:file=/usr/local/tomcat/logs/gc-%p-%t.log:tags,uptime,time,level:filecount=10,filesize=50m"
        # Initial JVM heap size, equivalent to -Xms
        - name: INITIAL_HEAP
          value: "4096m"
        # Maximum JVM heap size, equivalent to -Xmx
        - name: MAX_HEAP
          value: "8192m"
        # Tier of the Pega node
        - name: NODE_TIER
          value: dockerdesktop-web
        - name: RETRY_TIMEOUT
          value: "30"
        - name: MAX_RETRIES
          value: "4"
        envFrom:
        - configMapRef:
            name: pega-environment-config
        resources:
          # Maximum CPU and Memory that the containers for pega-dockerdesktop-web can use
          limits:
            cpu: "3"
            memory: "14Gi"
          # CPU and Memory that the containers for pega-dockerdesktop-web request
          requests:
            cpu: "200m"
            memory: "2Gi"
        volumeMounts:
        # The given mountpath is mapped to volume with the specified name.  The config map files are mounted here.
        - name: pega-volume-config
          mountPath: "/opt/pega/config"
        - name: pega-volume-credentials
          mountPath: "/opt/pega/secrets"
        #mount custom certificates



        # LivenessProbe: indicates whether the container is live, i.e. running.
        livenessProbe:
          httpGet:
            path: "/prweb/PRRestService/monitor/pingService/ping"
            port: 8081
            scheme: HTTP
          initialDelaySeconds: 0
          timeoutSeconds: 20
          periodSeconds: 30
          successThreshold: 1
          failureThreshold: 3
        # ReadinessProbe: indicates whether the container is ready to service requests.
        readinessProbe:
          httpGet:
            path: "/prweb/PRRestService/monitor/pingService/ping"
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 0
          timeoutSeconds: 10
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 3
        # StartupProbe: indicates whether the container has completed its startup process, and delays the LivenessProbe
        startupProbe:
          httpGet:
            path: "/prweb/PRRestService/monitor/pingService/ping"
            port: 8080
            scheme: HTTP
          initialDelaySeconds: 10
          timeoutSeconds: 10
          periodSeconds: 10
          successThreshold: 1
          failureThreshold: 30
      # Mentions the restart policy to be followed by the pod.  'Always' means that a new pod will always be created irrespective of type of the failure.
      restartPolicy: Always
      # Amount of time in which container has to gracefully shutdown.
      terminationGracePeriodSeconds: 300
      # Secret which is used to pull the image from the repository.  This secret contains docker login details for the particular user.
      # If the image is in a protected registry, you must specify a secret to access it.
      imagePullSecrets:
      - name: pega-registry-secret
@pega-roska
Contributor

Right now, if something goes wrong and the jobs go away, you have to change the chart action to deploy to remove the init containers. We will investigate whether there are enhancements we could make to recover more automatically.
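
For illustration, here is a minimal sketch of the values override for that recovery step. It assumes the chart selects its action through `global.actions.execute` and reads the rules schema from `global.jdbc.rulesSchema`. The schema name `new_rules` is a hypothetical placeholder and does not come from this issue.

```yaml
# Sketch of a values override, not a verified configuration.
global:
  actions:
    # Assumption: switching from "upgrade-deploy" to "deploy" re-renders the
    # Deployment without the wait-for-pegaupgrade initContainer.
    execute: "deploy"
  jdbc:
    # Hypothetical placeholder: use the rules schema that the zero-downtime
    # upgrade actually produced.
    rulesSchema: "new_rules"
```

Running another helm upgrade with an override like this matches the workaround reported above: new pods no longer wait on the deleted `pega-zdt-upgrade` job.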

pega-roska added the enhancement label Oct 3, 2023