
Duplicate StatefulSet ControllerRevision, PerconaServerMongoDB status stuck in "Initializing" #1557

Open · Fr0s1 opened this issue May 26, 2024 · 2 comments

Fr0s1 commented May 26, 2024

Report

The PerconaServerMongoDB status is stuck in "Initializing" because the StatefulSet field "status.updatedReplicas" is always smaller than "status.replicas", which is caused by multiple ControllerRevisions.

(screenshots: PerconaServerMongoDB status stuck in Initializing, and the StatefulSet status fields)

More about the problem

When creating a new MongoDB cluster from the Helm chart percona/psmdb-db in ArgoCD, there are two ControllerRevisions for both the Config Server and ReplicaSet StatefulSets.

(screenshot: kubectl get controllerrevision output showing two revisions per StatefulSet)
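
As a cross-check, here is a minimal Go sketch (using client-go) that lists the ControllerRevisions owned by the cluster's StatefulSets and prints their revision numbers. The namespace and label selector below are examples taken from this report and may need adjusting for other clusters; this is not operator code, just a way to confirm the duplicate revisions.

package main

import (
	"context"
	"fmt"
	"log"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (default ~/.kube/config path).
	cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		log.Fatal(err)
	}
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		log.Fatal(err)
	}

	// Namespace and label selector are examples matching this report's cluster.
	// A StatefulSet that has never been updated should normally own a single
	// ControllerRevision; two revisions right after install points at the
	// duplicate-revision problem described above.
	revs, err := client.AppsV1().ControllerRevisions("mongodb").List(context.TODO(),
		metav1.ListOptions{LabelSelector: "app.kubernetes.io/instance=mongodb-cluster"})
	if err != nil {
		log.Fatal(err)
	}
	for _, r := range revs.Items {
		fmt.Printf("%s  revision=%d\n", r.Name, r.Revision)
	}
}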

The issue happens in the logic of smart.go: the controller compares the StatefulSet field "status.updatedReplicas" with "status.replicas", and if they are not equal, the StatefulSet is treated as not up to date.

(screenshot: the comparison in smart.go)

Log details
{"level":"info","ts":1716685730.240358,"msg":"StatefulSet is not up to date","controller":"psmdb-controller","object":{"name":"mongodb-cluster","namespace":"mongodb"},"namespace":"mongodb","name":"mongodb-cluster","reconcileID":"75a29a96-608c-45f4-ac87-e132bf180b29","sts":"mongodb-cluster-cfg"}

Steps to reproduce

  1. Install the MongoDB Operator in the "mongodb-operator" namespace via ArgoCD
  2. Install the MongoDB database Helm chart (percona/psmdb-db) in the "mongodb" namespace via ArgoCD

Versions

  1. Kubernetes: AWS EKS 1.29
  2. Operator: 1.16.0
  3. Database: 1.16.0

Anything else?

The issue does not happen with Operator version 1.15.4 and Database version 1.15.3

Fr0s1 added the bug label May 26, 2024

Fr0s1 commented May 26, 2024

Attaching the MongoDB database Helm chart values:

finalizers:
  ## Set this if you want that operator deletes the primary pod last
  - delete-psmdb-pods-in-order
  ## Set this if you want to delete database persistent volumes on cluster deletion
  - delete-psmdb-pvc

fullnameOverride: "mongodb-cluster"

unsafeFlags:
  tls: false
  replsetSize: false
  mongosSize: false
  terminationGracePeriod: false
  backupIfUnhealthy: true

multiCluster:
  enabled: false
  # DNSSuffix: svc.clusterset.local
updateStrategy: SmartUpdate 

upgradeOptions:
  versionServiceEndpoint: https://check.percona.com
  apply: disabled
  schedule: "0 2 * * *"
  setFCV: false

tls:
  mode: preferTLS
  # 90 days in hours
  certValidityDuration: 2160h
  allowInvalidCertificates: true
  issuerConf:
    name: letsencrypt-mongodb
    kind: ClusterIssuer
    group: cert-manager.io
  
pmm:
  enabled: true
  image:
    repository: percona/pmm-client
    tag: 2.41.2
  serverHost: monitoring-service

replsets:
  rs0:
    name: rs0
    size: 3
  
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
    
    podDisruptionBudget:
      maxUnavailable: 1
  
    expose:
      enabled: false
      exposeType: ClusterIP
     
    resources:
      limits:
        cpu: "500m"
        memory: "1G"
      requests:
        cpu: "300m"
        memory: "0.5G"
    volumeSpec:
      pvc:
        storageClassName: database-standard
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 3Gi
   
    nonvoting:
      enabled: false
      
      size: 3
      affinity:
        antiAffinityTopologyKey: "kubernetes.io/hostname"
    
      podDisruptionBudget:
        maxUnavailable: 1
      resources:
        limits:
          cpu: "300m"
          memory: "0.5G"
        requests:
          cpu: "300m"
          memory: "0.5G"
      volumeSpec:
        pvc:
          resources:
            requests:
              storage: 3Gi
    arbiter:
      enabled: false
      size: 1
      # serviceAccountName: percona-server-mongodb-operator
      affinity:
        antiAffinityTopologyKey: "kubernetes.io/hostname"

sharding:
  enabled: true
  balancer:
    enabled: true

  configrs:
    size: 3
   
    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
   
    podDisruptionBudget:
      maxUnavailable: 1
    expose:
      enabled: false
      exposeType: ClusterIP
  
    resources:
      limits:
        cpu: "300m"
        memory: "0.5G"
      requests:
        cpu: "300m"
        memory: "0.5G"
    volumeSpec:
      pvc:
        resources:
          requests:
            storage: 2Gi

  mongos:
    size: 2

    affinity:
      antiAffinityTopologyKey: "kubernetes.io/hostname"
     
    podDisruptionBudget:
      maxUnavailable: 1
    resources:
      limits:
        cpu: "300m"
        memory: "0.5G"
      requests:
        cpu: "300m"
        memory: "0.5G"
    expose:
      exposeType: ClusterIP
    
backup:
  enabled: true
  image:
    repository: percona/percona-backup-mongodb
    tag: 2.4.1
 
  storages:
    # minio:
    #   type: s3
    #   s3:
    #     bucket: MINIO-BACKUP-BUCKET-NAME-HERE
    #     region: us-east-1
    #     credentialsSecret: my-cluster-name-backup-minio
    #     endpointUrl: http://minio.psmdb.svc.cluster.local:9000/minio/
    #     prefix: ""
    #   azure-blob:
    #     type: azure
    #     azure:
    #       container: CONTAINER-NAME
    #       prefix: PREFIX-NAME
    #       endpointUrl: https://accountName.blob.core.windows.net
    #       credentialsSecret: SECRET-NAME
  pitr:
    enabled: true
    oplogOnly: true
   
  tasks: []


lmerlas commented Jul 18, 2024

I experience a very similar issue, except that I am not using ArgoCD at all.

I am deploying with Helm in an EKS cluster (k8s v1.30) and, although the correct number of replicas is created, the psmdb cluster status remains stuck in the Initializing state.

This issue was introduced with operator version 1.16.x; versions 1.14 and 1.15 behaved as expected.
