Vault postgres engine is slow when renewing leases and generating new credentials
Describe the bug
We started noticing lease renewals taking a long time and eventually erroring out with context deadline exceeded, and rotation of Postgres credentials also failed. This happens intermittently: a couple of minutes later the renewals go through and new credentials are generated.
The request is canceled when renewing the lease with this error:
failed to read lease entry postgres/creds/my-role/j2hvxTOF2Kufh3Be0rMa9fX5: context canceled
To Reproduce
Steps to reproduce the behavior:
1. Run vault write postgres/creds/my-role
2. Run vault lease renew postgres/creds/my-role/j2hvxTOF2Kufh3Be0rMa9fX5
3. See error failed to read lease entry postgres/creds/my-role/j2hvxTOF2Kufh3Be0rMa9fX5: context canceled
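A minimal sketch of how the behavior can be triggered repeatedly (assumptions: the CLI is already authenticated via VAULT_ADDR/VAULT_TOKEN, the engine is mounted at postgres/ with role my-role as above, jq is available, and dynamic credentials are obtained by reading the creds endpoint; the rotate-root connection name at the end is a placeholder):

# Issue dynamic credentials and immediately renew the lease, repeatedly,
# to catch the intermittent "context deadline exceeded" renewals.
for i in $(seq 1 50); do
  lease_id=$(vault read -format=json postgres/creds/my-role | jq -r '.lease_id')
  if ! vault lease renew "$lease_id" >/dev/null; then
    echo "renewal failed for ${lease_id} (iteration ${i})" >&2
  fi
  sleep 5
done

# Credential rotation fails in the same windows; "my-postgres-db" below
# is an assumed connection name, not our actual configuration:
# vault write -f postgres/rotate-root/my-postgres-db

When the issue hits, the renew call hangs and eventually returns the context error shown above; a few minutes later the same loop succeeds.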
Expected behavior
The request should go through; renewals and rotation should work without any timeouts.
Environment:
Vault Server Version (output of vault status):
Key                      Value
---                      -----
Recovery Seal Type       shamir
Initialized              true
Sealed                   false
Total Recovery Shares    3
Threshold                2
Version                  1.10.3
Storage Type             raft
Cluster Name             demo-vault
Cluster ID               <redacted>
HA Enabled               true
HA Cluster               https://vault-2.vault-internal:8201
HA Mode                  active
Active Since             2024-12-18T15:19:51.305456306Z
Raft Committed Index     6166547
Raft Applied Index       6166547
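For reference, the table above is standard vault status output; it can be collected from the active node roughly like this (the pod name vault-2 and the vault namespace are assumptions based on the HA cluster address above):

# Collect the status table from the active node (pod/namespace names are assumptions):
kubectl -n vault exec vault-2 -- vault status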
Additional context
Things we have already validated:
- We checked the RDS instances to see if they are overloaded, and they are not. The instance types range from small to large, and the maximum number of connections is not reached (see the sketch after this list for how this can be checked).
- We analyzed the IOPS of the underlying SSD disk used for Raft storage and did not find any anomalies. The Raft DB size is around 850 MB.
- We verified the K8s cluster as well; there are no bottlenecks when communicating with the RDS instances.
- We are also able to connect to the RDS instances from the Vault pods.
- We also increased CPU and memory requests to eliminate any resource crunch.
- We have ~350 connections and ~500 roles in the Postgres engine.
- We have also set VAULT_CLIENT_TIMEOUT to 300s in all Vault pods.
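A rough sketch of the checks behind the connection-count and timeout points above (PG_URL, the pod names, and the vault namespace are placeholders/assumptions; psql and kubectl access are assumed):

# Connection usage vs. the configured limit on one of the RDS instances:
psql "$PG_URL" -c "SELECT count(*) FROM pg_stat_activity;"
psql "$PG_URL" -c "SHOW max_connections;"

# Confirm the raised client timeout is present in every Vault pod
# (pod names and the "vault" namespace are assumptions based on our naming):
for pod in vault-0 vault-1 vault-2; do
  kubectl -n vault exec "$pod" -- env | grep VAULT_CLIENT_TIMEOUT
done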