pg instance failure due to OOM #1091

Open
jterzis opened this issue Sep 17, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@jterzis
Contributor

jterzis commented Sep 17, 2024

Describe the bug
An operator's cloud pg instance went down, apparently due to an OOM error. During the incident, a spike in node-to-pg connections was observed. It is unclear whether the stream nodes created those connections because of organic client traffic or in response to the pg service interruption caused by the OOM error. We need to confirm that nodes do not DoS pg with new connections when pg fails.

To Reproduce
Steps to reproduce the behavior:

Expected behavior
Confirm that stream nodes do not aggressively open new pg connections when the pg service is interrupted, and that no other code path in the stream node creates inorganic pg connections (connections uncorrelated with actual client requests).
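
For reference, here is a minimal sketch of the kind of behavior being asked for: a hard cap on open connections per node, plus capped exponential backoff with jitter between reconnect attempts. This is not the stream node's actual code. The names `reconnect_delay` and `ConnectionBudget` and the default values (`base`, `cap`, `max_open`) are hypothetical and chosen only for illustration.

```python
import random


def reconnect_delay(attempt, base=0.5, cap=30.0, rng=random.random):
    """Return a "full jitter" delay in seconds before reconnect attempt `attempt` (0-based).

    The upper bound doubles with each attempt and is clamped to `cap`, so a
    failing pg instance sees fewer and fewer reconnects instead of a flood.
    All defaults here are illustrative assumptions, not measured values.
    """
    if attempt < 0:
        raise ValueError("attempt must be non-negative")
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    ceiling = min(cap, base * (2 ** min(attempt, 32)))
    return rng() * ceiling


class ConnectionBudget:
    """Tracks how many pg connections a node holds and refuses to exceed a cap."""

    def __init__(self, max_open):
        if max_open < 1:
            raise ValueError("max_open must be at least 1")
        self.max_open = max_open
        self.open = 0

    def try_acquire(self):
        """Reserve one connection slot; return False if the cap is reached."""
        if self.open >= self.max_open:
            return False
        self.open += 1
        return True

    def release(self):
        """Give back one slot after a connection is closed."""
        if self.open > 0:
            self.open -= 1
```

With something like this in place, the number of connections a node can open during a pg outage is bounded by `max_open` regardless of how many requests are retrying, and the retry rate decays toward one attempt per `cap` seconds.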

Screenshots
telegram-cloud-photo-size-1-5138982082981244551-y

Screenshot 2024-09-17 at 2 52 57 PM

Logs

  • stream node syslogs
  • postgres logs
  • blockchain explorer transactions

Additional context
The operator was running a single pg v14 instance with 30 GB of memory against 4 mainnet nodes. After the OOM error, the instance was upgraded to 100 GB of memory.

@jterzis jterzis added bug Something isn't working triage Triage during next triage session labels Sep 17, 2024
@sergekh2 sergekh2 removed the triage Triage during next triage session label Sep 23, 2024
@sergekh2
Contributor

Let's wait for the upgrade to pg 16 and then proceed if the problem persists.

3 participants