Yatai memory leak #509

Open
a-pichard opened this issue Mar 25, 2024 · 8 comments
a-pichard commented Mar 25, 2024

Hi, I noticed that my Yatai pod (running inside the yatai-system namespace) kept getting evicted due to memory pressure on the node. I don't think running it on a bigger node would solve the issue; it looks like a memory leak to me.

[Screenshot 2024-03-25 at 11:53:30: memory usage graph]

Running yatai 1.1.13

Is this happening to anyone else here?
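
For reference, one way to watch the pod's memory and the eviction events over time (this assumes metrics-server is installed in the cluster):

```shell
# Per-container memory of the pods in yatai-system (requires metrics-server)
kubectl top pod -n yatai-system --containers

# Eviction events recorded in the namespace
kubectl get events -n yatai-system --field-selector reason=Evicted
```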

@aytunc-tunay

I have the same problem; even though I asked in the Slack channel, I couldn't get an answer.

I'm using the image quay.io/bentoml/yatai:1.1.13, but it keeps getting OOM killed in my cluster.

[Screenshot 2024-03-29 at 18:40:44: memory usage graph]

FogDong (Member) commented Apr 2, 2024

Hi, can you provide the Yatai version? Is it also 1.1.13? @a-pichard

@a-pichard (Author)

Yes, I am running Yatai 1.1.13.

@aytunc-tunay

It's been three weeks; is there anything you can give us to help understand the cause and possibly how to fix it?

FogDong (Member) commented Apr 15, 2024

Sorry for the late reply. According to the release notes, I don't think the leak was introduced in 1.1.13; it probably comes from an older version, since 1.1.13 only includes a minor fix to the Helm chart.
If you can tell us a version that does not have the memory leak problem, that would help us find the root cause.
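
For anyone bisecting versions, a rough sketch of rolling the chart back; it assumes Yatai was installed from the BentoML Helm repo into yatai-system, and the release name and target version here are only examples:

```shell
# Assumption: Yatai was installed via Helm from the BentoML charts repo
helm repo add yatai https://bentoml.github.io/helm-charts
helm repo update

# Roll the release back to an earlier chart version and watch whether memory still grows
helm upgrade yatai yatai/yatai -n yatai-system --version 1.1.12
```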

@aytunc-tunay

I downgraded to 1.1.11 and the memory leak still continues. I checked the processes running inside the container, and the only one was "/app/api-server serve -c /conf/config.yaml"; it consistently reaches the 3Gi memory limit and gets OOM killed, without any significant increase in workload. The application configuration and Kubernetes setup are standard, with memory limits set as expected. Could you please help identify what might be causing this memory growth?
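
For reference, this is roughly how the container processes and the OOM kills can be checked; the deployment name and labels below are assumptions, adjust them to your install:

```shell
# Assumption: the API server runs as a Deployment named "yatai" in yatai-system
# (works only if ps is available in the image)
kubectl exec -n yatai-system deploy/yatai -- ps aux

# The last container state should show Reason: OOMKilled, Exit Code: 137
kubectl describe pod -n yatai-system -l app.kubernetes.io/name=yatai | grep -A 5 "Last State"
```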

FogDong (Member) commented Apr 21, 2024

/app/api-server serve is the entrypoint of the Yatai backend.
cc @yetone
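
In case it helps with root-causing: if that Go api-server exposes a net/http/pprof endpoint (I don't know whether the Yatai image does; the port and path below are assumptions), a heap profile could narrow down what is holding the memory:

```shell
# Assumption: pprof is served on port 6060 inside the pod
kubectl port-forward -n yatai-system deploy/yatai 6060:6060 &

# Print the top heap consumers; compare two snapshots taken some time apart
go tool pprof -top http://localhost:6060/debug/pprof/heap
```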

nrlulz commented Sep 12, 2024

I have been seeing this too. It seems to have something to do with the version of yatai-deployment.

[Image: memory usage graph around the yatai-deployment upgrade and downgrade]

In the above graph, yatai-deployment was upgraded to 1.1.21 around 17:00, and downgraded back to 1.1.13 around 14:00. I have a 1GiB memory limit set. I will play with it some more and see if I can pin down the exact version that introduces the leak.

edit:
Yatai version is 1.1.13.

It looks like it is yatai-deployment 1.1.19 that introduces the leak.
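
As a stop-gap until this is fixed, pinning yatai-deployment back to a pre-1.1.19 chart seems to avoid the leak (1.1.13 behaved in the graph above). A sketch, assuming the standard Helm repo and a yatai-deployment namespace:

```shell
# Assumption: yatai-deployment was installed via Helm into the yatai-deployment namespace
helm upgrade yatai-deployment yatai/yatai-deployment \
  -n yatai-deployment --version 1.1.13
```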
