
Issues in production - memory exhaustion, segfaults #68

Closed

mvolz opened this issue Dec 21, 2018 · 5 comments

mvolz (Contributor) commented Dec 21, 2018

We've been having some issues with this in production, namely memory exhaustion and also segfaults. Unfortunately, I can't provide many more details than that at present. Have you been seeing similar issues?

We had memory exhaustion issues with the older version as well; memory just filled up more slowly.

Addressing issue #2 would probably be a good start toward diagnosing this.

mvolz changed the title from "Issues in production" to "Issues in production - memory exhaustion, segfaults" on Dec 21, 2018
dstillman (Member) commented Dec 21, 2018

We're actually running it in AWS Lambda, so our environment is pretty different.

Are you on the latest version? What does your environment look like (Node version, available memory)? Can you tell how long from start to OOM, and approximately how many requests it's fulfilling in that time?

We'll look into whether we can reproduce any memory leaks, but it'd be good to make sure we're looking in a similar environment.
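
If it helps with gathering those numbers, something like the sketch below (not part of translation-server) could log memory use and a request count at a fixed interval; `requestCount` is a placeholder you'd increment in your own request handling:

```js
// Hypothetical diagnostic sketch: periodically log RSS/heap usage and a
// running request count so time-from-start-to-OOM and request volume can
// be reported. requestCount is a placeholder to increment per request.
let requestCount = 0;

setInterval(() => {
  const { rss, heapUsed, heapTotal } = process.memoryUsage();
  const mb = (bytes) => (bytes / 1024 / 1024).toFixed(1);
  console.log(
    `uptime=${Math.round(process.uptime())}s requests=${requestCount} ` +
    `rss=${mb(rss)}MB heap=${mb(heapUsed)}/${mb(heapTotal)}MB`
  );
}, 30 * 1000);
```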

dstillman (Member) commented Dec 22, 2018

We've been able to reproduce OOM conditions by pointing translation-server at very large files (e.g., ISOs, or multiple concurrent large PDFs), so that might be what you're seeing. We have a fix in progress that should be ready shortly.
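
For reference, reproducing it can look roughly like this Node.js sketch, assuming a local server on the default port 1969 and its /web endpoint; the large-file URL is just a placeholder:

```js
// Reproduction sketch (placeholder URL): POST a link to a very large file
// to a locally running translation-server and watch its memory usage grow.
const http = require('http');

const targetUrl = 'https://example.com/huge-file.iso'; // placeholder

const req = http.request(
  { host: '127.0.0.1', port: 1969, path: '/web', method: 'POST',
    headers: { 'Content-Type': 'text/plain' } },
  (res) => {
    console.log('translation-server responded with', res.statusCode);
    res.resume(); // drain the response body
  }
);
req.on('error', (err) => console.error('request failed:', err.message));
req.end(targetUrl); // request body is just the URL to translate
```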

dstillman (Member) commented Dec 23, 2018

OK, we've limited the upstream size to 50 MB (which is probably much bigger than it needs to be, so we could consider lowering that further), and we now reject (and avoid memory-intensive parsing attempts for) documents that aren't HTML or XML. Those will both now return 400.
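
Roughly, the two checks work like the sketch below (illustrative only, not the actual translation-server code; `sendError` is a hypothetical helper):

```js
// Illustrative sketch of the two guards described above: cap the number of
// bytes read from the upstream response and reject non-HTML/XML content
// types, answering 400 in both cases. sendError is a hypothetical helper.
const MAX_UPSTREAM_BYTES = 50 * 1024 * 1024; // 50 MB cap
const ALLOWED_TYPES = /^(text\/html|application\/xhtml\+xml|text\/xml|application\/xml)\b/;

function guardUpstreamResponse(upstreamRes, sendError) {
  const contentType = upstreamRes.headers['content-type'] || '';
  if (!ALLOWED_TYPES.test(contentType)) {
    upstreamRes.destroy();   // skip memory-intensive parsing entirely
    return sendError(400, 'Unsupported content type');
  }

  let received = 0;
  upstreamRes.on('data', (chunk) => {
    received += chunk.length;
    if (received > MAX_UPSTREAM_BYTES) {
      upstreamRes.destroy(); // stop buffering oversized documents
      sendError(400, 'Response too large');
    }
  });
}
```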

There may be other things we can do, but this will hopefully address most of the problems you were seeing.

mvolz (Contributor, Author) commented Dec 23, 2018

Great, we'll give it a try after the holiday :).

mvolz (Contributor, Author) commented Jan 21, 2019

Looking much better now, thank you so much! Loading those large PDFs was what was really knocking things over. I'm closing this for now :).
