Keep better track of memory #78

Open
msm-code opened this issue Apr 22, 2020 · 2 comments
Labels
level:hard · priority:high · type:feature

Comments

@msm-code
Contributor

We should at least pretend that we keep track of memory usage.

Ursadb will happily consume all the RAM it can get. Granted, it tries to be lightweight, but that's sometimes hard when just the file list is bigger than the available RAM.

  • DB should have an idea how much RAM it's allowed to use
  • Every command should try to estimate how much RAM it'll use, and "lock" this much RAM in the coordinator
  • Commands that can't execute due to RAM shortage should block (I think?) (see the sketch after this list)
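To make the blocking idea concrete, here is a minimal sketch of what such a coordinator-side budget could look like. The MemoryBudget class and its acquire/release names are hypothetical; nothing like this exists in UrsaDB yet.

```cpp
// Minimal sketch of a blocking RAM budget, assuming a single coordinator
// hands out reservations. All names here (MemoryBudget, acquire, release)
// are made up for this sketch.
#include <condition_variable>
#include <cstdint>
#include <mutex>

class MemoryBudget {
public:
    explicit MemoryBudget(uint64_t limit_bytes) : limit_(limit_bytes) {}

    // Blocks until `bytes` can be reserved without exceeding the limit.
    void acquire(uint64_t bytes) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [&] { return used_ + bytes <= limit_; });
        used_ += bytes;
    }

    // Returns a previously acquired reservation and wakes blocked commands.
    void release(uint64_t bytes) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            used_ -= bytes;
        }
        cv_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    uint64_t limit_;
    uint64_t used_ = 0;
};
```

A command would call acquire() with its estimate before executing and release() when it finishes (ideally via an RAII guard). Estimates larger than the whole limit would have to be rejected up front, otherwise they'd block forever.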
@msm-code msm-code added the level:hard, priority:high and type:feature labels on Apr 22, 2020
@chivay
Contributor

chivay commented Apr 22, 2020

> DB should have an idea how much RAM it's allowed to use

👍

> Every command should try to estimate how much RAM it'll use, and "lock" this much RAM in the coordinator

Seems acceptable if we're abandoning the idea of worker threads living outside of the main process (which seems a bit limiting). Additionally, predicting and measuring the memory consumption isn't that trivial imo.

I would try to utilize mechanisms provided by the OS and move the workers into a separate cgroup.
But I'm open to other ideas ;)
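For reference, the cgroup route could be roughly as simple as the sketch below. It assumes Linux cgroup v2 with the hierarchy mounted at /sys/fs/cgroup, the memory controller enabled for the parent group, and enough privileges to write there; the group name "ursadb-workers" and the confine_worker helper are made up for illustration.

```cpp
// Rough illustration of the cgroup idea (Linux cgroup v2 assumed).
// Creates a group with a hard memory limit and moves a worker into it.
#include <cstdint>
#include <fstream>
#include <string>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

bool confine_worker(pid_t worker_pid, uint64_t limit_bytes) {
    const std::string group = "/sys/fs/cgroup/ursadb-workers";
    mkdir(group.c_str(), 0755);  // ignore EEXIST for brevity

    // Hard limit: allocations past this point get reclaimed or OOM-killed
    // inside the group instead of taking the whole database down.
    std::ofstream max(group + "/memory.max");
    max << limit_bytes;
    max.flush();
    if (!max) return false;

    // Move the worker process into the group.
    std::ofstream procs(group + "/cgroup.procs");
    procs << worker_pid;
    procs.flush();
    return static_cast<bool>(procs);
}
```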

@msm-code
Contributor Author

msm-code commented Apr 22, 2020

> Additionally, predicting and measuring the memory consumption isn't that trivial imo.

Far from trivial, so I'm not sure how to handle it. For example, it's almost impossible to predict exactly how much RAM a query will need (unless we dump temporary results to disk, which will be pretty slow).

> I would try to utilize mechanisms provided by the OS and move the workers into a separate cgroup.

Makes sense as a last resort, but ideally I think the db should reject new tasks if it thinks they'll cause OOM 🤔. But failing cleanly due to a cgroup OOM is still 10x better than taking the whole db down.

Edit: also, in the case of queries, we can dump temporary results to disk to save RAM. This would make us 100% resistant to OOM, but it may be too slow to do by default.
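A rough sketch of what spilling could look like for a single intermediate result set follows. The SpillingResultSet name, the in-memory cap, and the raw uint32_t on-disk format are all invented for illustration, and merging the spilled batches back afterwards is left out.

```cpp
// Sketch of "spill temporary results to disk": keep at most `cap` file IDs
// in memory and write full batches to a temporary file instead of growing
// the in-memory buffer without bound.
#include <cstdint>
#include <cstdio>
#include <vector>

class SpillingResultSet {
public:
    SpillingResultSet(FILE *spill_file, size_t cap)
        : spill_(spill_file), cap_(cap) {}

    void push(uint32_t file_id) {
        buffer_.push_back(file_id);
        if (buffer_.size() >= cap_) {
            flush();  // trade RAM for disk I/O once the buffer fills up
        }
    }

    // Caller should also flush() once at the end before reading back.
    void flush() {
        fwrite(buffer_.data(), sizeof(uint32_t), buffer_.size(), spill_);
        buffer_.clear();
    }

private:
    FILE *spill_;
    size_t cap_;
    std::vector<uint32_t> buffer_;
};
```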
