Some Rooms Require Over 20GB Storage #21
Comments
So I doubled my plan to investigate this, with strange results. It used ~21GB, and after finally getting it to finish, it wants to actually add rows (
That's definitely related to #7. I ran into a roadblock because most operations on Postgres, and using this tool on rooms with over 3 million states, want to use well over 10GB. So now I'm running it on my remaining rooms with 80GB free, to see what happens. I'm still not sure whether the issues I run into while running the state compressor come from me being overly optimistic about how Postgres works, or from a quirk of the tool itself.
Another run that looks a lot like #7 to me:
Note: the first run consumed over 12GB and crashed; the second run (with a larger disk) consumed ~11GB, then usage steadily decreased as it ran for several more minutes. Additionally, the process took a couple of minutes to exit after the final output. I think it's correct to assume I shouldn't add that to my DB? Edit:
Edit 2: For rooms with over 9 million states, over 6.5GB of RAM is consumed, and I hit #6.
I'm wondering if the issues described here are symptoms of matrix-org/synapse#3364.
@TheDiscordian can you count the number of state groups for those problematic rooms manually, via an SQL query (see the sketch below), to check whether the count of state groups reported by the script matches the real count, or whether something is going wrong? For example, I have some rooms that have 10+ million state groups:
And this is not an effect of the matrix-org/synapse#3364 issue, because the count is less than a million:
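A count along these lines should show the real numbers (a minimal sketch, assuming the standard Synapse schema where both state_groups and state_groups_state carry a room_id column; the user, database name, and room ID are placeholders to adjust):

```sh
# Count a room's state groups, and the state_groups_state rows behind them.
# Table and column names assume the standard Synapse Postgres schema.
psql -U synapse_user -d synapse -c \
  "SELECT COUNT(*) FROM state_groups WHERE room_id = '!zTAqnOWiFuKTlnGOhq:matrix.thedisco.zone';"
psql -U synapse_user -d synapse -c \
  "SELECT COUNT(*) FROM state_groups_state WHERE room_id = '!zTAqnOWiFuKTlnGOhq:matrix.thedisco.zone';"
```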
I have successfully run the script to optimize the one room with 64,182,319 state events; the count after compression is 8,655,766 rows (13.4%).
FWIW, with the new auto-compressor I'm currently looking for settings that might mitigate this. I'm at over 50GB on some rooms, and it's wild paying for VPS space that's only used occasionally for maintenance. I'm running experiments right now, but it's slow even with NVMe drives for each test, so I doubt I'll have much to show for it. My DB was over 250GB last I looked :/. I think this space is all used for a cache? If I could pick the drive, I could probably mitigate this (though I can't find an upper limit to these sizes...). BTW @MurzNN, I believe I did run that query of yours in another issue and we talked there. I didn't mean to ignore you for nearly 2 years :').
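A run of the auto-compressor with deliberately small chunks might look like the sketch below; the connection string is a placeholder, the -p/-c/-n options are as described in the project's README, and whether smaller chunks actually reduce the temporary disk usage described in this thread is an assumption to be tested, not a confirmed result.

```sh
# Sketch: run the auto-compressor over a limited number of small chunks per pass.
# -p: Postgres connection string (placeholder credentials)
# -c: number of state groups to work on per chunk
# -n: number of chunks to compress in this run
synapse_auto_compressor \
  -p "postgresql://synapse_user:password@localhost/synapse" \
  -c 500 -n 100
```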
Got into the same problem today. One room took over 24GB of space (the whole DB is ~60GB), and my PostgreSQL partition ran out of space. Afterwards, the server never really worked again (Synapse starts out with a few SELECT operations that never finish, and then everything related to the database just times out into oblivion). I would not recommend using this tool unless you have a virtually unlimited amount of disk space :c
Describe the bug
On the "Fetching state from DB for room" step, storage space is continuously consumed until space is empty, errors, then frees the space. Might be related to #6 (maybe even a duplicate?), but I'm talking about storage space, not memory. I only have 14GB storage right now, but two of my rooms with only 110141 and 167096 state groups can't seem to have this tool successfully run, because I run out of storage.
To Reproduce
Honestly not sure how you'd reproduce it. I can't see anyone else with the issue; it happens when I run it on !zTAqnOWiFuKTlnGOhq:matrix.thedisco.zone and !tmgqjKkMXUbqUHECPV:matrix.thedisco.zone. I don't know what those rooms are.
Expected behavior
Same as when I run it on any other room: it consumes a bit of storage, then finishes normally.
VPS:
I have ~14GB of storage free and 3GB of RAM available while running this tool.
Additional context
Command run:
./synapse-compress-state -t -o state-compressor.sql -p "host=localhost user=<redacted> password=<redacted> dbname=<redacted>" -r "!zTAqnOWiFuKTlnGOhq:matrix.thedisco.zone"
Error received:
It seems to be a PostgreSQL error, so if this is an upstream issue I guess this can be closed. However, it would be nice to understand why this happens on these rooms and not others. I'm currently running it on larger rooms and I'm not even noticing storage being consumed, yet that one room wants to use over 14GB.
I'm still running this on several rooms, I'll update if I notice anything else related.