A lot of 150/2 transactions in the txpool causes memory spike / OOM #9317
I had a similar issue, which has now happened for the second time since I upgraded from v0.18.3.2 to v0.18.3.3 (it looks more like issue #9315). Here is what monerod shows: Hopefully this helps. Something in the latest version seems to cause this issue. Greetings! |
This is not related to the latest version. The reason this only happened in the last week is that before then no one had ever added a ton of 150/2 transactions to the txpool. |
What exactly are 150/2 transactions? Can you point me to any documentation? Thanks! |
150/2 means transactions with 150 inputs and 2 outputs. It's basically someone consolidating a lot of inputs. Such transactions are large and take a while to verify. Someone has transacted a lot of these at the same time which caused the tx pool to fill up. |
When you say 'takes time to verify': would a server that is also used for mining benefit from leaving a few extra threads available for such transaction spikes? Or is the CPU not really the bottleneck, but rather RAM or SSD? Interestingly, a manual restart of monerod always works without any problems; it syncs back up and runs as if nothing ever happened. |
I just checked my node today and it reported being 2 months ahead after the issue I had that was linked here.
This is the output of diff
|
@Column01 you can ignore that message; it means a different peer claimed that 3097104 is the correct height, which isn't true. Your node doesn't have any issues. |
Leaving a couple of threads available is recommended for good node performance, but I don't think it would have helped in this specific case.
Nothing built in, but you can use for example systemd for this. |
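As a rough illustration of the systemd option, a minimal unit file might look like the sketch below. The paths, user, and flags are assumptions to adapt to your own setup, not an official unit file.

```ini
# /etc/systemd/system/monerod.service  (illustrative sketch; adjust paths and user)
[Unit]
Description=Monero daemon
After=network-online.target

[Service]
User=monero
ExecStart=/usr/local/bin/monerod --non-interactive
Restart=on-failure
RestartSec=30

[Install]
WantedBy=multi-user.target
```

With something like this in place, `systemctl enable --now monerod` starts the daemon and systemd restarts it if it gets OOM-killed; Docker users can get a similar effect with a restart policy such as `--restart unless-stopped`.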
Some people reported the same issue on Reddit; copy & pasting it here:
|
That is very interesting indeed. I do not think this is a coincidence, then |
I will adjust the monerod process priority based on this documentation and see what happens. https://www.baeldung.com/linux/memory-overcommitment-oom-killer |
Most of my processes are running with an oom_score of '666'. I adjusted the oom_score_adj to -700 (sudo choom -p MONEROD_PROCESSID -n -700), which made it number 2 in line after the unkillable processes with an oom_score of 0. If it gets killed now with 32GB of RAM, we have a much bigger issue concerning this attack vector. I still cannot grasp how so much memory can be used in this case. Can anyone point me to some simple math for how we get into the 10+GB range of memory usage? |
This could be because of how txs are relayed. When we receive txs in the fluff stage, we add them to our fluff queue and set a timer on each connection except the one that sent the txs to us. Then, when the timer is up, we build a message and the txs get sent on that connection:
monero/src/cryptonote_protocol/levin_notify.cpp Lines 396 to 400 in c821478
monero/src/cryptonote_protocol/levin_notify.cpp Lines 203 to 207 in c821478
So it seems that for every tx that's broadcasted, the bytes of the tx are copied for each connection that receives it. So if we have 96 connections, each tx will be fluffed to 95 peers (1 of them sent the tx to us). It's easy to see how this can add up: 10GB / 95 = ~108MB, which means we only need around 108MB of txs for this to potentially use 10GB with 96 connections (I don't know how big the tx pool was when the crashes happened). It's more nuanced than I laid out here, e.g. each connection has a randomized timer so not all 95 will fluff at once, and I haven't actually tested this so I might be missing something. @vtnerd what do you think? |
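To make the arithmetic above concrete, here is a minimal sketch of the multiplication effect. It is not monerod code; the pool size and connection count are illustrative assumptions taken from the numbers in this thread.

```cpp
#include <cstddef>
#include <iostream>

int main()
{
    // Illustrative assumptions from the discussion above, not measured values:
    const std::size_t pool_bytes    = 108ull * 1024 * 1024; // ~108MB of queued txs
    const std::size_t connections   = 96;                   // total P2P connections
    const std::size_t fluff_targets = connections - 1;      // everyone except the sender

    // If the raw tx bytes are copied once per receiving connection, the peak
    // extra memory is roughly pool_bytes * fluff_targets.
    const double peak_gib = static_cast<double>(pool_bytes) * fluff_targets
                            / (1024.0 * 1024.0 * 1024.0);
    std::cout << "approx. peak copied data: " << peak_gib << " GiB\n"; // ~10 GiB
}
```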
@Boog900 The txpool reached around 100MB, but those transactions got propagated over the course of about an hour; they didn't get submitted all at once. |
[1842267.939549] Out of memory: Killed process 37137 (monerod) total-vm:292086252kB, anon-rss:57011624kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:507476kB oom_score_adj:0 How much memory do I need? sheesh. : - ) |
Do I see this correctly that your VM killed the process with 292GB available? Or did monerod use that much memory? This obviously seems to be a more widespread issue. Is there anything I can do to help? For example, logging certain processes to help isolate the problem or find a fix? |
Currently I am running around 40-60 (out) and 100-150 (in) connections. monerod managed to get through 2 failed 'parse transaction from blob' events, which before was often the trigger for an OOM kill. See the status below: 2024-05-06 21:42:51.443 I difficulty: 271257517664 |
"parse transaction from blob" is unrelated |
I had 64GB RAM in a VM, monerod was running under Docker, got OOM, docker container wasn't set to auto restart. Now it is, so at least when it crashes it can restart itself. |
@selsta I had an unusually low number of incoming connections during the May 2 consolidation flood, which couldn't be helped even with multiple restarts and a good number of while stalled, my txpool was empty all the time and the best known height alternated between my local height and the actual height I saw on block explorers. |
@chaserene What about the number of incoming and outgoing connections before your node "crashed" / got killed for the first time, not afterwards? |
Is limiting in/out peers the best potential recommendation for node operators at this time? |
@spackle-xmr limiting in / out is currently a theory; it hasn't been confirmed to work yet.
I have seen that behaviour, yes. I had trouble getting incoming connections. |
Ah ok, transactions are periodically re-broadcasted after being in the txpool for too long, so this could still happen if a lot of tx re-broadcasts line up.
I mentioned this in -dev a while ago but I'll put it here again for visibility. I think this is also because of how txs are relayed. If you receive a tx, add it to your fluff queue, drop the tx from the txpool, then receive it again, it is added to the fluff queue a second time. When the fluff timer fires, we would then broadcast a message with the same tx twice, causing the peer to disconnect. I have tested this and I have managed to make monerod broadcast a message with the same tx twice. It makes sense that incoming connections are affected more, as they have a longer average fluff timer: 5 seconds instead of 2.5 seconds for outbound, so there is a longer window in which to receive the tx again and add it to the queue. |
The txpool was empty again after around 12h, would you hat cause re-broadcasts? Also any ideas to improve efficiency here?
Would it be possible to add some check to the fluff queue to disallow duplicate transactions?
I had trouble keeping both outgoing and incoming connections. I'm not sure if incoming had more issues, but what you are saying seems plausible. |
sorry, I don't know what you mean.
Yes, changing how txs are broadcasted around the network. Instead of sending the whole tx blob to every connection (except the one that sent the tx to us first), send the tx hash, and the peer can send a request for that tx if it doesn't already have it. This would require a couple of new p2p messages; adding support for these can be done before the changes to how txs are broadcasted, which is how I would recommend doing it. I am planning to make a proposal for this in a separate issue, as I am pretty sure it would significantly reduce the amount of bandwidth used by nodes, but I first want to measure how much bandwidth is currently wasted to see just how much we can save. For people interested, the way I am planning to test this is by recording the data sent and received by monerod in a fully synced state. Then, for each link, I will monitor the txs sent along it; if the same tx is sent twice (in either direction), both tx messages count as wasted, because it means the first of the 2 was not the first time the node received that tx.
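For illustration, here is a rough sketch of what the receive side of such a hash-based relay could look like. The message and function names are made up for this example; they are not existing monerod p2p messages.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical message types for the proposed hash-based relay.
struct notify_tx_hashes { std::vector<std::string> tx_ids; }; // peer announces tx ids it has
struct request_txs      { std::vector<std::string> tx_ids; }; // we ask only for the ones we miss

// On receiving an announcement, request only txs we don't already know about,
// so full tx blobs cross each link at most once instead of once per neighbour.
request_txs on_notify(const notify_tx_hashes& msg,
                      const std::unordered_set<std::string>& known_txs)
{
    request_txs req;
    for (const auto& id : msg.tx_ids)
        if (known_txs.count(id) == 0)
            req.tx_ids.push_back(id);
    return req;
}
```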
Probably, but the queue is kept as raw bytes, so we would either need to keep a separate list of the tx ids in the queue or compare the raw bytes of every tx in the queue to the one we are about to broadcast. If we adopted the new way of broadcasting, this would be a lot easier, as the queue would hold tx ids rather than raw bytes. |
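A minimal sketch of the first option described above: keep a set of tx ids next to the raw-byte queue and refuse duplicates. This is a simplified model with made-up names, not the actual levin_notify code.

```cpp
#include <deque>
#include <string>
#include <unordered_set>
#include <utility>

// Simplified per-connection fluff queue that rejects duplicate txs.
class fluff_queue
{
    std::deque<std::string> blobs_;              // raw tx bytes, in broadcast order
    std::unordered_set<std::string> queued_ids_; // ids of txs currently queued

public:
    // Returns false (and queues nothing) if this tx is already waiting to be fluffed,
    // e.g. because we received it again after dropping it from the txpool.
    bool push(const std::string& tx_id, std::string blob)
    {
        if (!queued_ids_.insert(tx_id).second)
            return false;
        blobs_.push_back(std::move(blob));
        return true;
    }

    // Called when the fluff timer fires: hand over the queued blobs and reset.
    std::deque<std::string> flush()
    {
        std::deque<std::string> out;
        out.swap(blobs_);
        queued_ids_.clear();
        return out;
    }
};
```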
Sorry I mistyped and didn't proofread. I wanted to ask if transactions are typically re-broadcasted within the first 12h of being in the txpool, because that's usually the timeframe of the txpool clearing up again.
That sounds like a good idea! Also regarding the drop peer issue with |
Yes, txs are rebroadcasted after 5 mins, then 10, then 15; after every broadcast the wait until the next one increases by 5 mins, capped at 4 hours (a small sketch of this schedule follows below).
If people have a need to have an extremely low txpool we could temporarily, but we would need people to update to fix this issue, not just the people actually having the issue. I think it makes more sense to just recommend that people not set the txpool weight too low until we fix monerod sending duplicate txs. |
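As a small aside, the re-broadcast schedule described above (5, 10, 15 minutes, capped at 4 hours) can be written down directly; this is only a restatement of the comment, not monerod source.

```cpp
#include <algorithm>

// Minutes to wait before the nth re-broadcast of a tx still in the txpool,
// per the schedule described above: 5 * n minutes, capped at 4 hours.
int rebroadcast_delay_minutes(int nth_rebroadcast)
{
    return std::min(5 * nth_rebroadcast, 4 * 60);
}
```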
@selsta I don't remember how many connections my monerod had before I first restarted it, but I checked my logs and I see quite a few of the following around that time:
|
@chaserene I'm not sure if your issue is the same as the issue I opened here. I did not see any of these messages on my nodes. I only had issues with high RAM usage. What kind of hardware do you have? What daemon config do you use? |
Is it possible to trigger this (the bolded part) remotely, or do you have to manually drop the tx from the txpool? |
@vtnerd for this to happen there has to be a large txpool backlog (for example 1MB) and the node has to be set to a smaller txpool limit with |
I have noticed a couple of times now that monerod gets OOM-killed when there are a lot of 150/2 transactions in the txpool. This happened on VPSes with both 8GB and 16GB RAM.