athenad memory leak #34079
Comments
It's hard to pinpoint the exact cause, as the code is difficult to follow. However, several factors could be contributing to the resource leak. One possibility is that athenad's implementation lets objects accumulate over time without being garbage-collected, leading to memory issues during long runs.
The following issues and fixes address potential resource leaks and long-term accumulation risks:
This is still happening after uploading files.
I see this after uploading rlogs:
The issue may not be limited to the upload process; growth in the number of WebSocket instances could indicate deeper resource management problems.
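One way to check whether instances of a given class keep growing is to count the objects tracked by Python's garbage collector. The sketch below is only an illustration: `count_live_instances` and `FakeWebSocket` are hypothetical names, not part of athenad, and matching on the class name alone is a simplifying assumption.

```python
import gc
from collections import Counter


def count_live_instances(type_names):
    """Count gc-tracked objects whose class name is in type_names."""
    wanted = set(type_names)
    counts = Counter()
    for obj in gc.get_objects():
        name = type(obj).__name__
        if name in wanted:
            counts[name] += 1
    return counts


class FakeWebSocket:
    """Hypothetical stand-in for a real WebSocket class."""


if __name__ == "__main__":
    sockets = [FakeWebSocket() for _ in range(3)]
    print(count_live_instances(["FakeWebSocket"]))
```

Logging this count periodically (for example before and after an upload) would show whether instances are released or keep piling up.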
@sshane: Do we really need the It may not be directly related to this issue, but I recommend removing this thread until we have a more robust solution.
The current athenad implementation lacks protections for concurrent data access in a multi-threaded environment. Besides PR #34084, which fixes race conditions for global variables, the
@sshane: The current implementation does not preserve the order between received commands and their responses. Responses may be sent out of order. Commands are placed in the Is this behavior by design, or should it be considered an issue?
The order of responses does not matter if the backend server handles them correctly, since the JSON-RPC spec covers this. The 'requester' sends a request with a unique ID, which could just be a ns timestamp, for example. When the 'responder' (athena in this case) responds, it also sends the ID in the payload: https://www.jsonrpc.org/specification#response_object. This is done automatically, so it's opaque if you just look at the athena code. Comma's connect backend server should be generating the pseudo-unique IDs and buffering the responses in a hashmap. Although as seen here https://github.com/commaai/connect/blob/79880f1203269eb145bc87863808e6cfca797ed1/src/actions/files.js#L147
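To illustrate the ID matching described above, here is a minimal sketch of a requester that keeps pending requests in a dict and matches responses by `id`, regardless of arrival order. `PendingRequests` and the method names are hypothetical; this is not the connect backend's or athenad's actual code.

```python
import itertools
import json


class PendingRequests:
    """Tracks outstanding JSON-RPC 2.0 requests by their id."""

    def __init__(self):
        self._ids = itertools.count(1)  # simple unique id source
        self._pending = {}              # id -> method name

    def make_request(self, method, params=None):
        req_id = next(self._ids)
        self._pending[req_id] = method
        return json.dumps({"jsonrpc": "2.0", "method": method,
                           "params": params or {}, "id": req_id})

    def handle_response(self, raw):
        """Return (method, result) for the request this response answers."""
        msg = json.loads(raw)
        method = self._pending.pop(msg["id"], None)
        if method is None:
            raise KeyError(f"no pending request with id {msg['id']!r}")
        return method, msg.get("result")


if __name__ == "__main__":
    pending = PendingRequests()
    first = json.loads(pending.make_request("methodA"))
    second = json.loads(pending.make_request("methodB"))
    # Responses arrive in reverse order but are still matched by id.
    for req in (second, first):
        reply = json.dumps({"jsonrpc": "2.0", "result": "ok", "id": req["id"]})
        print(pending.handle_response(reply))
```

Because each response carries the request's `id`, the requester never relies on ordering; popping the entry also keeps the pending map from growing without bound.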
Can't seem to reproduce this by re-uploading a large number of rlogs.
It was using 15% of memory, causing low memory events. I had not rebooted the device in a few days iirc and had uploaded many rlogs.
I tried to attach with memray, but instead the process just crashed so I didn't get any information about this. @deanlee could this be similar to the leak you found in athena previously?
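If attaching an external profiler is not an option, the standard library's `tracemalloc` can compare two snapshots taken inside the process. This is a rough sketch that assumes the snapshot code can be added to the process under investigation; `top_growth` is a hypothetical helper, and nothing here is taken from athenad.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocated size grew the most."""
    stats = after.compare_to(before, "lineno")
    return [s for s in stats if s.size_diff > 0][:limit]


if __name__ == "__main__":
    tracemalloc.start()
    before = tracemalloc.take_snapshot()
    retained = [bytes(1000) for _ in range(1000)]  # simulated accumulation
    after = tracemalloc.take_snapshot()
    for stat in top_growth(before, after):
        print(stat)
```

Taking snapshots before and after a batch of uploads would point at the lines responsible for any retained memory, at the cost of some tracing overhead.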