
fix_: limit the maximum number of message hashes by query hash #5688

Closed · wants to merge 2 commits

Conversation

@richard-ramos (Member)

Team, do you think this could be added to the release or is it too risky at this point?

Without this change, when missing-message verification is enabled there is no limit on the number of missing message hashes requested per query, which can increase the load on the store nodes. With this change, the number of message hashes per query is capped at 50.

The equivalent PR for the develop branch will be created once waku-org/go-waku#1190 is merged.

@status-im-auto (Member) commented Aug 9, 2024

Jenkins Builds

|    | Commit  | #  | Finished (UTC)      | Duration | Platform  | Result |
|----|---------|----|---------------------|----------|-----------|--------|
| ✔️ | 1915ab9 | #1 | 2024-08-09 19:51:20 | ~2 min   | tests-rpc | 📄log  |
| ✔️ | 1915ab9 | #1 | 2024-08-09 19:52:48 | ~4 min   | linux     | 📦zip  |
| ✔️ | 1915ab9 | #1 | 2024-08-09 19:53:09 | ~4 min   | ios       | 📦zip  |
| ✔️ | 1915ab9 | #1 | 2024-08-09 19:54:04 | ~5 min   | android   | 📦aar  |
| ✔️ | 1915ab9 | #1 | 2024-08-09 20:34:47 | ~46 min  | tests     | 📄log  |
|    | 34112a9 | #2 | 2024-08-12 17:33:13 | ~26 sec  | ios       | 📄log  |
|    | 34112a9 | #2 | 2024-08-12 17:33:26 | ~36 sec  | linux     | 📄log  |
| ✖️ | 34112a9 | #2 | 2024-08-12 17:33:56 | ~1 min   | tests-rpc | 📄log  |
| ✖️ | 34112a9 | #2 | 2024-08-12 17:34:06 | ~1 min   | tests     | 📄log  |
|    | 34112a9 | #2 | 2024-08-12 17:34:11 | ~1 min   | android   | 📄log  |

@chaitanyaprem (Contributor) left a comment:


LGTM

@chaitanyaprem (Contributor)

> Team, do you think this could be added to the release or is it too risky at this point?

I think we can add it to the release if we dogfood the changes and see no issues, especially if it helps improve store node performance.

@richard-ramos (Member, Author)

Sounds good, I'll create a dogfooding PR and hopefully we can do a quick dogfooding session early next week to get this merged ASAP! 🚀

@kaichaosun (Contributor) left a comment:


LGTM

@ilmotta (Contributor) left a comment:


> Team, do you think this could be added to the release or is it too risky at this point?

@richard-ramos, code looks good, but before merging directly into the release branch we can ask the mobile QA team @status-im/mobile-qa to have a look (we just need to quickly create a mobile PR pointing to this branch).

It would be less risky if FetchHistory were covered by tests.

@richard-ramos (Member, Author)

I have created the following test PRs:

@igor-sirotin (Collaborator) left a comment:


But isn't pagination already setting this limit? 🤔
And why is it better to do a few parallel requests (to the same store node) rather than sequential ones?

Review thread on wakuv2/missing_messages.go (outdated, resolved)
@richard-ramos (Member, Author)

> But isn't pagination already setting this limit? 🤔

The query we do kinda looks like this:

```sql
SELECT *
FROM messages
WHERE messageHash IN (msgHash1, msgHash2, ..., msgHashN)
LIMIT 100
```

The pageLimit will limit the results only after the filtering per messageHash is done. I.e., if you pass 300 message hashes in the IN condition, the database will find all of those messages first and only then apply the limit, and it has to repeat that work for each page of results if we pass a cursor. Since messageHash is unique, it's more efficient to pass a smaller hash list and return the whole resultset at once: the database doesn't have to find all the records only to return a subset of them, which also means less load on the DB.
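
As an illustration, here is a minimal Go sketch of the batching idea; the constant `maxHashesPerQuery` and the `chunk` helper are hypothetical names for this example, not the actual status-go API:

```go
package main

// maxHashesPerQuery mirrors the 50-hash cap this PR introduces.
const maxHashesPerQuery = 50

// chunk splits the full list of missing message hashes into slices of
// at most maxHashesPerQuery elements, so every store query carries a
// bounded IN (...) list instead of one arbitrarily large one.
func chunk(hashes []string) [][]string {
	var batches [][]string
	for i := 0; i < len(hashes); i += maxHashesPerQuery {
		end := i + maxHashesPerQuery
		if end > len(hashes) {
			end = len(hashes)
		}
		batches = append(batches, hashes[i:end])
	}
	return batches
}
```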


Regarding sequential vs. parallel: I believe sending smaller requests concurrently is the better approach, as it can significantly reduce the overall time needed to retrieve all the data (a sketch follows the list below). I'm not 100% sure nwaku will really benefit from this change, but some of the benefits of this approach are:

  • Reduced total latency: by sending requests in parallel, you reduce the overall time to receive all responses. This matters particularly when each request has significant latency (e.g., due to network delays or query processing time).
  • Better utilization of resources: nwaku uses chronos, so in theory it should handle multiple concurrent operations efficiently, and there is a connection pool for PostgreSQL. By sending requests in parallel, we should be able to take advantage of the available resources (CPU, network bandwidth, available connections) to process multiple requests simultaneously.
  • Improved user experience: users see faster response times as data is retrieved and processed more quickly.
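
A minimal sketch of dispatching those batches concurrently, reusing the hypothetical `chunk` helper from the sketch above; `queryStore` is a stand-in for whichever store-query function the client actually uses, not a real status-go signature:

```go
package main

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// queryStore stands in for a single store-node query over one batch
// of at most maxHashesPerQuery hashes (hypothetical signature).
func queryStore(ctx context.Context, batch []string) error {
	// ... send the store query and collect results ...
	return nil
}

// fetchMissing fires one query per batch concurrently; the first
// error cancels the remaining in-flight queries via the group context.
func fetchMissing(ctx context.Context, hashes []string) error {
	g, ctx := errgroup.WithContext(ctx)
	for _, batch := range chunk(hashes) {
		batch := batch // per-iteration copy (needed before Go 1.22)
		g.Go(func() error { return queryStore(ctx, batch) })
	}
	return g.Wait()
}
```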

@igor-sirotin (Collaborator)

> The pageLimit will limit the results only after the filtering per messageHash is done.

But... shouldn't we then change the query on the server (nwaku) side, rather than modifying clients so they don't overload servers? 🤔
I mean, just limit the msgHash list before putting it into the query.

@igor-sirotin (Collaborator)

> some of the benefits of this approach are: ...

Ok, I guess it makes sense in this case 👍

Though, again, some of these points sound like we're trying to make clients use servers more efficiently, while we could make servers smarter so they work efficiently even with dumb clients 😄 (e.g., to utilize the PostgreSQL connection pool, the SQL request could be split into multiple requests on the server side).

@richard-ramos (Member, Author)

Closing: we decided not to go with this PR for 2.30, as described in status-im/status-mobile#21021 (comment).
