[Milestone] Waku Network Can Support 10K Users #12
Comments
tagging Testing team: @AlbertoSoutullo, @0xFugue, @Daimakaimura |
Achieving network requirements: tasks and ownership
1. Verify scaling target requirements
Understand the expected:
for 10K community users.
Tracked in: ??
2. Community sharding plan
Sharding strategy for Waku relay in general and Status Communities specifically. This plan will consider short term and longer term strategies.
Tracked in: vacp2p/research#154
3. Simple Waku Relay DoS mitigation
Strategy and implementation to protect relay and store against simple DoS attack vectors. This item is set out in more detail in @kaiserd's Secure Scaling Roadmap.
Tracked in: vacp2p/research#164
4. Scalable storage: nwaku archive PostgreSQL implementation
Already part of #8 but repeated here for completeness.
Tracked in: #4
5. Scalable storage: deterministic message ID
Tracked in: vacp2p/rfc#563
6. Scalable storage: testing store at scale
Basic testing to see that PostgreSQL implementation works at expected message and query rates. (Note this is in addition to simulation with Kurtosis).
Tracked in: ??
7. Filter and lightpush improvements
Revising the RFCs and implementations in nwaku and go-waku. Already part of #8 but repeated here for completeness.
Tracked in: #5
8. Peer management strategy
RFC for basic peer management strategy and implementations in nwaku and go-waku.
Tracked in: waku-org/nwaku#1353
9. Combine into comprehensive scaling strategy
This can be seen as the final goal for all the moving parts and separate tasks listed above. Output will likely take the form of one or more Best Current Practices RFCs that focus on the Status 10K use case. It will bring together the short term strategies for sharding, DoS mitigation, bootstrapping, discovery and store configuration. It may include suggestions on when to use lightpush and filter rather than relay.
Tracked in: vacp2p/research#165
10. Targeted dogfooding
This is in addition to simulation with Kurtosis. Individual owners of each task will be responsible for testing and dogfooding their strategies/features. This task ensures that we have considered each item for targeted network testing, including:
Tracked in: ??
11. New
|
Achieving other requirements: tasks and breakdown
1. Wakurtosis: first network test
This is described in vacp2p/wakurtosis#7. It covers testing the scalability of the relay protocol, specifically measuring:
Owner:
2. Wakurtosis: analyze first test results
This step will either confirm our (positive) assumptions about relay scalability or highlight bottlenecks/bugs in the protocol or implementations, which must be addressed and considered in the overall network roadmap.
Owners:
4. Wakurtosis: plan next tests
This is a collaborative task flowing from the results of the first test to refine the simulation(s) and plan the next, most useful tests.
Owners:
5. Community Protocol: move to Vac RFC repo
This is an administrative step. It may require updating the RFC to match the latest implementation, moving sections around, etc.
Owner:
6. Community Protocol: review protocols
Grasping the content of each protocol and how it maps to real-world Waku network traffic. This is potentially an involved task, so the scope should be minimized for this MVP. This relates to
Owner:
7. Nwaku hardening: Wakurtosis sandbox machine
Provisioning one or more performant machines that the dev team can use for sandbox testing of features using ad-hoc Wakurtosis deployments.
Owners:
8. Nwaku hardening: Wakurtosis integration testing
Integration test environment for nwaku. Most likely it will take the form of a pipeline that deploys a Wakurtosis network topology and runs a series of scripted integration tests for nwaku.
Owners:
9. Nwaku hardening: release automation
Automated release pipeline for nwaku that builds a release, compiles release notes, and publishes release binaries and a tagged Docker image for the most common OSs/architectures.
Tracked in: waku-org/nwaku#611
10. Fleet ownership: set requirements
Create a document that summarizes all the common tasks that a fleet owner generally has to do, including deployment, monitoring and debugging. This will also allow us to communicate to other platforms planning on deploying their own Waku fleets what they need to consider. The document should include a section on what Status fleet ownership specifically entails, including a procedure to log and escalate bugs/network anomalies.
Owner: @jm-clius
11. Fleet ownership: training
Based on the requirements determined above, determine who will take ownership of the Status fleets and schedule training sessions.
Owner:
|
Progress on the Message Unique ID (MUID) initiative is tracked in the following issue: waku-org/nwaku#1914 |
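The store-side deduplication that a deterministic message ID enables can be sketched as follows. This is only an illustration under stated assumptions: the ID is taken to be a SHA-256 digest over the pubsub topic, content topic, payload and timestamp, and the `Message`, `message_id` and `DedupStore` names are hypothetical. The actual field selection and encoding are defined by the RFC work tracked in vacp2p/rfc#563 and may differ.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical, simplified stand-in for a Waku message (not the real type).
    pubsub_topic: str
    content_topic: str
    payload: bytes
    timestamp_ns: int


def message_id(msg: Message) -> str:
    """Derive an ID from message content only, so every node computes the same value."""
    h = hashlib.sha256()
    # Assumed field set; length-prefix each field so adjacent fields cannot be confused.
    for field in (
        msg.pubsub_topic.encode(),
        msg.content_topic.encode(),
        msg.payload,
        msg.timestamp_ns.to_bytes(8, "big"),
    ):
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return h.hexdigest()


class DedupStore:
    """In-memory archive keyed by message ID; a database would use a unique key instead."""

    def __init__(self) -> None:
        self._messages = {}

    def put(self, msg: Message) -> bool:
        """Store msg; return False if an identical message was already archived."""
        mid = message_id(msg)
        if mid in self._messages:
            return False
        self._messages[mid] = msg
        return True

    def __len__(self) -> int:
        return len(self._messages)
```

Because the ID depends only on message content, a store node that receives the same message from several relay peers archives it once, without coordinating with other nodes.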
Thoughts on current status:
Several discussions have happened. The outputs I am aware of are: @jm-clius @richard-ramos is there more to this?
This can be closed as static sharding was delivered. The quoted issue also tracks the work for 1mil.
This needs clean-up. The implementation of MUID to avoid duplicates in store is done, which was the main reason to do it for 10k. MUID may also be used for the distributed store. @jm-clius please confirm
@jm-clius were we thinking DST simulation for this?
@jm-clius this seems done. Not sure if we tracked an output somewhere?
I suggest descoping this from Waku work. By delivering this milestone we enable Status to integrate Waku tech and start dogfooding. We are tracking hardening of Waku protocols as part of waku-org/research#3 with 2.1
Done. What issue tracked the work/output? @jm-clius
Last remaining task. Are we tracking somewhere @jm-clius ?
The other last remaining task. Are we tracking somewhere @jm-clius ? |
Thanks for revising, @fryorcraken. See my comments below.
Afaik many of the suggestions have been implemented or are in the process of being implemented, also in status-go. @richard-ramos may have a better idea of the current status. Perhaps the work that's being done in status-go should be tracked there, which would mean the Waku side can be closed?
I agree.
Yes, I would close vacp2p/rfc#563 as the only issue really needed for the 10K milestone. We also don't need to do anything else for the 1 mill milestone, but we can keep #9 open to track the work that would be necessary for the distributed store.
Initially, yes. But I think a reasonable step for the 10K epic would be (a) dogfooding and (b) local stress-testing of PostgreSQL.
Yes, I've gone ahead and closed the issue. The output here was just moving the RFCs to vac repo and revising them.
Main tracking issue was: #15 which I think can just be closed. There were also tracking issues in nwaku (and probably go-waku/js-waku).
No, the first fleet that can be used for initial tests/dogfooding is tracked here: status-im/infra-waku#1 Since this fleet has been deployed, this issue can probably be closed. This is not quite a staging fleet for Status yet; I'll link the issue I create for the Status fleet requirements below.
It is now: #61 Not a very detailed issue, but should do the trick. :) |
I think the suggestions from vacp2p/research#177 have not been implemented, or at least I could not find them in the status-go code. |
Weekly Update
All software has been delivered. Pending items are:
|
Monthly Update
Staging fleet for Status (static sharding + Postgres) has been defined and handed over to infra: waku-org/nwaku#1914 |
1k nodes simulation blogpost: vacp2p/vac.dev#123 |
Weekly Update
|
We will run one more week of internal dogfooding of static sharding + PostgreSQL in Status Communities. The go-waku and waku chat sdk teams will continue to support Status with their integration of Waku v2, but no major effort is scheduled in terms of software development and testing. |
Weekly Update
|
#97 is now done. Status QA is proceeding with testing. |
Priority Tracks: Secure Scalability
Due date: 31 May 2023
Milestone: https://github.com/waku-org/pm/milestone/5
Summary
Tasks / Epics
WakuMessage bytes vacp2p/rfc#563
Waku message UID #9 (latter issue tracks work for distributed store, not part of 10k or 1mil milestones)
6. Scalable Storage: testing store at scale
Roadmap (DST) vacp2p/research#191 (comment)
Tracked as part of Configure fleet max connections status-im/infra-nim-waku#31
Extracted questions
Network requirements
1. Message Delivery and Sharding
Assumptions:
2. Discovery
Assumptions:
3. Bootstrapping
Assumptions:
4. Store nodes (Waku Archive)
Assumptions:
5. Security
Assumptions:
Other requirements
1. Kurtosis network testing
A simulation framework and initial set of tests that can approximate:
in such a way as to prove the viability of any scaling design proposed to achieve the Network Requirements.
2. Community Protocol hardening
The Community Chat Protocols specifications are moved to Vac RFC repo.
3. Nwaku integration testing
Nwaku requires integration testing and automated regression testing for releases to improve trust in stability of each release.
4. Fleet ownership
Ownership of the infrastructure provided to Status communities should be established. This may require training and a transfer of responsibilities, which currently lie de facto mostly with the nwaku team.
Fleet ownership comprises the responsibility for:
Initial work
The requirements above will lead to a design and task breakdown. Roughly the order of work:
Ownership for all three items below is shared between Vac, Waku and Status teams:
(1) Agree on requirements above as the complete and minimal set to achieve the 10K scaling goal.
(2) A viable, KISS network design adhering to "Network requirements"
(3) Task breakdown of each item and ownership assignment