Waku fleets: separate pubsub topic for development #13

Closed
jm-clius opened this issue Jan 30, 2023 · 18 comments · Fixed by status-im/infra-nim-waku#66

Comments

@jm-clius

Note: this issue is to discuss from Waku Product POV. Once agreed it can probably be moved to an infra repo.

Problem

Dogfooding of Status Communities has recently started. Since Status currently uses the default Waku pubsub topic:

  • their traffic is added to the default network, used by most operators/testers of Waku
  • Status clients are discoverable and connected to via the default discovery method (discv5)

The Waku fleets (wakuv2.prod and wakuv2.test) are currently only subscribed to this same default pubsub topic. Everyone using the default pubsub topic plus connecting to one of the Waku fleets will therefore be sharing (1) traffic and (2) infrastructure with the Status community dogfooding.

This is potentially an issue for:
(1) other operators using the default network and who do not want to deal with Status community artefacts on the same network
(2) Status community dogfooding being affected by Waku testers/operators injecting traffic and testing on the same default network
(3) testers wanting to try out Waku who do not have a "clean" network to bootstrap to (e.g. during the workshop planned for ETHDenver)

Issues reported

Some issues reported that could be caused by Status Community traffic on the default network:

  • duplicate messages received by Filter clients: slow Status nodes could experience such a large latency before relaying messages that they fall outside of the gossipsub retransmission window of neighbouring peers, which results in the same message being relayed and published to filter clients more than once.
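The mechanism described above can be illustrated with a toy duplicate-detection window. This is only a sketch, not the gossipsub or nwaku implementation: the `SeenCache` class, its `ttl_seconds` parameter and the 120 second value are assumptions chosen for illustration.

```python
class SeenCache:
    """Toy model of a relay node's duplicate-detection window (hypothetical)."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        self._first_seen: dict[str, float] = {}

    def is_new(self, message_id: str, now: float) -> bool:
        """Return True if the message would be relayed and pushed to filter clients."""
        # Forget entries that have aged out of the window.
        self._first_seen = {
            mid: seen_at
            for mid, seen_at in self._first_seen.items()
            if now - seen_at < self.ttl_seconds
        }
        if message_id in self._first_seen:
            return False  # duplicate inside the window: dropped
        self._first_seen[message_id] = now
        return True


cache = SeenCache(ttl_seconds=120.0)         # assumed window length
on_time = cache.is_new("msg-1", now=0.0)     # first copy of the message
fast_dup = cache.is_new("msg-1", now=5.0)    # timely duplicate from another peer
slow_dup = cache.is_new("msg-1", now=300.0)  # late copy from a slow peer
```

In this model the late copy is no longer recognised as a duplicate, so it would be relayed and delivered to filter clients a second time.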

Longer term solution

Vac Secure Messaging is currently working with Status app to:

  1. develop a static sharding strategy to stop using the default pubsub topic (which solves traffic and most infrastructure sharing)
  2. add a network/shard dimension to the discv5 ENR to allow differentiating discovered nodes

Interim suggestion (named shard)

Subscribe Waku fleets to one more pubsub topic for developers/operators/testers to use and remain unaffected on gossipsub level by Status traffic and vice versa. This pubsub topic could for example be /waku/2/dev-waku/proto. Fleet nodes would remain subscribed to both this dev topic and the default pubsub topic. However, Waku nodes could choose to subscribe to /waku/2/dev-waku/proto only and be separated from those (including Status Communities) on the default network.
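The separation this suggestion aims for can be sketched with a small model of topic subscriptions. The node names are hypothetical, and the default topic string is taken from the Waku v2 relay spec rather than from this thread; the dev topic is the example name proposed above.

```python
DEFAULT_TOPIC = "/waku/2/default-waku/proto"  # Waku v2 default pubsub topic
DEV_TOPIC = "/waku/2/dev-waku/proto"          # example name proposed in this issue

# Hypothetical nodes mapped to the pubsub topics they subscribe to.
subscriptions = {
    "fleet-node": {DEFAULT_TOPIC, DEV_TOPIC},  # fleets stay on both topics
    "status-client": {DEFAULT_TOPIC},
    "dev-node-a": {DEV_TOPIC},
    "dev-node-b": {DEV_TOPIC},
}


def mesh_candidates(topic: str) -> set[str]:
    """Nodes that can take part in the gossipsub mesh for `topic`."""
    return {name for name, topics in subscriptions.items() if topic in topics}


def receives(node: str, topic: str) -> bool:
    """A node only receives relayed traffic for topics it subscribes to."""
    return topic in subscriptions[node]
```

Under these assumptions, `mesh_candidates(DEV_TOPIC)` contains the fleet node and the two dev nodes but not the Status client, and `receives("dev-node-a", DEFAULT_TOPIC)` is false.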

What this does not solve:

Separating out the discv5 layer and differentiating networks/pubsub topics on discovery level. This would require the static rshard dimension which is being developed for the longer term use case. Nodes subscribed to the dev network will still discover and attempt connection to nodes in the default network - however, they will only form a mesh around other peers in the dev network.

Why not simply have a "separate fleet"?

Separate fleet does not translate to separate network. Since we don't have a discv5 mechanism yet to differentiate networks/shards on the discovery level, fleets will likely still discover and connect to each other until this is in place.

cc @alrevuelta for ETHDenver requirements

@fryorcraken fryorcraken added this to Waku Jan 30, 2023
@alrevuelta
Collaborator

Thanks for the issue!

Regarding ETHDenver requirements (which also apply to other workshops and demos), I would say we need the following:

    1. Set of at least 6 nodes supporting all protocols, which is the minimum amount we need for gossipsub.
    2. All peers should be dialable from the outside, and we should know their multiaddresses and ports.
    3. All peers should be encoded in a DNS discovery URL, so that it's easy to set them as bootstrap nodes with --dns-discovery-url.
    4. Traffic in this network should ideally be low so that it works fine at conferences with poor internet connectivity. Ofc we can't stop people from spamming here, but there is no incentive to spam in this network.
    5. These fleet nodes should ideally be stable so that we can run demos using them. Problems: i) enabling all protocols might be an issue until they are stabilized and ii) right now our test fleet restarts on every merge to master.

A quick solution for ethdenver would be to use wakuv2.test and subscribe them to a different gossipsub topic. With this, even if they end up connected to status fleet, we won't have all traffic present in the main topic, which fixes 4). Note though that since neither peer discovery nor peer management knows of topics, we can end up connected to eg 20 peers, but none of them support the topic that we are interested in, and afaik we won't know it.
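The caveat in the last sentence can be made concrete with a short sketch. The peer names and the `peers_supporting` helper are hypothetical, and the default topic string comes from the Waku v2 relay spec rather than from this thread; real peer managers do not necessarily expose subscriptions this way.

```python
DEFAULT_TOPIC = "/waku/2/default-waku/proto"
DEV_TOPIC = "/waku/2/dev-waku/proto"


def peers_supporting(topic: str, connected: dict[str, set[str]]) -> list[str]:
    """Return the connected peers that subscribe to `topic`, sorted by name."""
    return sorted(peer for peer, topics in connected.items() if topic in topics)


# 20 hypothetical peers found through topic-unaware discovery,
# all of them subscribed to the default topic only.
connected = {f"peer-{i}": {DEFAULT_TOPIC} for i in range(20)}

usable = peers_supporting(DEV_TOPIC, connected)
```

Here the node holds 20 connections, yet `usable` is empty: none of them can join its mesh for the dev topic.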

@fryorcraken
Contributor

duplicate messages received by Filter clients: slow Status nodes could experience such a large latency before relaying messages that they fall outside of the gossipsub retransmission window of neighbouring peers, which results in the same message being relayed and published to filter clients more than once.

Would we want to use #9 to avoid the dupe messages in filter? Should I just create an issue for the dupe message and we can track there? We are seeing this problem in js-waku examples.

I agree for a separate pubsub topic for development purposes.

In js-waku, we have a defaultBootstrap option to enable developers to connect to the Waku network.

This is especially useful in the context of a hackathon, workshop or PoC, so that developers can get going and have a reliable experience out-of-the-box.

We currently connect to the wakuv2.prod fleet when this option is enabled.
Similarly, if no pubsub topic is specified, js-waku uses the default pubsub topic.

With such options, developers could use the bootstrap option whether they were doing a hackathon, building a PoC or promoting software to prod.

Now, by introducing a different pubsub topic for development, it creates some friction: when they want to promote their software to prod, they'll have to specify the "prod" pubsub topic. I think that's fine. Thoughts @hackyguru ?

A quick solution for ethdenver would be to use wakuv2.test and subscribe them to a different gossipsub topic.

Related to my comment above, the js-waku default bootstrap option connects to the wakuv2.prod fleet by default.
If only the wakuv2.test fleet subscribes to the dev pubsub topic, then we'll have to change default values in js-waku.

For stability reasons, and to ensure that a developer's first experience with Waku is the best it can be, I would suggest that we subscribe the wakuv2.prod fleet to the dev topic too.

Set of at least 6 nodes supporting all protocols, which is the minimum amount we need for gossipsub.

Recommended is 6, lower bound is 4: https://rfc.vac.dev/spec/29/
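As a rough illustration of how these two numbers interact, the sketch below follows the gossipsub mesh-maintenance rule of topping the mesh back up to the target degree once it falls below the lower bound. The function name is hypothetical and only the values 6 and 4 come from the comment above.

```python
D = 6      # recommended mesh degree, per the comment above
D_LOW = 4  # lower bound, per the comment above


def peers_to_graft(mesh_size: int) -> int:
    """Peers a node would try to add to get back to D once it is below D_LOW."""
    if mesh_size < D_LOW:
        return D - mesh_size
    return 0
```

With a mesh of 3 peers the node would look for 3 more; with 4 or more it would leave the mesh alone.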

These fleet nodes should ideally be stable so that we can run demos using them

That's why I think it should be wakuv2.prod fleet, not test.

@alrevuelta
Collaborator

That's why I think it should be wakuv2.prod fleet, not test.

Sure, that's fine. And way better indeed, since wakuv2.prod does not restart on every merge to master. The only thing left for ethdenver would be to have a separate topic to avoid having lots of traffic (since that fleet is connected to status.prod right now). I understand the default gossipsub topic is hardcoded; we can discuss this in our brainstorming session.

I agree for a separate pubsub topic for development purposes.

I brought this up a few weeks ago with @jm-clius but never really gave a detailed explanation of my idea. I think in the medium term (since it shouldn't be that difficult to implement), we should have separate and segregated environments (not just pubsub topics).

A "production" (or mainnet) and a "staging" (testnet) one. Similar to what blockchains have (mainnet vs testnet). Over the longer term, when we add (dis)incentivization, these networks will perhaps relate to the existing Ethereum mainnet and a testnet (most likely Goerli), but it's too early for this, so leaving that aside for now. So I think we need:

  • A production (mainnet) network that is used by different apps, well maintained, where we deploy only stable releases. It's the default network where operators are, and where eventually there is something at stake to avoid spam (reputation or value) and some incentive (value). Both value and reputation are "real", meaning that rational agents would want to keep them. Nothing new here.
  • A testnet, where experimental features are deployed before reaching mainnet. This testnet is used by apps for development purposes as the previous step to production. Value and reputation here are "fake".

Reasons why I think we need separate networks (and not just a dev-topic)

  • If we just have a dev-topic, both networks will be sharing peers and connecting to them regardless of the network they belong to. Ofc peers in the dev topic will only gossip that topic but still, I believe we need to further separate them.
  • As @fryorcraken says, "upgrading" from dev to prod would require changing the topic. Not that bad if its just one topic, but what if multiple ones are being used?
  • When sharding comes, with dev-topics we won't be able to have a prod/test mirror. I mean, if we have 64 shards in production, we should have 64 shards in testing. Same names, same everything. With dev-topics we can't have that, unless we duplicate the number of shards and add a prefix such as shard-test-0, but things start going crazy here.
  • Messages should be treated differently and not mixed across different networks. Too early for this but if I am incentivized for relaying a message, it should be a mainnet message!
  • We may want to attack our own network from different sides, so having a totally isolated testnet for this would also be safer. We don't want an attack to end up affecting mainnet.

A non-exhaustive list of how to achieve this:

  • Add a field with network-id to the message.
  • Add a flag with network-id to each client implementation (go-waku, nwaku, js-waku)
  • Add checks, if !=network-id the message is not relayed nor stored.
  • Add network-id key to the ENR. If I discover a peer that != my network-id I skip connecting/storing it.
  • Define the network-id, 0x01 mainnet? 0x02 testnet?
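A minimal sketch of the checks in this list, assuming a network-id field on both messages and discovery records. Every name here (`Message`, `PeerRecord`, `Node`) is hypothetical, the `PeerRecord` class only stands in for an ENR, and the 0x01/0x02 values are the ones floated above rather than anything agreed.

```python
from dataclasses import dataclass

MAINNET = 0x01  # values floated in the list above, not an agreed assignment
TESTNET = 0x02


@dataclass(frozen=True)
class Message:
    network_id: int
    payload: bytes


@dataclass(frozen=True)
class PeerRecord:
    """Stand-in for an ENR carrying a hypothetical network-id key."""
    peer_id: str
    network_id: int


class Node:
    def __init__(self, network_id: int) -> None:
        self.network_id = network_id

    def should_relay(self, msg: Message) -> bool:
        """Messages from another network are neither relayed nor stored."""
        return msg.network_id == self.network_id

    def should_connect(self, record: PeerRecord) -> bool:
        """Discovered peers from another network are skipped."""
        return record.network_id == self.network_id


node = Node(TESTNET)
```

A testnet node built this way would reject `Message(MAINNET, b"hi")` and ignore a discovered `PeerRecord("peer-a", MAINNET)`, while accepting their testnet counterparts.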

What do you think? Apologies if this derails from the main topic, can move it to another thread. "Waku fleets: separate networks production/development".

@jm-clius
Author

Some responses :)

A quick solution for ethdenver would be to use wakuv2.test and subscribe them to a different gossipsub topic

While I agree with doing this on wakuv2.prod, I see no reason not to subscribe wakuv2.test to that topic as well. The demo can point people to wakuv2.prod.

Note though that since neither peer discovery nor peer management knows of topics, we can end up connected to eg 20 peers, but none of them support the topic that we are interested in, and afaik we won't know it.

Indeed! Gossipsub should take care of building the mesh as long as the total number of peers is still relatively low and we can discover and maintain connectivity to "all peers", but this will become an issue in future for multiple networks/shards.

A "production" (or mainnet) and a "staging" (testnet) one. Similar to what blockchains have (mainnet vs testnet).

I roughly agree with your reasoning here for separate networks in the medium term. What I think would be a sufficient definition for what you term a separate network is one which is:

  • separable on routing level (only traffic belonging to that network being routed, processed and services provided)
  • separable on discovery level (only nodes belonging to that network being connected to after discovery)

I think the static sharding strategy would provide us with this functionality? We could e.g. have a static shard formally allocated for a mainnet and one or more shards for testnet(s)? This would imply that network-id as you describe above would be the rshard ID. The plan is already to have this be a dimension in discovery to skip connecting/storing it. Furthermore this allows gossipsub rules to determine routing and mesh creation.

@jm-clius
Author

Linking relevant issue: vacp2p/research#174

@alrevuelta
Collaborator

While I agree with doing this on wakuv2.prod, I see no reason not to subscribe wakuv2.test to that topic as well. The demo can point people to wakuv2.prod.

Yes agree. Disregard my comment :)

I think the static sharding strategy would provide us with this functionality? We could e.g. have a static shard formally allocated for a mainnet and one or more shards for testnet(s)? This would imply that network-id as you describe above would be the rshard ID. The plan is already to have this be a dimension in discovery to skip connecting/storing it. Furthermore this allows gossipsub rules to determine routing and mesh creation.

I'm not sure it would be a good idea to have "test shards", since imho mainnet/testnet should be a mirror/copy of each other to ensure that what works in one works in the other with just a minor tweak (a change in network-id). For example, what will happen when we have automatic sharding? How will the "automatic shards" work in testnet if the testing environment is already a shard? Will we have 65535 "production shards" and then another 65535 shards in the same network but with different names? Seems odd to me, but I'm heavily biased by blockchains :P

Some related thoughts here

@alrevuelta
Collaborator

Short term solution to fix this problem with all gathered feedback. wdyt?

  • Add "/waku/2/dev-waku/proto" topic to wakuv2.prod, on top of the default one.
  • Use said topic for demos, workshops, and 3rd party integrations (eg dappnode)
  • Recommend using wakuv2.prod and said new topic to anyone prototyping with waku.

Would these action points meet the requirements of this issue?

@jm-clius
Author

jm-clius commented Feb 9, 2023

Currently blocked by: waku-org/nwaku#1545

@jakubgs

jakubgs commented Feb 14, 2023

Please don't use spaces to separate values in CLI flags.

@fryorcraken
Contributor

I am not sure this can be closed just yet. I think we need some follow-up actions:

  • Should we make js-waku examples use the dev pubsub topic?
  • Should we make chat2 on go-waku and nwaku use the dev pubsub topic?
  • Should js-waku, nwaku, go-waku use dev pubsub topic when no topic is specified?

@alrevuelta
Collaborator

Should we make js-waku examples use the dev pubsub topic?

@fryorcraken Good points. Can we make a combo box in js-waku to pick between default/dev pubsub topic to publish/subscribe? This would be useful for demos since the dev topic has almost no traffic. Can't imagine 100 people subscribing in a venue to the default topic. Initial plan was to use the dev topic for ETHDenver cc @danisharora099

Should js-waku, nwaku, go-waku use dev pubsub topic when no topic is specified?

Since the spec only talks about the default pubsub topic, I would perhaps leave it as it is. Meaning when no topic is specified, just subscribe to the default one.
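The fallback described here could look roughly like the sketch below. The `resolve_pubsub_topic` helper is hypothetical and is not an API of js-waku, nwaku or go-waku; the default topic string is taken from the Waku v2 relay spec and the dev topic is the name proposed earlier in this issue.

```python
from typing import Optional

DEFAULT_PUBSUB_TOPIC = "/waku/2/default-waku/proto"
DEV_PUBSUB_TOPIC = "/waku/2/dev-waku/proto"


def resolve_pubsub_topic(configured: Optional[str] = None) -> str:
    """Use the configured topic if there is one, else the default pubsub topic."""
    return configured if configured else DEFAULT_PUBSUB_TOPIC
```

Demos and workshops would then opt in explicitly with `resolve_pubsub_topic(DEV_PUBSUB_TOPIC)`, while code that passes nothing keeps the behaviour the spec describes.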

@fryorcraken
Contributor

@fryorcraken Good points. Can we make a combo box in js-waku to pick between default/dev pubsub topic to publish/subscribe? This would be useful for demos since the dev topic has almost no traffic. Can't imagine 100 people subscribing in a venue to the default topic. Initial plan was to use the dev topic for ETHDenver cc @danisharora099

Combo box is not really the way forward because that's not something a developer would want in their app, it just creates more confusion.

Since the spec only talks about the default pubsub topic

Good point. I think this warrants an RFC update to mention that the dev topic is recommended for development purposes.
Once the RFC is updated, I think it would be a good point in time to update the js-waku examples to use the dev topic. Possibly update chat2 too?
js-waku dev can then take over to decide how we want to update the API for that.

@fryorcraken
Contributor

Re-opening as I believe we have pending items.

@fryorcraken fryorcraken reopened this Feb 22, 2023
@fryorcraken
Contributor

From Discord:

Ok we said that we should make the dev topic the default on dappnode for nwaku.

Not much point doing that if nwaku on dappnode are the only nodes running on dev topic (with the fleets).

  • Should we then update js-waku examples to use dev topics? I think it's fine to pass the pubsub topic option and justify it ("this is dev network"). Or we can even provide a helper API
  • Do we also want to move chat2 across implementations to the dev network?
  • Finally, should we update nwaku operator docs to use dev network?

If we decide to use a dev network I think we need to do it properly across our ecosystem.

@fryorcraken
Contributor

Trying a more constructive/direct approach. Here is a proposed action plan:

  • Change js-waku examples inc web-chat to use dev pubsub topic
  • Change go-waku examples inc chat2 to use dev pubsub topic
  • Change nwaku examples inc chat2 to use dev pubsub topic
  • js-waku doc will need to include mention of dev pubsub topic
  • waku.guide concept doc to include doc of dev pubsub topic
  • nwaku operator doc to include mention of dev pubsub topic

@jm-clius
Author

The initial purpose of this change was simply to have another pubsub topic available to developers (specifically aimed at DappNode and the ETHDenver demos). While I agree that we can document its use better, I think fully switching our strategy from default -> dev becomes less important once static sharding for communities etc. is in place, which should be soon-ish. Happy to follow a more comprehensive approach if that's what we agree on.

@fryorcraken
Contributor

Right.

The issue is that usage of the dev pubsub topic has already leaked beyond ETHDenver, as it has been made the default pubsub topic in the nwaku DappNode package.

That is something I don't fully agree with, but it was pushed to avoid issues for users staking ETH.

Hence we have already started switching strategy from default to dev. Any thoughts?

@jm-clius
Author

jm-clius commented Aug 8, 2024

@fryorcraken closing this issue as not relevant given current shards approach.

@jm-clius jm-clius closed this as completed Aug 8, 2024
@jm-clius jm-clius closed this as not planned Aug 8, 2024
Projects
Status: Done
4 participants