Pub/Sub cluster performance issue #2343

Open
giridharkannan opened this issue Mar 1, 2023 · 6 comments
Labels: for: team-attention (An issue we need to discuss as a team to make progress), type: feature (A new feature)
Milestone: Backlog

Comments

@giridharkannan

Feature Request

Instead of subscribing to a channel on an arbitrary node in the cluster, the client should subscribe on the node that owns the channel's hash slot.

Is your feature request related to a problem? Please describe

There is a 30 to 40 millisecond delay when the Redis server broadcasts a published message across cluster nodes. Because of this, subscribing to a channel on a node other than the one that owns its slot ends up increasing latency roughly 20x.

I have built a lightweight RPC server on top of Redis. Because of this issue, with random-node subscriptions I can process 63 req/sec, while with targeted subscriptions I can process 1250 req/sec.

Describe the solution you'd like

A subscribe request should determine the node that the channel's key would map to and subscribe on that node.
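
For illustration, here is a minimal sketch of the slot-to-node lookup that the client would need to perform automatically. The URI and channel name are placeholders, and this snippet only resolves the owning node; it does not subscribe.

import io.lettuce.core.RedisURI;
import io.lettuce.core.cluster.RedisClusterClient;
import io.lettuce.core.cluster.SlotHash;
import io.lettuce.core.cluster.api.StatefulRedisClusterConnection;
import io.lettuce.core.cluster.models.partitions.RedisClusterNode;

public class SlotLookupExample {

    public static void main(String[] args) {
        // Placeholder URI; point it at any node of the cluster.
        RedisClusterClient client = RedisClusterClient.create(RedisURI.create("redis://localhost:7000"));
        StatefulRedisClusterConnection<String, String> connection = client.connect();

        String channel = "my-channel";
        // Hash the channel name the same way a key is hashed: CRC16(channel) mod 16384.
        int slot = SlotHash.getSlot(channel);
        // Resolve the cluster node that currently owns this slot from the cached topology.
        RedisClusterNode owner = connection.getPartitions().getPartitionBySlot(slot);
        System.out.println(channel + " -> slot " + slot + " -> node " + owner.getNodeId());

        connection.close();
        client.shutdown();
    }
}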

@egoissst

egoissst commented Mar 10, 2023

Instead of subscribing to a channel on an arbitrary node in the cluster
How are you subscribing to a random node?

Are you using
redisClient.connectPubSub().sync().subscribe(channelName) ?
The default lettuce API only subscribes on the default connection (and hence, a single node).

And if you use something like redisClient.connectPubSub().getConnection(<>).sync().subscribe(), the reconnection behaviour (upon failover, for example) doesn't work as expected. The lettuce client does not recreate the subscriptions on a new Redis shard automatically; it just keeps trying to reconnect to the old shard. The same applies to the nodeSelection APIs.

Just had this question, so asking. I was trying to distribute subscribe calls across different nodes but couldn't figure out how to do that.

@giridharkannan
Author

I am using StatefulRedisClusterPubSubConnection.sync().subscribe(). It's a cluster setup.

@egoissst

Yeah, I am using the same StatefulRedisClusterPubSubConnection.sync().subscribe() call.
But for multiple calls of this method with different channel names, don't the subscriptions all end up on a single Redis node, the one the connection initially connects to?
In my case, I want to at least distribute the subscription traffic across different Redis nodes so that load is spread, but I haven't been able to figure out how. If you know how to properly distribute subscriptions across different Redis nodes, please let me know.

As for your original question in this issue, there is already an open issue: #1984. I would also love to have this sharded pub/sub functionality for my use case. However, there hasn't been any update on it for more than a year, so I'm not sure when it will land.

@giridharkannan
Author

Distributing load will not solve the problem. Suppose we register 9 channels in a cluster of 3 nodes, all of the channels hash to one node (per the slot mapping), and we distribute the subscriptions equally: 6 of the channels will then see a latency spike of 30 to 40 ms on every message.

Instead, I want the client to subscribe on the node that the channel's slot belongs to.

Right now I am using
StatefulRedisClusterConnection.getPartitions().getPartitionBySlot(SlotHash.getSlot(channel)).getNodeId() to get the respective nodeId. From the nodeId I get a StatefulRedisConnection via StatefulRedisClusterConnection and use that for subscribing.
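
For reference, a self-contained sketch of this workaround; the URI and channel name are placeholders.

import io.lettuce.core.RedisURI;
import io.lettuce.core.cluster.RedisClusterClient;
import io.lettuce.core.cluster.SlotHash;
import io.lettuce.core.cluster.pubsub.StatefulRedisClusterPubSubConnection;
import io.lettuce.core.pubsub.StatefulRedisPubSubConnection;

public class SlotAwareSubscribe {

    public static void main(String[] args) {
        // Placeholder URI; any cluster node works as a seed.
        RedisClusterClient client = RedisClusterClient.create(RedisURI.create("redis://localhost:7000"));
        StatefulRedisClusterPubSubConnection<String, String> pubSub = client.connectPubSub();

        String channel = "my-channel";
        // Resolve the node that owns the channel's hash slot ...
        String nodeId = pubSub.getPartitions()
                .getPartitionBySlot(SlotHash.getSlot(channel))
                .getNodeId();
        // ... and subscribe on that node's connection instead of the default one.
        StatefulRedisPubSubConnection<String, String> nodeConnection = pubSub.getConnection(nodeId);
        nodeConnection.sync().subscribe(channel);
        // Connection left open so the subscription stays active.
    }
}

The trade-off is that the subscription is bound to that specific node connection rather than the default cluster pub/sub connection.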

The issue you pointed out is about a new command (sharded pub/sub, SSUBSCRIBE) that solves the problem in a better way at the Redis layer, but it is available only from Redis 7.0. I would still like a better solution for the existing SUBSCRIBE command.

@egoissst

egoissst commented Mar 13, 2023

Thanks for your response.

Right now I am using
StatefulRedisClusterConnection.getPartitions().getPartitionBySlot(SlotHash.getSlot(channel)).getNodeId() to get the respective nodeId. From the nodeId I get a StatefulRedisConnection via StatefulRedisClusterConnection and use that for subscribing.

I was also trying to use this to create node-specific subscriptions, but I am facing resiliency issues with this approach.
Lettuce does not recreate the subscriptions for the node on failover (in the case of ElastiCache) or when the Redis node is killed directly (in the case of a local cluster).

With the code below, the subscriptions are recreated by lettuce immediately upon detecting a connection issue:

RedisClusterClient redisClient = RedisClusterClient.create(redisURI);
StatefulRedisClusterPubSubConnection<String, String> subscribeConnection = redisClient.connectPubSub();
subscribeConnection.sync().subscribe("channel1");

But if we use the code below (node-specific subscriptions), then on failover or killing a Redis node, lettuce does not recreate the subscriptions on another Redis node. The subscriptions are just lost.

String nodeId = subscribeConnection.getPartitions().getPartitionBySlot(SlotHash.getSlot("channelName")).getNodeId();
subscribeConnection.getConnection(nodeId).sync().subscribe("channel2");

I am also facing the same issue when using a node-selection predicate:

subscribeConnection.sync().nodes(
	node -> node.getNodeId().equals(nodeId)).commands()
	.subscribe(channelName);

I am not sure how to retain lettuce's connection resiliency while also using node-specific subscriptions. Would you be able to help with this?

@giridharkannan
Author

I tried testing it locally; it seems lettuce reconnects and resubscribes to the node once the node comes back.

@tishun tishun added this to the Backlog milestone Jul 17, 2024
@tishun tishun added type: feature A new feature for: team-attention An issue we need to discuss as a team to make progress labels Jul 17, 2024