Threads config for Pulsar Consumers #22375

danielnesaraj · 2024-03-28T08:50:02Z

Hi all!
I am looking into the thread config of the Pulsar client. I found this: https://stackoverflow.com/questions/56954771/apache-pulsar-iothreads-listenerthreads-and-message-ordering
Please confirm whether I am right here. Let's say I use the following code and call it three times to create 3 consumers in a Kubernetes pod (each on a different topic).

        pulsarClient
            .newConsumer(schema)
            .topic(topic)
            .subscriptionName(subscription)
            .subscriptionType(SubscriptionType.Shared)
            .messageListener(messageListener)
            .subscribe()

and my client is created like so:

        PulsarClient.builder().serviceUrl(pulsarClientUrl).build()

Then there is only 1 thread in which all 3 consumers are going to consume messages from their respective topics, right?
Assuming I have the requisite CPU cores allocated, would I then see performance improvement in doing the following:

        PulsarClient.builder().serviceUrl(pulsarClientUrl).listenerThreads(3).build()

PS: I know there is also the option of calling the receive() method in a thread to consume messages. I would like to keep that as a last resort. If I can improve performance by tuning the number of threads available to the message listener, that would be preferable.

lhotari (Collaborator) · 2024-03-28T10:01:41Z

> I am looking into the thread config of the Pulsar client. I found this: https://stackoverflow.com/questions/56954771/apache-pulsar-iothreads-listenerthreads-and-message-ordering

Yes, that answer still applies to the Pulsar client.

> Then there is only 1 thread in which all 3 consumers are going to consume messages from their respective topics, right?
> Assuming I have the requisite CPU cores allocated, would I then see performance improvement in doing the following:

Yes, this makes sense. It's worth validating your assumption by testing.
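
As a rough illustration of why the extra listener threads can help, here is a plain-Java model. It assumes each consumer's listener is pinned to one thread taken from the listener pool. It is a stand-in, not Pulsar's actual implementation, and `ListenerThreadModel` and `threadsUsed` are made-up names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Stand-alone model, NOT Pulsar code: every "consumer" is pinned to one
// single-threaded executor picked round-robin from the listener pool.
final class ListenerThreadModel {

    /** Returns the names of the threads that ran listener callbacks. */
    static Set<String> threadsUsed(int listenerThreads, int consumers, int messagesPerConsumer) {
        List<ExecutorService> pool = new ArrayList<>();
        for (int i = 0; i < listenerThreads; i++) {
            pool.add(Executors.newSingleThreadExecutor());
        }
        Set<String> seen = ConcurrentHashMap.newKeySet();
        for (int c = 0; c < consumers; c++) {
            // Assumption: one consumer always uses the same listener thread.
            ExecutorService pinned = pool.get(c % listenerThreads);
            for (int m = 0; m < messagesPerConsumer; m++) {
                pinned.execute(() -> seen.add(Thread.currentThread().getName()));
            }
        }
        for (ExecutorService e : pool) {
            e.shutdown();
            try {
                e.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println("listenerThreads=1: " + threadsUsed(1, 3, 100).size() + " thread(s) used");
        System.out.println("listenerThreads=3: " + threadsUsed(3, 3, 100).size() + " thread(s) used");
    }
}
```

The same check can be done against a real client by logging `Thread.currentThread().getName()` inside the message listener and comparing the output with and without `listenerThreads(3)`.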

Thanks for sharing a good question where you already provided a helpful response too.

Regarding performance, adding threads isn't the only consideration. The diagram in Wikipedia's HTTP pipelining article illustrates the issue with one-by-one handling of messages: each message waits for the previous one to finish, so per-message latency caps the throughput.

For many use cases, it's possible to significantly reduce the number of consumers and increase throughput by using pipelining. In Pulsar, this requires using the async API. Pulsar Functions support pipelining too when the return type is a CompletableFuture, but the feature isn't well documented.
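
To make the difference concrete, here is a small stand-alone sketch in plain Java. It does not use the Pulsar API: `remoteCall` is a hypothetical non-blocking operation with a fixed latency, and `PipeliningSketch` and its method names are invented for the example. One-by-one handling waits for each result before starting the next call, while the pipelined version keeps all calls in flight and waits once at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class PipeliningSketch {

    // Hypothetical non-blocking call: completes with msg * 2 after latencyMs.
    static CompletableFuture<Integer> remoteCall(ScheduledExecutorService timer, int msg, long latencyMs) {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        timer.schedule(() -> { result.complete(msg * 2); }, latencyMs, TimeUnit.MILLISECONDS);
        return result;
    }

    // One-by-one: the next call starts only after the previous one finished.
    static long oneByOneMillis(int messages, long latencyMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        long start = System.nanoTime();
        for (int i = 0; i < messages; i++) {
            remoteCall(timer, i, latencyMs).join();
        }
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        timer.shutdown();
        return elapsed;
    }

    // Pipelined: start every call first, then wait for all of them together.
    static long pipelinedMillis(int messages, long latencyMs) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        long start = System.nanoTime();
        List<CompletableFuture<Integer>> inFlight = new ArrayList<>();
        for (int i = 0; i < messages; i++) {
            inFlight.add(remoteCall(timer, i, latencyMs));
        }
        CompletableFuture.allOf(inFlight.toArray(new CompletableFuture[0])).join();
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        timer.shutdown();
        return elapsed;
    }

    public static void main(String[] args) {
        System.out.println("one-by-one: " + oneByOneMillis(10, 50) + " ms");
        System.out.println("pipelined:  " + pipelinedMillis(10, 50) + " ms");
    }
}
```

In a real consumer the futures would come from the async API instead of a timer, and each message would be acknowledged when its future completes. A real pipeline would also bound the number of in-flight messages, which this sketch leaves out.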

For key-ordered processing with pipelining, the Pulsar Reactive Client provides a solution based on Project Reactor's groupBy operator. This is one of the sweet spots of the Reactive Client, but there's not much documentation about it.
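
The idea behind key-ordered processing can be sketched without Reactor. The example below is a plain-Java stand-in, not the Reactive Client's implementation and not the groupBy operator: it routes each message key to one of a fixed number of single-threaded lanes, so messages with the same key are handled in order while different keys proceed concurrently. `KeyOrderedSketch`, `process`, and the lane scheme are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class KeyOrderedSketch {

    /**
     * Sends perKey messages (sequence numbers 0..perKey-1) for every key and
     * returns, per key, the sequence numbers in the order they were handled.
     */
    static Map<String, List<Integer>> process(List<String> keys, int perKey, int concurrency) {
        List<ExecutorService> lanes = new ArrayList<>();
        for (int i = 0; i < concurrency; i++) {
            lanes.add(Executors.newSingleThreadExecutor());
        }
        Map<String, List<Integer>> handled = new ConcurrentHashMap<>();
        for (int seq = 0; seq < perKey; seq++) {
            for (String key : keys) {
                final int s = seq;
                // The same key always maps to the same lane, which keeps its order.
                ExecutorService lane = lanes.get(Math.floorMod(key.hashCode(), concurrency));
                lane.execute(() -> handled
                        .computeIfAbsent(key, k -> Collections.synchronizedList(new ArrayList<>()))
                        .add(s));
            }
        }
        for (ExecutorService lane : lanes) {
            lane.shutdown();
            try {
                lane.awaitTermination(10, TimeUnit.SECONDS);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }
        return handled;
    }

    public static void main(String[] args) {
        System.out.println(process(List.of("a", "b", "c", "d"), 5, 2));
    }
}
```

The number of lanes plays the role of the concurrency level: raising it lets more keys progress in parallel without giving up per-key ordering.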

I gave a conference presentation about the initial ideas at SpringOne 2021. The presentation is slightly outdated, since Spring Pulsar was released after that. However, the code examples, including the ones for pipelining, have since been updated to use Spring Pulsar and the Pulsar Reactive client.

For pipelining, the Pulsar Reactive client has built-in support in ReactiveMessagePipeline (see the implementation and a single test case, which isn't a good usage example). However, the documentation isn't great for this feature, which unlocks key-ordered processing with a configurable concurrency/parallelism level.
The javadoc contains some limited docs: https://github.com/apache/pulsar-client-reactive/blob/b8b42df975c4bb2cd275936402d6439087695554/pulsar-client-reactive-api/src/main/java/org/apache/pulsar/reactive/client/api/ReactiveMessagePipelineBuilder.java#L122-L142
Since Project Reactor has great support for retries, it's very simple to implement reliable integration pipelines that leverage these features. I hope to find time to give an updated conference talk, with updated docs and examples, that explains all of this.

danielnesaraj (Author) · Mar 28, 2024

Thank you for the comprehensive response @lhotari ! This helps.
