Sync eventhub bufferedproducer does not respect max_wait_time with threads<partitions #38961

Open
epa095 opened this issue Dec 20, 2024 · 3 comments
Assignees
Labels: Client, customer-reported, Event Hubs, feature-request, Messaging, needs-team-attention, question

Comments

@epa095

epa095 commented Dec 20, 2024

  • Package Name: azure-eventhub
  • Package Version: 5.13.0
  • Python Version: 3.11

Describe the bug
We use EventHubProducerClient (the sync version, not the async one) with buffered_mode=True. On our 32-partition Event Hub, we noticed that messages for partitions 8-31 always arrived in batches of 1500, much later than max_wait_time (sometimes days later). Events in the earlier partitions arrived as expected.

We happen to be running this on a 4-core azure container app.

We noticed that if we set buffer_concurrency to 32 then max_wait_time seems respected for all partitions.
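
A minimal sketch of that workaround, for reference. The keyword names (buffered_mode, max_wait_time, buffer_concurrency, on_success, on_error) are the ones I understand azure-eventhub 5.x to accept; the helper names, the callback bodies, and the 5-second max_wait_time are illustrative assumptions, not taken from the SDK or from any real deployment.

```python
def on_success(events, partition_id):
    """Called by the client after a buffered batch for `partition_id` was sent."""
    print(f"sent {len(events)} events to partition {partition_id}")


def on_error(events, partition_id, error):
    """Called by the client when a buffered batch for `partition_id` failed."""
    print(f"failed to send {len(events)} events to partition {partition_id}: {error!r}")


def buffered_producer_kwargs(partition_count, max_wait_time=5.0):
    """Keyword arguments for a buffered producer with one worker per partition.

    Hypothetical helper: the name and the default max_wait_time are placeholders.
    """
    return {
        "buffered_mode": True,
        "max_wait_time": max_wait_time,         # seconds; placeholder value
        "buffer_concurrency": partition_count,  # one worker thread per partition
        "on_success": on_success,
        "on_error": on_error,
    }


def make_producer(conn_str, eventhub_name, partition_count=32):
    """Build a sync buffered producer sized for `partition_count` partitions."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from azure.eventhub import EventHubProducerClient

    return EventHubProducerClient.from_connection_string(
        conn_str,
        eventhub_name=eventhub_name,
        **buffered_producer_kwargs(partition_count),
    )
```

Instead of hard-coding 32, the partition count could presumably be read from the hub itself (for example via the client's get_partition_ids()) before the buffered producer is created.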

I think the problem is that check_max_wait_time_worker runs as an infinite loop and is submitted to a shared ThreadPoolExecutor. (In the async version the sleep is awaited, which is an important difference: the event loop can run other tasks while one of them sleeps.) If buffer_concurrency is not set, the default ThreadPoolExecutor is created, which on our 4-core machine has min(32, os.cpu_count() + 4) = 8 threads. That is also the maximum number of tasks the executor can run concurrently (even if a task only sleeps), so check_max_wait_time_worker for the higher-numbered partitions is never picked up by the executor. This matches what we see: only the first 8 partitions (0-7) respect max_wait_time.
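
The suspected mechanism can be reproduced with the standard library alone. This is a sketch under stated assumptions: wait_time_worker is only a stand-in for check_max_wait_time_worker (it is not the SDK's code), and default_max_workers mirrors the documented ThreadPoolExecutor default for Python 3.8 through 3.12.

```python
import os
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def default_max_workers(cpu_count=None):
    """Default ThreadPoolExecutor pool size in Python 3.8 to 3.12."""
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return min(32, cpus + 4)


def count_started_workers(partitions, max_workers, settle_seconds=0.5):
    """Submit one long-running worker per partition and report which ones start."""
    started = set()
    lock = threading.Lock()
    stop = threading.Event()

    def wait_time_worker(partition_id):
        # Stand-in for check_max_wait_time_worker: it loops until told to stop,
        # so it occupies its pool thread for as long as the client is alive.
        with lock:
            started.add(partition_id)
        while not stop.is_set():
            time.sleep(0.01)

    executor = ThreadPoolExecutor(max_workers=max_workers)
    for partition_id in range(partitions):
        executor.submit(wait_time_worker, partition_id)

    time.sleep(settle_seconds)     # give every runnable worker time to start
    with lock:
        running = sorted(started)  # snapshot taken before the pool is released
    stop.set()                     # let running and queued workers exit
    executor.shutdown(wait=True)
    return running
```

With these helpers, count_started_workers(32, default_max_workers(4)) should report only partitions 0 to 7 as started, while count_started_workers(32, 32) should report all 32. The sleeping workers never hand their thread back, so the queued ones wait forever.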

github-actions bot added the Client, customer-reported, Event Hubs, needs-team-attention, and question labels Dec 20, 2024

Thank you for your feedback. Tagging and routing to the team member best able to assist.

kashifkhan added the Messaging label Dec 20, 2024
@kashifkhan
Member

@epa095 thank you for the feedback. I agree that our docstring and troubleshooting guide should highlight the relationship of buffer concurrency and the number of EventHub partitions. Our general recommendation is that there should be a worker per partition.

@epa095
Author

epa095 commented Dec 20, 2024

> @epa095 thank you for the feedback. I agree that our docstring and troubleshooting guide should highlight the relationship of buffer concurrency and the number of EventHub partitions. Our general recommendation is that there should be a worker per partition.

Then I propose that the client by default uses a ThreadPoolExecutor with one worker per partition. It knows the number of partitions, so it can do the right thing by default.
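
Roughly what such a default could look like, as a sketch only. resolve_buffer_concurrency and all_workers_can_run are hypothetical names that do not exist in azure-eventhub; the assumption is that the client already has the list of partition IDs when it creates its executor.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def resolve_buffer_concurrency(buffer_concurrency, partition_ids):
    """Hypothetical default: one worker per partition unless the caller chose otherwise."""
    if isinstance(buffer_concurrency, ThreadPoolExecutor):
        return buffer_concurrency  # caller supplied their own pool
    if buffer_concurrency is not None:
        workers = buffer_concurrency  # caller supplied an explicit size
    else:
        workers = len(partition_ids)  # proposed default
    return ThreadPoolExecutor(max_workers=workers)


def all_workers_can_run(executor, partition_ids, timeout=2.0):
    """True if one long-lived task per partition can run at the same time."""
    barrier = threading.Barrier(len(partition_ids))

    def worker(_partition_id):
        try:
            # Passes only if every partition's task is running concurrently.
            barrier.wait(timeout=timeout)
            return True
        except threading.BrokenBarrierError:
            return False

    return all(executor.map(worker, partition_ids))
```

The barrier check is just a way to verify the sizing: with the proposed default every per-partition task gets its own thread, whereas an 8-thread pool for 32 partitions times out.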

kashifkhan added the feature-request label Dec 20, 2024