Rebalancing problem with Faust Streaming consumer #594

Open
arcanjo45 opened this issue Dec 20, 2023 · 0 comments
Checklist

  • I have included information about relevant versions
  • I have verified that the issue persists when using the master branch of Faust.

Steps to reproduce

Hello everyone, I hope this finds you well!

I'm facing an odd situation when using Faust Streaming in my consumer app. I have a Kafka consumer that connects to my Kafka instance on GCP in my dev environment. However, whenever my Kafka instance restarts or goes down due to lack of resources, the consumer gets stuck in a loop while trying to rebalance, logging the following errors:

[2023-12-20 10:23:47,912] [11] [INFO] Discovered coordinator 2 for group myapp-dev-processor 
[2023-12-20 10:23:47,912] [11] [INFO] (Re-)joining group myapp-dev-processor 
[2023-12-20 10:23:47,915] [11] [WARNING] Marking the coordinator dead (node 2)for group myapp-dev-processor. 

This is happening frequently for us, but only in our dev environment. We are investigating the root cause and how to tackle it, so that if it ever occurs in prod we have a way to act fast. We know the consumer connects to our Kafka instance successfully, but then this error appears and it stays stuck in an endless loop. We searched for error logs on our Kafka instances and found nothing, so we suspect the problem may somehow be within the library.

I can also say that the only fix we have found so far is to re-deploy both the Kafka instance on GCP and our consumer project, which is something we can't do in production if this situation occurs. I don't have detailed knowledge of the project internals, but it seems to be a problem related to the topic the app creates to handle task distribution, because the topic name appearing in the logs is not the one we consume data from but one created by the app itself.
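For context, here is a minimal sketch of how the consumer app is wired up. This is a sketch only: the broker address, topic name, timeout values and processing logic are placeholders, not our real configuration; only the app id matches the group id shown in the logs above.

```python
import faust

# Minimal sketch of the consumer setup (illustrative only: broker address,
# topic name and timeout values are placeholders, not our real config).
app = faust.App(
    "myapp-dev-processor",            # matches the group id in the logs above
    broker="kafka://<gcp-broker-host>:9092",   # placeholder broker address
    # Documented Faust settings that control the heartbeat/coordinator
    # behaviour involved in the rebalance loop (example values only):
    broker_session_timeout=60.0,      # seconds before the coordinator drops the member
    broker_heartbeat_interval=3.0,    # seconds between heartbeats to the coordinator
    broker_request_timeout=90.0,      # should stay above broker_session_timeout
)

source_topic = app.topic("my-source-topic")   # placeholder source topic


@app.agent(source_topic)
async def process(stream):
    async for event in stream:
        ...  # actual processing omitted


if __name__ == "__main__":
    app.main()
```

The broker_session_timeout / broker_heartbeat_interval / broker_request_timeout settings are the knobs that govern the coordinator-dead / re-join cycle seen in the logs, in case that helps narrow things down.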

Does anyone have any idea why this might be happening? We searched the project repo, ChatGPT, and Stack Overflow, but without any luck.

Expected behavior

The consumer should rebalance partitions normally.

Actual behavior

Full traceback

Versions

faust-aioeventlet==0.6
faust-streaming==0.10.14
confluent-kafka==2.1.1
Python 3.12
