Checklist
I have included information about relevant versions.
I have verified that the issue persists when using the master branch of Faust.
Steps to reproduce
Hello everyone, I hope this finds you well!
I'm facing an odd situation when using Faust Streaming in my consumer app. I have a Kafka consumer that connects to my Kafka instance hosted on GCP in our dev environment. However, when the Kafka instance restarts or goes down due to lack of resources, my consumer gets stuck in a loop while trying to rebalance, logging the following errors:
[2023-12-20 10:23:47,912] [11] [INFO] Discovered coordinator 2 for group myapp-dev-processor
[2023-12-20 10:23:47,912] [11] [INFO] (Re-)joining group myapp-dev-processor
[2023-12-20 10:23:47,915] [11] [WARNING] Marking the coordinator dead (node 2) for group myapp-dev-processor.
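For context, here is a minimal sketch of how our consumer app is set up (the broker address and topic name below are placeholders, not our real configuration):

```python
import faust

# Minimal sketch of our setup; broker address and topic name are placeholders.
app = faust.App(
    "myapp-dev-processor",                     # the consumer group id seen in the logs above
    broker="kafka://kafka-dev.internal:9092",  # our GCP-hosted Kafka instance (placeholder address)
    store="memory://",
)

source_topic = app.topic("events-dev")  # the topic we actually consume data from (placeholder name)

@app.agent(source_topic)
async def process(stream):
    async for event in stream:
        # business logic omitted
        ...

if __name__ == "__main__":
    app.main()
```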
This is happening frequently for us, but only in our dev environment. We are investigating the root cause and how to tackle it, so that if it ever occurs in prod we have a way to act fast. We know the consumer connects to our Kafka instance successfully, but then this error appears and the app stays stuck in an endless loop. We searched for error logs on our Kafka instances but found nothing, so we suspect the problem may be within the library somehow.
I can also say that the only fix we have found so far is to re-deploy both the Kafka instance on GCP and our consumer project, which is something we can't do in production if this situation occurs. I don't have detailed knowledge of the project internals, but it seems to be a problem related to the topic the app creates to handle task distribution, because the topic name involved is not the one we consume data from but the one created by the app.
Does anyone have any idea why this may be happening? We searched the project repo, ChatGPT, and Stack Overflow, but without any luck.
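For reference, these are the Faust broker timeout settings that seem most relevant to coordinator/rebalance behaviour; the sketch below uses illustrative values only (we have not verified that tuning them helps):

```python
import faust

# Illustrative values only; not a confirmed fix for the rebalance loop.
app = faust.App(
    "myapp-dev-processor",
    broker="kafka://kafka-dev.internal:9092",  # placeholder broker address
    broker_session_timeout=120.0,    # default is 60 seconds
    broker_heartbeat_interval=5.0,   # default is 3 seconds
    broker_request_timeout=180.0,    # kept larger than the session timeout
)
```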
Expected behavior
Consumer should rebalance partitions normally.
Actual behavior
Full traceback
Versions
faust-aioeventlet==0.6
faust-streaming==0.10.14
confluent-kafka==2.1.1
Python 3.12