Kafka module can fail to collect partition offsets for topics with many partitions #13380

Closed
jsoriano opened this issue Aug 28, 2019 · 7 comments
Labels
bug Metricbeat Metricbeat module Stalled Team:Integrations Label for the Integrations team

Comments

@jsoriano
Member

jsoriano commented Aug 28, 2019

In some Kafka clusters that have topics with many partitions, collecting the offsets can fail, which causes many errors that can flood the logs.

This is possibly caused by the way Metricbeat collects the partition offsets. It first gets the metadata and then gets the offsets partition by partition. Under some circumstances, with many partitions, errors like these occur:

  • kafka server: Tried to send a message to a replica that is not the leader for some partition. Your metadata is out of date.
  • broker is not leader

There are two possible solutions for these errors, and it may be good to implement both:

  • Offset requests can request several partition offsets at once. Try to request all the offsets in a single request, or at least split them into fewer requests. One request per broker may be needed, because the offsets for each partition must be requested from its leader.
  • Handle the error about out-of-date metadata, and request the metadata again before continuing to request partition offsets.
@botelastic

botelastic bot commented Oct 22, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic botelastic bot added the Stalled label Oct 22, 2020
@jsoriano jsoriano removed the Stalled label Oct 23, 2020
@alinazemian

Is there any workaround for this issue?

@jsoriano
Member Author

> Is there any workaround for this issue?

@MRaliagha I am afraid not at the moment 🙁 Are you being hit by this issue? Do you see errors about outdated metadata?

@alinazemian

alinazemian commented Feb 4, 2021

> @MRaliagha I am afraid not at the moment 🙁 Are you being hit by this issue? Do you see errors about outdated metadata?

Yes, we are facing this issue with 7.10.1.

@botelastic

botelastic bot commented Jan 27, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@botelastic botelastic bot added the Stalled label Jan 27, 2022
@jsoriano jsoriano removed the Stalled label Jan 27, 2022
@alinazemian

I'm still having this issue with version 7.16.1.

@botelastic

botelastic bot commented Jan 30, 2023

Hi!
We just realized that we haven't looked into this issue in a while. We're sorry!

We're labeling this issue as Stale to make it hit our filters and make sure we get back to it as soon as possible. In the meantime, it'd be extremely helpful if you could take a look at it as well and confirm its relevance. A simple comment with a nice emoji will be enough :+1:.
Thank you for your contribution!

@botelastic botelastic bot added the Stalled label Jan 30, 2023
@botelastic botelastic bot closed this as completed Jul 29, 2023
@zube zube bot removed the [zube]: Done label Oct 27, 2023