Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add imporovement #238

Open
wants to merge 4 commits into
base: master
Choose a base branch
from
Open

add imporovement #238

wants to merge 4 commits into from

Conversation

panszobe
Copy link

Main Improvements

  • Adjust the order of fetching messages , get consumer group offsets firstly before getting topics and partitions messages, and then calculate, to solve consumer group lag negative issue.
  • Add concurrency control, and share the same result to all the demands on a fetching period, which would have an appreciable improvement of performance, especially when there is a broker down.

offset := make(map[string]map[int32]int64)
groupOffset := make(map[int32]map[string]map[string]map[int32]int64)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This map becomes more complicated, please add comments to describe the meaning of each key.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added.

@danielqsj
Copy link
Owner

@panszobe thanks for the big PRs. Generally looks good to me.
But due to the complex of PR, please have a check and give some test results.

@danielqsj
Copy link
Owner

@panszobe is this PR based on #228 ?

@danielqsj danielqsj mentioned this pull request Jun 28, 2021
@panszobe
Copy link
Author

@panszobe thanks for the big PRs. Generally looks good to me.
But due to the complex of PR, please have a check and give some test results.

Yes, we met two main problems on our production environment a few weeks ago:

  1. users reflected that consumer group lag was negative unexpectedly
  2. when one broker down, fetching metrics will take long time

Therefore, we wanted to find solutions from PRs and issue records, and found #213,but we just pick up something which fit our demands.
On the basis of concurrency control, we imported adjusting order of fetching metrics.
During the fixing, we found that code compile is in trouble, because github.com/prometheus/common with version 0.29.0 has no log module, today i see is altered to 0.25.0 back. We used old version 0.25.0 too at that time.
Through gray updating, we push it all on production ENV.

After doing that, we gained something good as below:

  1. when one broker down, time spent is decreased from 30s to 20s, about 1/3 improvement
  2. there is no negative value of metric about consumer group lag

@danielqsj
Copy link
Owner

@panszobe I have cherry-picked from #228, and released v1.4.1. Welcome to rebase the current master branch.

@Eriln
Copy link

Eriln commented Jan 18, 2022

Hi @panszobe, we are interested in your modifications. will you have time to rebase ?

@Eriln Eriln mentioned this pull request Jan 31, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants