Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add pushgateway related operations #15

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

yamingwa
Copy link

Add pushgateway related operations. User can choose to push metrics to the pushgateway endpoint provided by config file as well as to delete metrics from it.

Added attributes of config.yml
Global section:
pushgateway_addr: pushgateway address, only support : pattern.

Metric section:
pushgateway: specify if this metric needs to be pushed to pushgateway, default is false.
job_name: specify job name when pushing metric to pushgateway, default is grok_exporter.
delete_match: similar to match attribute, logs which match this pattern will trigger a delete() operation from pushgateway.
grouping_key: specify the grouping key when pushing metric to pushgateway, default is nil.

All of the added attributes are not required, thus the original config.yml still works.

@yamingwa
Copy link
Author

yamingwa commented Jun 1, 2017

@fstab The CI test for file tailer has failed because of timeout, but I did it locally and there was no such failure. Also I think this has no effect for the pushgateway functions. Please take a look at this PR, thanks in advance.

@fstab
Copy link
Owner

fstab commented Jun 1, 2017

Thanks for the PR! I am currently on holidays with very limited access to the Internet, but I will get back to it next week.

@yamingwa
Copy link
Author

yamingwa commented Jun 7, 2017

@fstab Sorry for the late reply. I appreciate your response.

@fstab
Copy link
Owner

fstab commented Jun 12, 2017

I reviewed the changes and have a couple of questions to understand the code better:

  • Could you provide a few example log lines and a config file where all pushgateway-related options are used? Having a real live example would be helpful to me.
  • What would you say is the intended usage scenario? I can imagine the pushgateway is useful if grok_exporter runs behind a NAT firewall. Is this the intended use?
  • As the pushgateway support is optional, I would like to move the code out of the core implementation into its own package. It would be easy to move the new functions in grok_exporter.go to a new package, but I am not sure what to do with the extensions to metrics.go. Maybe a few of the new fields in metrics.go are not really needed:
    • The pushgateway flag was introduced to push only specific metrics to the pushgateway. I don't see a usage scenario where it makes sense to collect metrics in grok_exporter, but not push them to the pushgateway. What do you think of removing the pushgateway flag and pushing all metrics to the gateway as soon as the pushgateway is configured?
    • The metricsVec was introduced to call m.MetricVec().DeleteLabelValues(labelValues...) in pushMetrics() in grok_exporter.go. What do you think of skipping this and keeping the labels? If the metric is a Summary or Histogram, it might be useful to collect more than one value with the same labels.
    • What is the difference between the grouping key and other labels? Can we use the existing labels for grouping instead of the grouping key?
    • I guess the delete_regex would be useful to delete metrics if some kind of shutdown message is found in a log file indicating that a service has stopped. This is an interesting idea, but maybe it is unrelated to the pushgateway and would be useful to the regular grok_ exporter as well. Is there any reason that metrics are deleted from the pushgateway but not from the regular grok_exporeter?

Thanks a lot!
Fabian

@yamingwa
Copy link
Author

@fstab Thanks for the response.

  • Here is the example log file and config file.
    log file:
    test.log
    config file:
    config.yml

  • The usage scenario for us is to monitor task waiting time cross node in large scale cluster. Sometimes a task can be created in one node and arranged to be executed in another node. The time interval between its creation time and start-execution time is an important metric to monitor. We can use time() - creation time in Prometheus to get the actual waiting time.
    Since we have to monitor every task and use a some kind "ID" label to specify it, and this will overload the storage of Prometheus in a long time. so we have to delete the metric as soon as the task is no longer in a "waiting" state. Pushgateway provides delete() API to do so.

  • For the new fields:

    • pushgateway: The push() and delete() operations provided by pushgateway are somehow HTTP requests. It may cause a huge overhead if we push all of the metrics defined in grok_exporter to pushgateway. In fact, we just want to focus some special metrics and this flag can provide an option to users to determine whether to push the metric or not.

    • m.MetricVec().DeleteLabelValues(labelValues...) is actually not deleting labels but deleting metrics with the given label-value sets. This is actually the main found for me in metrics_test.go. If we can implement some delete() function in grok_exporter, the pushgateway is no longer needed I think.

    • grouping key is used when pushing metric to pushgateway. By default, only job_name is required when doing push() operation. In package client_golang provided by Prometheus, grouping key can not be a subset of metric labels and it can be some user-defined values for separating the pushed metrics into different groups. I re-implement the push() API, and put the code in doRequest() ingrok_exporter.go. Here grouping key is also a subset of labels. The value or template it used is determined when regular match attribute has been matched.

    • delete_regex is doing exactly what you mentioned. For deleting metrics in pushgateway but not in grok_exporter, as I mentioned before, if we can implement some kind delete() function in grok_exporter, the pushgateway is no longer needed.

I am not so good at Go and I apologize for making the code difficult to understand. Hopefully the comment above can explain the code clearly.
Thanks.

@fstab
Copy link
Owner

fstab commented Jun 19, 2017

Thanks a lot for your response. I understand your usage scenario much better now.

I really like the idea to be able to delete metrics. This should be a general feature in grok_exporter, not only for the pushgateway. As a general feature, it might be more straightforward to rename the grouping_key to something like delete_labels, because the term "grouping" does not occur outside of the pushgatwasy world:

delete_match: 'DELETE %{USERNAME:user}'
delete_labels:
    user: '{{.user}}'

As you said, there is no simple delete() function in the client_golang for deleting all metrics matching the delete_labels. However, it should be straightforward to implement this. There should be a relatively small amount of different labels. In the worst case, we could simply keep a map of all label values and call DeleteLabelValues() individually for each value.

I think the best way to proceed is to implement the generic delete functionality first, and then migrate your pushgateway code to the new version of grok_exporter. I will try to provide a delete implementation within the next few days.

Thanks four your contribution, it is always good to see different usage scenarios and get some fresh ideas. I'll get back to this issue when the delete functionality is done.

@yamingwa
Copy link
Author

Glad to know this idea is accepted by the author @fstab . I really appreciate your response. Thanks a lot!.

@fstab
Copy link
Owner

fstab commented Jun 29, 2017

I pushed a delete_match and delete_labels implementation. The new function metric.ProcessDelete(line) returns a map with the matching delete_labels and their values. I hope the return value of metric.ProcessDelete(line) is useful for calling delete on the pushgateway. Let me know if this makes sense for you.

@yamingwa
Copy link
Author

@fstab Thanks, I'll check it out and give you a response later.

@yamingwa
Copy link
Author

@fstab Sorry for late response, I was engaged in some other projects in our company in the last a few weeks, I will try to use this new functions in my pushgateway branch and give you a detailed reply in the next weeks. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants