-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Zypper updates and patches script metrics #222
base: master
Are you sure you want to change the base?
Conversation
35d5e57
to
4fdd60f
Compare
I notice that several of the metrics are specified as counters, but they will presumably fluctuate up and down as the number of available updates ebbs and flows. As such, they should be designated as Whilst this repo accepts both shell scripts and Python scripts, once a shell script starts to include non-trivial amounts of awk, I think my personal feeling would be that it's time to write it in Python - where you can leverage the official Prometheus Python client library. |
Ok, I'll create the python script so we can compare what are the most complex to maintain or read 👍 |
c95cf87
to
2c54027
Compare
Hi @dswarbrick I've transcript the basch script into python script. Let us know what do you think about it! |
@Wabri I won't take a serious look at this PR until the lint CI job is green. However, you seem to have overlooked my comment where I said that Python scripts can utilize the official Prometheus Python client library. This was more than just a suggestion - it's a requirement for Python scripts in this repo. It avoids the need for people to (re)invent their own text exposition format functions, which are often incomplete or buggy. |
5ef22b9
to
4eb861d
Compare
@dswarbrick let me know if there is anything to change in the code using the prometheus python client library. |
@Wabri There seems to be some miscommunication here, because your Python script does not make use of the official Prometheus Python client library at all. Let me be clear. Python code such as this will not be accepted by this repo: print('# HELP zypper_patches_pending_reboot_total zypper patches available which require '
'reboot total')
print('# TYPE zypper_patches_pending_reboot_total counter')
print_patches_sum(data_zypper_lp,
prefix="zypper_patches_pending_reboot_total",
filters={'Interactive': 'reboot'}) The rendering of text exposition format metrics is handled by the client_python library, avoiding the need for users to implement their own (often incomplete and / or buggy) methods. Documentation for client_python can be found at https://prometheus.github.io/client_python/, and you can also look at the existing Python scripts in this repo for some examples of how to use it. Jumping ahead a little bit, I also see that there is no error checking or handling being performed when calling the |
No miscommunication, I understand that you want the library in my python script, but I've never use it and in my previous comment I was just asking where do I need to use it. I saw that the library is lacking of documentation, for me at least. Thanks for the clarification, I'll go ahead with the changes you ask to do and I'll be back. |
3014ba2
to
68a414f
Compare
Hi @dswarbrick! I've done some changes:
I've not tested the scripts yet, I'll set the pr ready for review when I've done it! |
I literally posted a link to the client_python documentation in my previous comment: https://prometheus.github.io/client_python/ |
You are right, I meant lacking "a good" documentation. But it is a personal thought. |
Co-authored-by: Bernd Schubert <[email protected]> Signed-off-by: Wabri <[email protected]> Signed-off-by: Bernd Schubert <[email protected]> Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]> Signed-off-by: Bernd Schubert <[email protected]> Signed-off-by: Wabri <[email protected]>
Signed-off-by: Bernd Schubert <[email protected]> Signed-off-by: Wabri <[email protected]>
Co-authored-by: Daniel Swarbrick <[email protected]> Signed-off-by: GabrielePuliti <[email protected]> Signed-off-by: Wabri <[email protected]>
Co-authored-by: Daniel Swarbrick <[email protected]> Signed-off-by: GabrielePuliti <[email protected]> Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]>
Signed-off-by: Wabri <[email protected]>
@dswarbrick tested and it works as expected! |
def __parse_arguments(argv): | ||
parser = argparse.ArgumentParser() | ||
|
||
parser.add_mutually_exclusive_group(required=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks incorrect. I think what you meant to do is
group = parser.add_mutually_exclusive_group(required=False)
group.add_argument("-m", ...)
group.add_argument("-l", ...)
However, this combination of "more / less" in a mutually exclusive group, and with a default that sets the --more
option true (thus overriding the fact you set the group required=False
), is really unintuitive from a user's perspective.
You have essentially just reinvented a boolean option, which by definition can only be one of two values, i.e. mutually exclusive.
Please re-work this to something that is a bit clearer.
"zypper package update available from repository. " | ||
"(0 = not available, 1 = available)" | ||
) | ||
info = Info(prefix, description) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You specify a metric description that includes "(0 = not available, 1 = available)", yet Info
metrics cannot take a user-defined value, and always have a value of 1. I'm not sure what your intentions are here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dswarbrick Thanks for your comments and effort to review it. The explanation for 0 and 1 in the help section clarifies what value can be expected to get from the metrics. Even if it is clear for some people not of of the users will have the same knowledge from the beginning. From my point of view having this hint mentioned here does hurt anyone and makes the usage a bit easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pirat013 The issue is not with having that information in the metric description. The issue is that using an Info
metric type does not allow the value to be specified - they always have a value of 1. Encoding the 0 or 1 status as a label is not an acceptable approach, due to its propensity to create new series and result in stale metrics when that label value changes. Using a Gauge
metric, and setting the metric value to 0 or 1, is the only correct way to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, got it.
("Status", "status")] | ||
prefix = "zypper_patch_pending" | ||
description = "zypper patch available from repository. (0 = not available , 1 = available)" | ||
info = Info(prefix, description) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment above regarding Info
metric types.
|
||
def print_reboot_required(): | ||
needs_restarting_path = '/usr/bin/needs-restarting' | ||
is_path_ok = os.path.isfile(needs_restarting_path) and os.access(needs_restarting_path, os.X_OK) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is overly paranoid (and not atomic). A more elegant approach would simply be to catch subprocess.CalledProcessError
exceptions thrown by subprocess.run
.
I agree, the online HTML documentation is a bit sparse, mostly only showing example usage. However, client_python, like most Python code, is effectively self-documenting. For example:
|
stderr=subprocess.DEVNULL, | ||
check=False) | ||
if result.returncode == 0: | ||
info.info({"node_reboot_required": "0"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once again you are abusing the Info
metric type. Specifying a 0 or 1 value as a label will create new series whenever that value changes, leading to stale metrics due to Prometheus' look-behind mechanism. What you need to use is a Gauge metric.
No description provided.