rabbitmq_node_up reports it self up and the other cluster nodes #70

expanderbolt · 2019-01-23T14:19:51Z

I get double the number of metrics because the exporter counts every node twice (each node reports its cluster peer as well as itself).
Why?
Example:

root@xxx: curl http://localhost:15672/api/metrics|grep node_up
#HELP rabbitmq_node_up Node runnning status
rabbitmq_node_up{name="[email protected]",type="disc"} 1
rabbitmq_node_up{name="[email protected]",type="disc"} 1

The nodes are clustered using AWS peer discovery: https://www.rabbitmq.com/cluster-formation.html#peer-discovery-aws


michaelklishin · 2019-02-20T08:37:19Z

There isn't enough information to tell for sure, but my best guess is that when every node is queried and reports that all of its peers are up, the counters are added together instead of being treated as a boolean gauge.
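To illustrate the difference between adding the values and treating them as a boolean gauge, here is a minimal Python sketch. The node names and the sample tuples are hypothetical, and this is not how the exporter or any dashboard is known to aggregate; it only shows why two scraped nodes can make every node appear to be counted twice.

```python
from collections import defaultdict

# Hypothetical samples: (node that was scraped, node in the "name" label, value).
# Each of the two scraped nodes reports itself and its peer as up.
SAMPLES = [
    ("rabbit@a", "rabbit@a", 1), ("rabbit@a", "rabbit@b", 1),
    ("rabbit@b", "rabbit@a", 1), ("rabbit@b", "rabbit@b", 1),
]

def summed(samples):
    """Add up every sample per node name (the suspected double counting)."""
    totals = defaultdict(int)
    for _scraped, name, value in samples:
        totals[name] += value
    return dict(totals)

def as_boolean_gauge(samples):
    """Keep one 0/1 value per node name: up if any reporter sees it up."""
    up = {}
    for _scraped, name, value in samples:
        up[name] = max(up.get(name, 0), value)
    return up

print(summed(SAMPLES))            # each node counted once per reporter
print(as_boolean_gauge(SAMPLES))  # each node counted once
```

Taking the maximum is only one possible choice; it hides disagreement between reporters, which may or may not be what you want.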

BoemmLA · 2019-03-04T12:59:03Z

It seems that rabbitmq_node_up reports the nodes reachable from each node, including the node itself.
So in a 3-node cluster, each node reports itself and the 2 other nodes, which adds up to 9 rabbitmq_node_up metrics.
That can be used to show something like
nodeXX can reach X nodes ...

It's a bit confusing if you run RabbitMQ as a cluster; I would say the metric is not well prepared for that case.
Try mapping the IP of the server you query to the name label; that should work.

What I especially miss is simply the cluster name of the whole RabbitMQ cluster, to be able to say:
Cluster is up with X nodes ...
It seems this info is not exported by the plugin ...

michaelklishin · 2019-03-06T03:34:22Z

We recently discussed this and concluded that there are significant benefits to collecting data from a single node and then aggregating at "display time". This should be covered in the docs now that #73 was merged.

@BoemmLA I'm sorry, but I think you are greatly oversimplifying how distributed systems fail. If a cluster has 3 nodes but A cannot reach C and C cannot reach B while all other links are up, how many nodes does that cluster have? So this is a very convenient and very misleading metric. To some extent it is clarified by another one, the number of reported partitions in the cluster. Then we monitor not only the number of vertices in the graph but also the number of problematic edges.
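The A/B/C scenario can be worked through with a small Python sketch. It assumes the links are directed (A failing to reach C says nothing about C reaching A), which is an illustrative simplification and not a description of how RabbitMQ detects partitions. All names here are hypothetical.

```python
from itertools import permutations

NODES = ("A", "B", "C")
# Directed pairs (source, target) where source cannot reach target.
BROKEN = {("A", "C"), ("C", "B")}

def view_from(node):
    """Nodes that `node` would report as up: itself plus each peer it can reach."""
    return {node} | {
        peer for peer in NODES if peer != node and (node, peer) not in BROKEN
    }

def node_counts():
    """Per-node answer to 'how many nodes does the cluster have?'."""
    return {node: len(view_from(node)) for node in NODES}

def broken_links():
    """The problematic edges, listed alongside the vertex counts."""
    return sorted(pair for pair in permutations(NODES, 2) if pair in BROKEN)

print(node_counts())   # {'A': 2, 'B': 3, 'C': 2}
print(broken_links())  # [('A', 'C'), ('C', 'B')]
```

No single node count is correct here: B sees 3 nodes while A and C each see 2, so a "cluster is up with X nodes" figure depends on which node is asked. The list of broken edges is what makes the counts interpretable.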
