RabbitMQ 3.10 changed behaviour of unacknowledged messages #585

Open
2 tasks
keifgwinn opened this issue Jul 21, 2022 · 2 comments
Comments

@keifgwinn (Contributor) commented Jul 21, 2022

Description

There was an issue on production, captured in https://github.com/3drepo/DevOps/issues/457, that was resolved with a configuration change on RabbitMQ: the default behaviour for unacknowledged messages changed in RabbitMQ 3.10, and the change was also backported to 3.8.15.

They've also deprecated classic mirrored queues and recommend the newer quorum queues instead: https://www.rabbitmq.com/ha.html

We are also seeing higher queue depths and more unacknowledged messages as the system gets busier; due to the high number of channel exceptions, we also encountered this 'stuck' queue.

RabbitMQ unacked messages are unacknowledged messages. In RabbitMQ, many messages are delivered to a consumer or target, but the protocol does not guarantee that delivery will always succeed, so publishers and consumers need a mechanism to confirm delivery and processing. This is where acknowledgements come in. A message is "ready" while it is waiting in the queue to be processed. Whenever a consumer connects to the queue it receives a batch of messages, up to the prefetch size set on the channel, and while the consumer is working on them they are counted as unacked. In short, unacked messages are messages that have been delivered but not yet acknowledged.

If a consumer fails to acknowledge messages, RabbitMQ keeps delivering new ones until the number of unacked messages on the channel reaches its prefetch value, at which point deliveries to that channel stop. If the consumer's channel or connection closes (for example because the broker's delivery acknowledgement timeout is hit), the "stuck" messages are made available again and the client process can reconnect and reprocess them. Unacked messages have been read by the consumer, but the consumer has never sent an ack back to the RabbitMQ broker to say it has finished processing them.
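As a rough illustration of what this looks like from the consumer side (a sketch only, assuming the worker consumes with amqplib and manual acks; the queue name and processJob are placeholder names, not our actual code): messages sit in the unacked count while processJob runs, and the broker never pushes more than the prefetch limit to this channel.

const amqp = require('amqplib');

async function startConsumer() {
  const conn = await amqp.connect(process.env.AMQP_URL);
  const ch = await conn.createChannel();

  // The broker keeps at most this many unacked messages in flight on this channel.
  await ch.prefetch(5);

  await ch.assertQueue('jobq', { durable: true });
  await ch.consume('jobq', async (msg) => {
    if (msg === null) return;          // consumer was cancelled by the broker
    try {
      await processJob(msg.content);   // message stays "unacked" while this runs
      ch.ack(msg);                     // confirm successful processing
    } catch (err) {
      ch.nack(msg, false, true);       // processing failed: return it to the queue
    }
  }, { noAck: false });
}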

Goals

  • Review our use of queues to more closely match the patterns expected by the RabbitMQ developers.

Tasks

  • TBD

Related Resources

RabbitMQ ChangeLog

Current queue configuration (screenshot attached)

@carmenfan (Member)

As discussed in Teams, we could ack the message as soon as we receive it, but we will then need to requeue the message ourselves should the processing fail in an unexpected way. Currently there are 2 ways this may happen:

  1. Licensing failure - this is easily resolved by a code change, as we are in control at the point of failure
  2. The bouncer_worker is terminated unexpectedly - this mostly happens when AWS recalls the machine and the nodeJS application gets killed. The question is, is there any way the nodeJS process can get a signal before this happens so we can requeue the message? Do we get a signal that we can catch (like SIGTERM)? See the sketch after this list.
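A minimal sketch of this ack-on-receipt idea (assuming amqplib; processJob, the queue name and the shutdown wiring are hypothetical placeholders rather than the existing bouncer_worker code): the message is acked immediately, and on a processing failure or a caught SIGTERM we republish the payload ourselves.

const amqp = require('amqplib');

async function startWorker() {
  const conn = await amqp.connect(process.env.AMQP_URL);
  const ch = await conn.createChannel();
  await ch.assertQueue('jobq', { durable: true });

  let currentJob = null; // payload of the job currently being processed, if any

  await ch.consume('jobq', async (msg) => {
    if (msg === null) return;
    ch.ack(msg);                // ack on receipt, as proposed
    currentJob = msg.content;
    try {
      await processJob(currentJob);
    } catch (err) {
      // e.g. a licensing failure: we still "own" the job, so put it back ourselves
      ch.sendToQueue('jobq', currentJob, { persistent: true });
    } finally {
      currentJob = null;
    }
  });

  process.on('SIGTERM', async () => {
    // the machine is being recalled: requeue whatever we were working on
    if (currentJob) ch.sendToQueue('jobq', currentJob, { persistent: true });
    await ch.close();
    await conn.close();
    process.exit(0);
  });
}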

@keifgwinn (Contributor, Author) commented Jul 21, 2022

If the node is going away due to AWS-initiated activity, we have the aws-node-termination-handler installed and currently configured in metadata monitoring mode, so we should get notified that the hardware is going away.


It is currently installed like this:


#AWS packages, termination handler, cloudwatch metrics  
helm upgrade --install \
 --force aws-node-termination-handler \
 --namespace kube-system \
 --set enableSpotInterruptionDraining="true" \
 --set enableRebalanceMonitoring="true" \
 --set enableScheduledEventDraining="true" \
 --set emitKubernetesEvents="true" \
 --set taintNode="true" \
 eks/aws-node-termination-handler --version 0.13.3

In that event, the nodes get 'tainted', i.e. marked as about to be offlined, and Kubernetes should signal the pods that they are getting terminated so they can shut down gracefully. Under our current configuration, TERM is sent to the node process:

The kubelet triggers the container runtime to send a TERM signal to process 1 inside each container.

So we should be able to catch it.
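For completeness, if we keep the current ack-after-processing model, simply catching that TERM and closing the channel cleanly should be enough, because the broker requeues anything still unacked on the channel (sketch only; conn, ch and consumerTag are assumed to come from the usual amqplib consumer setup, and the names are illustrative):

// `conn`, `ch` and `consumerTag` come from the consumer setup, e.g.
// ({ consumerTag } = await ch.consume(...)) -- illustrative names only.
process.on('SIGTERM', async () => {
  await ch.cancel(consumerTag); // stop receiving new deliveries
  await ch.close();             // anything still unacked is requeued by the broker
  await conn.close();
  process.exit(0);
});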

However, occasionally there may be a hardware error underneath that will not follow this process.
