Skip to content

Commit

Permalink
Fix dead lettering (backport #11174) (#11219)
Browse files Browse the repository at this point in the history
* Fix dead lettering

  # What?

This commit fixes #11159, #11160, #11173.

  # How?

  ## Background

RabbitMQ allows to dead letter messages for four different reasons, out
of which three reasons cause messages to be dead lettered automatically
internally in the broker: (maxlen, expired, delivery_limit) and 1 reason
is caused by an explicit client action (rejected).

RabbitMQ also allows dead letter topologies. When a message is dead
lettered, it is re-published to an exchange, and therefore zero to
multiple target queues. These target queues can in turn dead letter
messages. Hence it is possible to create a cycle of queues where
messages get dead lettered endlessly, which is what we want to avoid.

  ## Alternative approach

One approach to avoid such endless cycles is to use a similar concept of
the TTL field of the IPv4 datagram, or the hop limit field of an IPv6
datagram. These fields ensure that IP packets aren't cicrulating forever
in the Internet. Each router decrements this counter. If this counter
reaches 0, the sender will be notified and the message gets dropped.

We could use the same approach in RabbitMQ: Whenever a queue dead
letters a message, a dead_letter_hop_limit field could be decremented.
If this field reaches 0, the message will be dropped.
Such a hop limit field could have a sensible default value, for example
32. The sender of the message could override this value. Likewise, the
client rejecting a message could set a new value via the Modified
outcome.

Such an approach has multiple advantages:
1. No dead letter cycle detection per se needs to be performed within
   the broker which is a slight simplification to what we have today.
2. Simpler dead letter topologies. One very common use case is that
   clients re-try sending the message after some time by consuming from
   a dead-letter queue and rejecting the message such that the message
   gets republished to the original queue. Instead of requiring explicit
   client actions, which increases complexity, a x-message-ttl argument
   could be set on the dead-letter queue to automatically retry after
   some time. This is a big simplification because it eliminates the
   need of various frameworks that retry, such as
   https://docs.spring.io/spring-cloud-stream/reference/rabbit/rabbit_overview/rabbitmq-retry.html
3. No dead letter history information needs to be compressed because
   there is a clear limit on how often a message gets dead lettered.
   Therefore, the full history including timestamps of every dead letter
   event will be available to clients.

Disadvantages:
1. Breaks a lot of clients, even for 4.0.

  ## 3.12 approach

Instead of decrementing a counter, the approach up to 3.12 has been to
drop the message if the message cycled automatically. A message cycled
automatically if no client expliclity rejected the message, i.e. the
mesage got dead lettered due to maxlen, expired, or delivery_limit, but
not due to rejected.

In this approach, the broker must be able to detect such cycles
reliably.
Reliably detecting dead letter cycles broke in 3.13 due to #11159 and #11160.

To reliably detect cycles, the broker must be able to obtain the exact
order of dead letter events for a given message. In 3.13.0 - 3.13.2, the
order cannot exactly be determined because wall clock time is used to
record the death time.

This commit uses the same approach as done in 3.12: a list ordered by
death recency is used with the most recent death at the head of the
list.

To not grow this list endlessly (for example when a client rejects the
same message hundreds of times), this list should be compacted.
This commit, like 3.12, compacts by tuple `{Queue, Reason}`:
If this message got already dead lettered from this Queue for this
Reason, then only a counter is incremented and the element is moved to
the front of the list.

  ## Streams & AMQP 1.0 clients

Dead lettering from a stream doesn't make sense because:
1. a client cannot reject a message from a stream since the stream must
   maintain the total order of events to be consumed by multiple clients.
2. TTL is implemented by Stream retention where only old Stream segments
   are automatically deleted (or archived in the future).
3. same applies to maxlen

Although messages cannot be dead lettered **from** a stream, messages can be dead lettered
**into** a stream. This commit provides clients consuming from a stream the death history: #11173

Additionally, this commit provides AMQP 1.0 clients the death history via
message annotation `x-opt-deaths` which contains the same information as
AMQP 0.9.1 header `x-death`.

Both, storing the death history in a stream and providing death history
to an AMQP 1.0 client, use the same encoding: a message annoation
`x-opt-deaths` that contains an array of maps ordered by death recency.
The information encoded is the same as in the AMQP 0.9.1 x-death header.

Instead of providing an array of maps, a better approach could be to use
an array of a custom AMQP death type, such as:
```xml
<amqp name="rabbitmq">
    <section name="custom-types">
        <type name="death" class="composite" source="list">
            <descriptor name="rabbitmq:death:list" code="0x00000000:0x000000255"/>
            <field name="queue" type="string" mandatory="true" label="the name of the queue the message was dead lettered from"/>
            <field name="reason" type="symbol" mandatory="true" label="the reason why this message was dead lettered"/>
            <field name="count" type="ulong" default="1" label="how many times this message was dead lettered from this queue for this reason"/>
            <field name="time" mandatory="true" type="timestamp" label="the first time when this message was dead lettered from this queue for this reason"/>
            <field name="exchange" type="string" default="" label="the exchange this message was published to before it was dead lettered for the first time from this queue for this reason"/>
            <field name="routing-keys" type="string" default="" multiple="true" label="the routing keys this message was published with before it was dead lettered for the first time from this queue for this reason"/>
            <field name="ttl" type="milliseconds" label="the time to live of this message before it was dead lettered for the first time from this queue for reason ‘expired’"/>
        </type>
    </section>
</amqp>
```

However, encoding and decoding custom AMQP types that are nested within
arrays which in turn are nested within the message annotation map can be
difficult for clients and the broker. Also, each client will need to
know the custom AMQP type. For now, therefore we use an array of maps.

  ## Feature flag
The new way to record death information is done via mc annotation
`deaths_v2`.
Because old nodes do not know this new annotation, recording death
information via mc annotation `deaths_v2` is hidden behind a new feature
flag `message_containers_deaths_v2`.

If this feature flag is disabled, a message will continue to use the
3.13.0 - 3.13.2 way to record death information in mc annotation
`deaths`, or even the older way within `x-death` header directly if
feature flag message_containers is also disabled.

Only if feature flag `message_containers_deaths_v2` is enabled and this
message hasn't been dead lettered before, will the new mc annotation
`deaths_v2` be used.

(cherry picked from commit 6b300a2)

# Conflicts:
#	deps/rabbit/app.bzl
#	deps/rabbit/src/mc_amqp.erl
#	deps/rabbit/src/rabbit_core_ff.erl
#	deps/rabbit/test/amqp_client_SUITE.erl

* Fix conflicts and failing tests

Extend message_containers_deaths_v2_SUITE to send 3 messages whose death
histories will be stored in 3 different ways:
1. with feature flag message_containers disabled
2. with feature flag message_containers enabled, but message_containers_deaths_v2 disabled
3. with feature flag message_containers_deaths_v2 enabled

---------

Co-authored-by: David Ansari <[email protected]>
  • Loading branch information
mergify[bot] and ansd authored May 14, 2024
1 parent 4735e01 commit 76c7910
Show file tree
Hide file tree
Showing 16 changed files with 750 additions and 270 deletions.
9 changes: 9 additions & 0 deletions deps/amqp10_common/src/amqp10_binary_generator.erl
Original file line number Diff line number Diff line change
Expand Up @@ -181,6 +181,7 @@ constructor(timestamp) -> <<16#83>>;
constructor(uuid) -> <<16#98>>;
constructor(null) -> <<16#40>>;
constructor(boolean) -> <<16#56>>;
constructor(map) -> 16#d1; % use large map type for all array elements
constructor(array) -> <<16#f0>>; % use large array type for all nested arrays
constructor(utf8) -> <<16#b1>>;
constructor({described, Descriptor, Primitive}) ->
Expand All @@ -203,6 +204,14 @@ generate(ulong, {ulong, V}) -> <<V:64/unsigned>>;
generate(long, {long, V}) -> <<V:64/signed>>;
generate({described, D, P}, {described, D, V}) ->
generate(P, V);
generate(map, {map, KvList}) ->
Count = length(KvList) * 2,
Compound = lists:map(fun({Key, Val}) ->
[(generate(Key)),
(generate(Val))]
end, KvList),
S = iolist_size(Compound),
[<<(S + 4):32, Count:32>>, Compound];
generate(array, {array, Type, List}) ->
Count = length(List),
Body = iolist_to_binary([constructor(Type),
Expand Down
1 change: 1 addition & 0 deletions deps/amqp10_common/src/amqp10_binary_parser.erl
Original file line number Diff line number Diff line change
Expand Up @@ -175,6 +175,7 @@ parse_constructor(16#80) -> ulong;
parse_constructor(16#81) -> long;
parse_constructor(16#40) -> null;
parse_constructor(16#56) -> boolean;
parse_constructor(16#d1) -> map;
parse_constructor(16#f0) -> array;
parse_constructor(0) -> described;
parse_constructor(X) ->
Expand Down
15 changes: 13 additions & 2 deletions deps/amqp10_common/test/binary_generator_SUITE.erl
Original file line number Diff line number Diff line change
Expand Up @@ -168,11 +168,22 @@ array(_Config) ->
?assertEqual({array, boolean, [true, false]},
roundtrip_return({array, boolean, [{boolean, true}, {boolean, false}]})),

% array of arrays
% TODO: does the inner type need to be consistent across the array?
%% array of arrays
roundtrip({array, array, []}),
roundtrip({array, array, [{array, symbol, [{symbol, <<"ANONYMOUS">>}]}]}),

%% array of maps
roundtrip({array, map, []}),
roundtrip({array, map, [{map, [{{symbol, <<"k1">>}, {utf8, <<"v1">>}}]},
{map, []},
{map, [{{described,
{utf8, <<"URL">>},
{utf8, <<"http://example.org/hello-world">>}},
{byte, -1}},
{{int, 0}, {ulong, 0}}
]}
]}),

Desc = {utf8, <<"URL">>},
roundtrip({array, {described, Desc, utf8},
[{described, Desc, {utf8, <<"http://example.org/hello">>}}]}),
Expand Down
8 changes: 7 additions & 1 deletion deps/rabbit/BUILD.bazel
Original file line number Diff line number Diff line change
Expand Up @@ -386,7 +386,13 @@ rabbitmq_integration_suite(
additional_beam = [
":test_queue_utils_beam",
],
shard_count = 7,
shard_count = 8,
)

rabbitmq_integration_suite(
name = "message_containers_deaths_v2_SUITE",
size = "medium",
shard_count = 1,
)

rabbitmq_integration_suite(
Expand Down
9 changes: 9 additions & 0 deletions deps/rabbit/app.bzl
Original file line number Diff line number Diff line change
Expand Up @@ -2153,3 +2153,12 @@ def test_suite_beam_files(name = "test_suite_beam_files"):
erlc_opts = "//:test_erlc_opts",
deps = ["//deps/amqp_client:erlang_app"],
)
erlang_bytecode(
name = "message_containers_deaths_v2_SUITE_beam_files",
testonly = True,
srcs = ["test/message_containers_deaths_v2_SUITE.erl"],
outs = ["test/message_containers_deaths_v2_SUITE.beam"],
app_name = "rabbit",
erlc_opts = "//:test_erlc_opts",
deps = ["//deps/amqp_client:erlang_app", "//deps/rabbitmq_ct_helpers:erlang_app"],
)
36 changes: 22 additions & 14 deletions deps/rabbit/include/mc.hrl
Original file line number Diff line number Diff line change
@@ -1,17 +1,3 @@
-type death_key() :: {SourceQueue :: rabbit_misc:resource_name(), rabbit_dead_letter:reason()}.
-type death_anns() :: #{first_time := non_neg_integer(), %% the timestamp of the first
last_time := non_neg_integer(), %% the timestamp of the last
ttl => OriginalExpiration :: non_neg_integer()}.
-record(death, {exchange :: OriginalExchange :: rabbit_misc:resource_name(),
routing_keys = [] :: OriginalRoutingKeys :: [rabbit_types:routing_key()],
count = 0 :: non_neg_integer(),
anns :: death_anns()}).

-record(deaths, {first :: death_key(),
last :: death_key(),
records = #{} :: #{death_key() := #death{}}}).


%% good enough for most use cases
-define(IS_MC(Msg), element(1, Msg) == mc andalso tuple_size(Msg) == 5).

Expand All @@ -26,3 +12,25 @@
-define(ANN_RECEIVED_AT_TIMESTAMP, rts).
-define(ANN_DURABLE, d).
-define(ANN_PRIORITY, p).

-define(FF_MC_DEATHS_V2, message_containers_deaths_v2).

-type death_key() :: {SourceQueue :: rabbit_misc:resource_name(), rabbit_dead_letter:reason()}.
-type death_anns() :: #{%% timestamp of the first time this message
%% was dead lettered from this queue for this reason
first_time := pos_integer(),
%% timestamp of the last time this message
%% was dead lettered from this queue for this reason
last_time := pos_integer(),
ttl => OriginalTtlHeader :: non_neg_integer()}.

-record(death, {exchange :: OriginalExchange :: rabbit_misc:resource_name(),
routing_keys :: OriginalRoutingKeys :: [rabbit_types:routing_key(),...],
%% how many times this message was dead lettered from this queue for this reason
count :: pos_integer(),
anns :: death_anns()}).

-record(deaths, {first :: death_key(), % redundant to mc annotations x-first-death-*
last :: death_key(), % redundant to mc annotations x-last-death-*
records :: #{death_key() := #death{}}
}).
173 changes: 100 additions & 73 deletions deps/rabbit/src/mc.erl
Original file line number Diff line number Diff line change
Expand Up @@ -32,9 +32,8 @@
convert/3,
protocol_state/1,
prepare/2,
record_death/3,
record_death/4,
is_death_cycle/2,
last_death/1,
death_queue_names/1
]).

Expand Down Expand Up @@ -339,89 +338,103 @@ protocol_state(BasicMsg) ->
mc_compat:protocol_state(BasicMsg).

-spec record_death(rabbit_dead_letter:reason(),
SourceQueue :: rabbit_misc:resource_name(),
state()) -> state().
rabbit_misc:resource_name(),
state(),
environment()) -> state().
record_death(Reason, SourceQueue,
#?MODULE{protocol = _Mod,
data = _Data,
annotations = Anns0} = State)
when is_atom(Reason) andalso is_binary(SourceQueue) ->
#?MODULE{annotations = Anns0} = State,
Env)
when is_atom(Reason) andalso
is_binary(SourceQueue) ->
Key = {SourceQueue, Reason},
#{?ANN_EXCHANGE := Exchange,
?ANN_ROUTING_KEYS := RoutingKeys} = Anns0,
Timestamp = os:system_time(millisecond),
Ttl = maps:get(ttl, Anns0, undefined),

ReasonBin = atom_to_binary(Reason),
DeathAnns = rabbit_misc:maps_put_truthy(ttl, Ttl, #{first_time => Timestamp,
last_time => Timestamp}),
case maps:get(deaths, Anns0, undefined) of
undefined ->
Ds = #deaths{last = Key,
first = Key,
records = #{Key => #death{count = 1,
exchange = Exchange,
routing_keys = RoutingKeys,
anns = DeathAnns}}},
Anns = Anns0#{<<"x-first-death-reason">> => ReasonBin,
DeathAnns = rabbit_misc:maps_put_truthy(
ttl, Ttl, #{first_time => Timestamp,
last_time => Timestamp}),
NewDeath = #death{exchange = Exchange,
routing_keys = RoutingKeys,
count = 1,
anns = DeathAnns},
Anns = case Anns0 of
#{deaths := Deaths0} ->
Deaths = case Deaths0 of
#deaths{records = Rs0} ->
Rs = maps:update_with(
Key,
fun(Death) ->
update_death(Death, Timestamp)
end,
NewDeath,
Rs0),
Deaths0#deaths{last = Key,
records = Rs};
_ ->
%% Deaths are ordered by recency
case lists:keytake(Key, 1, Deaths0) of
{value, {Key, D0}, Deaths1} ->
D = update_death(D0, Timestamp),
[{Key, D} | Deaths1];
false ->
[{Key, NewDeath} | Deaths0]
end
end,
Anns0#{<<"x-last-death-reason">> := atom_to_binary(Reason),
<<"x-last-death-queue">> := SourceQueue,
<<"x-last-death-exchange">> := Exchange,
deaths := Deaths};
_ ->
Deaths = case Env of
#{?FF_MC_DEATHS_V2 := false} ->
#deaths{last = Key,
first = Key,
records = #{Key => NewDeath}};
_ ->
[{Key, NewDeath}]
end,
ReasonBin = atom_to_binary(Reason),
Anns0#{<<"x-first-death-reason">> => ReasonBin,
<<"x-first-death-queue">> => SourceQueue,
<<"x-first-death-exchange">> => Exchange,
<<"x-last-death-reason">> => ReasonBin,
<<"x-last-death-queue">> => SourceQueue,
<<"x-last-death-exchange">> => Exchange
},

State#?MODULE{annotations = Anns#{deaths => Ds}};
#deaths{records = Rs} = Ds0 ->
Death = #death{count = C,
anns = DA} = maps:get(Key, Rs,
#death{exchange = Exchange,
routing_keys = RoutingKeys,
anns = DeathAnns}),
Ds = Ds0#deaths{last = Key,
records = Rs#{Key =>
Death#death{count = C + 1,
anns = DA#{last_time => Timestamp}}}},
Anns = Anns0#{deaths => Ds,
<<"x-last-death-reason">> => ReasonBin,
<<"x-last-death-queue">> => SourceQueue,
<<"x-last-death-exchange">> => Exchange},
State#?MODULE{annotations = Anns}
end;
record_death(Reason, SourceQueue, BasicMsg) ->
mc_compat:record_death(Reason, SourceQueue, BasicMsg).
<<"x-last-death-exchange">> => Exchange,
deaths => Deaths}
end,
State#?MODULE{annotations = Anns};
record_death(Reason, SourceQueue, BasicMsg, Env) ->
mc_compat:record_death(Reason, SourceQueue, BasicMsg, Env).

update_death(#death{count = Count,
anns = DeathAnns} = Death, Timestamp) ->
Death#death{count = Count + 1,
anns = DeathAnns#{last_time := Timestamp}}.

-spec is_death_cycle(rabbit_misc:resource_name(), state()) -> boolean().
is_death_cycle(TargetQueue, #?MODULE{annotations = #{deaths := #deaths{records = Rs}}}) ->
is_cycle_v1(TargetQueue, maps:keys(Rs));
is_death_cycle(TargetQueue, #?MODULE{annotations = #{deaths := Deaths}}) ->
is_cycle(TargetQueue, maps:keys(Deaths#deaths.records));
is_cycle_v2(TargetQueue, Deaths);
is_death_cycle(_TargetQueue, #?MODULE{}) ->
false;
is_death_cycle(TargetQueue, BasicMsg) ->
mc_compat:is_death_cycle(TargetQueue, BasicMsg).

%% Returns death queue names ordered by recency.
-spec death_queue_names(state()) -> [rabbit_misc:resource_name()].
death_queue_names(#?MODULE{annotations = Anns}) ->
case maps:get(deaths, Anns, undefined) of
undefined ->
[];
#deaths{records = Records} ->
proplists:get_keys(maps:keys(Records))
end;
death_queue_names(#?MODULE{annotations = #{deaths := #deaths{records = Rs}}}) ->
proplists:get_keys(maps:keys(Rs));
death_queue_names(#?MODULE{annotations = #{deaths := Deaths}}) ->
lists:map(fun({{Queue, _Reason}, _Death}) ->
Queue
end, Deaths);
death_queue_names(#?MODULE{}) ->
[];
death_queue_names(BasicMsg) ->
mc_compat:death_queue_names(BasicMsg).

-spec last_death(state()) ->
undefined | {death_key(), #death{}}.
last_death(#?MODULE{annotations = Anns})
when not is_map_key(deaths, Anns) ->
undefined;
last_death(#?MODULE{annotations = #{deaths := #deaths{last = Last,
records = Rs}}}) ->
{Last, maps:get(Last, Rs)};
last_death(BasicMsg) ->
mc_compat:last_death(BasicMsg).

-spec prepare(read | store, state()) -> state().
prepare(For, #?MODULE{protocol = Proto,
data = Data} = State) ->
Expand All @@ -431,24 +444,38 @@ prepare(For, State) ->

%% INTERNAL

%% if there is a death with a source queue that is the same as the target
is_cycle_v2(TargetQueue, Deaths) ->
case lists:splitwith(fun({{SourceQueue, _Reason}, #death{}}) ->
SourceQueue =/= TargetQueue
end, Deaths) of
{_, []} ->
false;
{L, [H | _]} ->
%% There is a cycle, but we only want to drop the message
%% if the cycle is "fully automatic", i.e. without a client
%% expliclity rejecting the message somewhere in the cycle.
lists:all(fun({{_SourceQueue, Reason}, _Death}) ->
Reason =/= rejected
end, [H | L])
end.

%% The desired v1 behaviour is the following:
%% "If there is a death with a source queue that is the same as the target
%% queue name and there are no newer deaths with the 'rejected' reason then
%% consider this a cycle
is_cycle(_Queue, []) ->
%% consider this a cycle."
%% However, the correct death order cannot be reliably determined in v1.
%% v2 fixes this bug.
is_cycle_v1(_Queue, []) ->
false;
is_cycle(_Queue, [{_Q, rejected} | _]) ->
is_cycle_v1(_Queue, [{_Q, rejected} | _]) ->
%% any rejection breaks the cycle
false;
is_cycle(Queue, [{Queue, Reason} | _])
is_cycle_v1(Queue, [{Queue, Reason} | _])
when Reason =/= rejected ->
true;
is_cycle(Queue, [_ | Rem]) ->
is_cycle(Queue, Rem).
is_cycle_v1(Queue, [_ | Rem]) ->
is_cycle_v1(Queue, Rem).

set_received_at_timestamp(Anns) ->
Millis = os:system_time(millisecond),
Anns#{?ANN_RECEIVED_AT_TIMESTAMP => Millis}.

-ifdef(TEST).
-include_lib("eunit/include/eunit.hrl").
-endif.
Loading

0 comments on commit 76c7910

Please sign in to comment.