How to Avoid the Pitfalls of the Delayed Message Exchange

The delayed message exchange is quite useful, but it is not without its quirks. This article explains how the delayed message exchange works so that you can avoid its potential pitfalls.

The RabbitMQ delayed message exchange plugin adds a wait time between the moment a message reaches the exchange and the moment it is delivered to a queue. Each published message can specify its own delay as an offset in milliseconds.
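As a minimal sketch of what a publisher has to provide, the snippet below builds the exchange arguments and per-message headers the plugin expects: the exchange is declared with type `x-delayed-message` and an `x-delayed-type` argument, and each message carries its delay in an `x-delay` header. The function names are hypothetical helpers made up for this illustration, and the dictionaries would still need to be passed to whichever client library you use.

```python
from typing import Any, Dict

# Exchange type registered by the delayed message exchange plugin.
DELAYED_EXCHANGE_TYPE = "x-delayed-message"


def delayed_exchange_arguments(routing_type: str = "direct") -> Dict[str, Any]:
    """Arguments for declaring the exchange.

    routing_type is how the exchange routes once the delay has passed
    (for example "direct", "topic" or "fanout").
    """
    return {"x-delayed-type": routing_type}


def delay_headers(delay_ms: int) -> Dict[str, Any]:
    """Per-message headers asking the exchange to hold the message for delay_ms."""
    # Guard added by this sketch, not a statement about plugin behaviour.
    if delay_ms < 0:
        raise ValueError("delay_ms must not be negative")
    return {"x-delay": delay_ms}


if __name__ == "__main__":
    print(DELAYED_EXCHANGE_TYPE, delayed_exchange_arguments())
    print(delay_headers(5000))  # hold the message for five seconds
```

The delay travels with each message rather than with the exchange, which is why two messages published back to back can be released at very different times.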

Don't lose delayed messages

Although the delayed message exchange (DME) queues up messages, it lacks many features of real queues, including their level of observability. For instance, you can see the number of currently stored messages only by hovering over the `DM` feature mark; you don’t get the graph that shows the trend over a given time window. There is also no way to know how much memory DME consumes. In addition, DME stores messages only on the node where the exchange was created, so there is no high availability: if that node is gone, all the delayed messages go with it.

Mnesia as message store for delayed messages

Keep in mind that the delayed message exchange does not use the same message store as queues, which is well optimised for storing a large number of messages. Instead, it uses Mnesia, a general-purpose database that is included in the Erlang standard library and runs inside RabbitMQ. The Mnesia table used for the delayed message exchange keeps all its entries both in memory and on disk. Unlike classic queues, it has no mechanism that pages messages out from memory to disk to limit memory usage. This means that a large number of delayed messages can fill up the node's RAM.

This also matters when you restart RabbitMQ, because it has to load the delayed messages table from disk, which can take a while if there are a lot of messages. By default, RabbitMQ waits 30 seconds for the table to load and, after that, restarts the node.

Thousands of messages scheduled for the exact same timestamp

Some more implementation details lead us to more edge cases. The key of the delayed messages table is made up of the timestamp at which the message should trigger and the exchange name. We've seen cases where thousands of messages were scheduled for the exact same timestamp, which means that they were all stored under the same key. This is not a scenario Mnesia is optimised for, so loading such a table at plugin startup is even more expensive. As a workaround, you can add a bit of salt (a small random offset) to the delay, so that the messages are delivered slightly before or after the scheduled time. When you schedule a few days in advance, a discrepancy of a few seconds or milliseconds does not matter anyway.
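The salting workaround could look something like the sketch below, which spreads the delay by a random offset before it is sent with the message. The function name and the default window of five seconds either way are assumptions chosen for illustration; pick a window that suits how precise your schedule needs to be.

```python
import random
from typing import Optional


def jittered_delay_ms(
    delay_ms: int,
    max_jitter_ms: int = 5000,
    rng: Optional[random.Random] = None,
) -> int:
    """Return delay_ms shifted by a random offset in [-max_jitter_ms, +max_jitter_ms].

    The result never goes below zero, so very short delays are only
    ever pushed later, not into the past.
    """
    rng = rng if rng is not None else random.Random()
    offset = rng.randint(-max_jitter_ms, max_jitter_ms)
    return max(0, delay_ms + offset)


if __name__ == "__main__":
    one_day_ms = 24 * 60 * 60 * 1000
    # Each call is likely to produce a slightly different delay around one day.
    print([jittered_delay_ms(one_day_ms) for _ in range(3)])
```

With millisecond granularity and a window of a few seconds, messages that would otherwise pile up under one key are spread over thousands of distinct timestamps.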

Keep the number of delayed messages to a reasonably low number

If you are in a situation where a lot of delayed messages have accumulated, you can temporarily set the `mnesia_table_loading_retry_timeout` config option to a sufficiently high value to allow RabbitMQ to start up properly. However, the best practice is to keep the number of delayed messages reasonably low at all times.
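As a rough illustration, the setting could be raised in `rabbitmq.conf` as follows. The value is in milliseconds, and the five minutes used here is only an assumed example, not a recommendation; choose a value based on how long your table actually takes to load, and revert it once the backlog is gone.

```ini
# rabbitmq.conf
# Illustrative value only: wait up to 5 minutes for Mnesia tables to load.
mnesia_table_loading_retry_timeout = 300000
```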

Max delay time

In Erlang, a timer can be set to at most (2^32)-1 milliseconds, which is around 49 days. Messages with a delay longer than this maximum get delivered immediately.
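To make the limit concrete, the sketch below computes the maximum in days and rejects delays above it on the publisher side, before they reach the exchange. The names `MAX_DELAY_MS` and `check_delay_ms` are hypothetical, and raising an error is just one possible policy; you could equally split a long delay into several shorter hops.

```python
MAX_DELAY_MS = 2**32 - 1  # 4294967295 milliseconds

MS_PER_DAY = 24 * 60 * 60 * 1000


def max_delay_days() -> float:
    """The maximum delay expressed in days (a little under 50)."""
    return MAX_DELAY_MS / MS_PER_DAY


def check_delay_ms(delay_ms: int) -> int:
    """Return delay_ms unchanged, or raise if it is outside the supported range."""
    if not 0 <= delay_ms <= MAX_DELAY_MS:
        raise ValueError(
            f"delay of {delay_ms} ms is outside 0..{MAX_DELAY_MS} ms; "
            "a longer delay would be delivered immediately"
        )
    return delay_ms


if __name__ == "__main__":
    print(f"max delay: {max_delay_days():.2f} days")
    check_delay_ms(30 * MS_PER_DAY)  # 30 days is within the limit
```

Failing loudly in the publisher is usually easier to debug than a message that silently arrives weeks early.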

Email us at support@cloudamqp.com if you have any questions!