Queues in RabbitMQ are great! They anchor the communication between producers and consumers. Replicated queues even orchestrate communication with reliability and data safety. However, there are scenarios where queues fall flat or crawl on their knees. What scenarios?
Queues are limited in the following scenarios:
- They can deliver the same message to multiple consumers only by binding a dedicated queue for each consumer. Clearly, this could create a scalability problem.
- They erase read messages, making it impossible to re-read (replay) them or grab a specific message in the queue.
- They perform poorly when dealing with millions of messages because they are optimized to gravitate toward an empty state.
The RabbitMQ team introduced Streams in RabbitMQ 3.9 to mitigate the above-listed challenges. But what are RabbitMQ Streams? This article will explore Streams in RabbitMQ, and how they solve the problems described above. The subsequent post in this series will also look at how to get started with Streams.
What are RabbitMQ Streams?
RabbitMQ Streams basically perform the same tasks as queues in that they buffer messages from producers that are read by consumers. However, Streams differ from queues in two ways:
- How producers write messages to them
- And how consumers read messages from them
Under the hood, Streams model an immutable, append-only log. In this context, this means messages written to a Stream can't be erased by reading them; they can only be read. A more scholarly description would be to call this behavior of Streams “non-destructive consumer semantics”.
To read messages from a Stream in RabbitMQ, one or more consumers subscribe to it and read the same message as many times as they want. Additionally, Streams are always persistent and replicated.
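The append-only, non-destructive behavior described above can be illustrated with a toy model. This is a minimal in-memory sketch, not how RabbitMQ implements Streams; the `StreamLog` class and its method names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class StreamLog:
    """Toy in-memory stand-in for a Stream: messages are only ever appended."""
    _entries: list = field(default_factory=list)

    def append(self, message: str) -> int:
        """Add a message at the end of the log and return its position."""
        self._entries.append(message)
        return len(self._entries) - 1

    def read_from(self, position: int = 0) -> list:
        """Return a copy of every message from `position` onward.

        Reading never removes anything from the log.
        """
        return list(self._entries[position:])

    def __len__(self) -> int:
        return len(self._entries)


log = StreamLog()
for body in ("order-created", "order-paid", "order-shipped"):
    log.append(body)

first_read = log.read_from(0)
second_read = log.read_from(0)  # the same consumer, or another one, reading again
```

Both reads return the same three messages, and the log still holds all of them afterwards. With a classic queue, the first read would have consumed them.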
As with queues, consumers can talk to a Stream through AMQP-based clients and, by extension, over AMQP. Alternatively, consumers can connect to a Stream via the dedicated binary stream protocol, which allows faster message flow when working with RabbitMQ Streams.
Together, these characteristics make Streams in RabbitMQ a dramatic shift from queues. Streams weren’t created to replace queues but to complement them. They open up new RabbitMQ use cases, such as the scenarios identified earlier.
Let’s explore these use cases a little bit deeper.
When to Use RabbitMQ Streams
The use cases where streams shine include:
- Fan-out architectures: Where many consumers need to read the same message.
- Replay & time-travel: Where consumers need to read and reread the same message or start reading from any point in the stream.
- Large Volumes of Messages: Streams are great for use cases where large volumes of messages need to be persisted.
- High Throughput: RabbitMQ Streams can process more messages per second than queues.
Fan-out Architectures
A fan-out architecture is where multiple consumers read the same message. As mentioned earlier, implementing this sort of architecture with queues isn’t optimal. Having to add a queue for every added consumer is resource intensive, which gets worse when dealing with queues that need to persist data.
Streams in RabbitMQ make implementing fan-out architectures a breeze. Because consumers read messages from a Stream in a non-destructive manner, a message will always be there for the next consumer to read. In essence, to implement a fan-out architecture, just declare a RabbitMQ Stream and attach as many consumers as needed.
The image below depicts what implementing a fan-out with a Stream would look like.
Conversely, trying to achieve the same thing with queues would look like what’s shown in the image below.
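The difference between the two fan-out approaches can be sketched in a few lines. This is a simplified in-memory model under stated assumptions: the consumer names, the `read_next` helper, and the data structures are hypothetical, and a real broker does far more than this.

```python
from collections import deque

consumers = ["billing", "shipping", "analytics"]
messages = ["m1", "m2", "m3", "m4"]

# Queue-style fan-out: one dedicated queue per consumer,
# so every message is stored once per consumer.
queues = {name: deque() for name in consumers}
for msg in messages:
    for queue in queues.values():
        queue.append(msg)
copies_in_queues = sum(len(queue) for queue in queues.values())

# Stream-style fan-out: one shared log; each consumer only
# remembers how far it has read.
stream = list(messages)
positions = {name: 0 for name in consumers}


def read_next(name):
    """Return the next unread message for `name` without removing it from the log."""
    pos = positions[name]
    if pos >= len(stream):
        return None
    positions[name] = pos + 1
    return stream[pos]


received = {name: [] for name in consumers}
for name in consumers:
    while (msg := read_next(name)) is not None:
        received[name].append(msg)
```

In this model, three consumers and four messages mean twelve stored copies with dedicated queues, but only four entries in the shared log. Every consumer still receives all four messages.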
Replay & Time Travel
RabbitMQ Streams are also fit for use cases where a consumer needs to re-read the same message, something that isn't possible with queues. Aside from re-reading messages, it is also possible to start consuming messages from any point in the Stream.
This easy replay and time-travel feature of Streams is made possible with offsets. Every message in a Stream is associated with an offset, and offsets are to Streams what indexes are to arrays or keys to hash maps. To start consuming messages from a specific point in a Stream, just specify an offset when the consumer subscribes.
For example, this is what messages and their corresponding offsets would look like in a given Stream.
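As a rough illustration of subscribing at a chosen offset over AMQP, the sketch below passes the `x-stream-offset` consumer argument, which accepts values such as `"first"`, `"last"`, `"next"`, or a numeric offset. It assumes `channel` behaves like a channel from the pika Python client; the function name `consume_from`, the stream name `events`, and the prefetch value are illustrative choices, not taken from the article.

```python
def handle(channel, method, properties, body):
    """pika-style callback: show the message, then acknowledge it."""
    print(body)
    channel.basic_ack(delivery_tag=method.delivery_tag)


def consume_from(channel, stream_name, offset, on_message, prefetch=100):
    """Subscribe to a stream starting at `offset`.

    `channel` is assumed to be a pika channel (or anything with the same
    basic_qos / basic_consume methods).
    """
    # Over AMQP, stream consumers need a prefetch limit and manual acks.
    channel.basic_qos(prefetch_count=prefetch)
    return channel.basic_consume(
        queue=stream_name,
        on_message_callback=on_message,
        auto_ack=False,
        arguments={"x-stream-offset": offset},
    )


# With a real broker this would look roughly like (not run here):
#   import pika
#   connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
#   channel = connection.channel()
#   consume_from(channel, "events", 2, handle)   # start at offset 2
#   channel.start_consuming()
```

Passing `"first"` instead of a number would replay the Stream from its oldest retained message, while `"next"` would only deliver messages published after the consumer subscribes.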
Large Volumes of Messages
RabbitMQ Streams are perfect when persisting large volumes of messages. Streams shine in this area because they store messages on the file system. As a result, a Stream in RabbitMQ could grow indefinitely until the host disk space runs out.
As running out of disk space is rarely desirable, RabbitMQ Streams allow setting a maximum log data size. When the upper limit is reached, the oldest messages are discarded, preventing the Stream from consuming the entire disk.
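The size cap can be sketched in two parts. `trim_oldest` is a toy model of the "discard the oldest messages" rule (the real broker truncates whole segment files rather than single messages), and `declare_stream` shows how the limit might be set at declaration time through the `x-max-length-bytes` argument. The function names are hypothetical, and `channel` is assumed to behave like a pika channel.

```python
def trim_oldest(messages, max_bytes):
    """Toy retention rule: drop messages from the front until the total fits."""
    kept = list(messages)
    while kept and sum(len(m) for m in kept) > max_bytes:
        kept.pop(0)  # the oldest message goes first
    return kept


def stream_arguments(max_length_bytes=None):
    """Build the declaration arguments for a stream, optionally size-capped."""
    args = {"x-queue-type": "stream"}
    if max_length_bytes is not None:
        args["x-max-length-bytes"] = max_length_bytes
    return args


def declare_stream(channel, name, max_length_bytes=None):
    """Declare a stream on a pika-like channel; streams must be durable."""
    return channel.queue_declare(
        queue=name,
        durable=True,
        arguments=stream_arguments(max_length_bytes),
    )


retained = trim_oldest([b"aaaa", b"bbbb", b"cccc"], max_bytes=8)
capped = stream_arguments(max_length_bytes=5_000_000_000)  # roughly 5 GB
```

In the toy run, the three 4-byte messages exceed the 8-byte cap, so the oldest one is dropped and the two newest remain.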
High Throughput
If a RabbitMQ use case requires processing high volumes of messages per second, then using a Stream is the best option.
In a talk at RabbitMQ Summit 2021, Arnaud Cogoluègnes, a Staff Software Engineer at VMware, pointed out that quorum queues handle about 40,000 messages per second. Streams, on the other hand, handled around 64,000 messages per second when used with the AMQP protocol, and over 1 million messages per second when used with the native Stream protocol.
This high throughput is a testament to the simplicity of the Stream data structure and the Stream protocol itself. For example, since the Stream protocol doesn’t handle things like routing messages, de-queueing messages, etc., it technically does less work, and this translates to higher performance.
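To put the figures quoted from the talk in proportion, the short calculation below compares them against the quorum queue rate. The rates are the approximate numbers cited above; the 10 million message backlog is a hypothetical value chosen only to make the difference tangible.

```python
# Approximate rates quoted from the talk, in messages per second.
rates = {
    "quorum queue": 40_000,
    "stream over AMQP": 64_000,
    "stream over stream protocol": 1_000_000,
}

baseline = rates["quorum queue"]
speedup = {name: rate / baseline for name, rate in rates.items()}

backlog = 10_000_000  # hypothetical number of messages to work through
seconds_to_drain = {name: backlog / rate for name, rate in rates.items()}

for name in rates:
    print(f"{name}: {speedup[name]:.1f}x, {seconds_to_drain[name]:.2f} s")
```

Under these figures, a Stream over AMQP is about 1.6 times faster than a quorum queue, and the native Stream protocol is about 25 times faster. The hypothetical backlog would take roughly 250 seconds, 156 seconds, and 10 seconds respectively.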
This is a basic understanding of how the streaming and replay features that come with Streams in RabbitMQ 3.9 open up possibilities for new RabbitMQ use cases. They make it easier to implement fan-out architectures and to read and re-read old messages, amongst other things.
The next part of this series dives into how to declare a Stream, and how to publish and consume messages from the declared Stream. These tasks will be demonstrated in two ways, first using a RabbitMQ client library and then with the dedicated Stream protocol plugin.
Ready to start using RabbitMQ in your architecture? CloudAMQP is one of the world’s largest RabbitMQ cloud hosting providers. In addition to RabbitMQ, we also created our in-house message broker, LavinMQ, with a throughput of around 1,000,000 messages/sec.
Email us at email@example.com with any suggestions, questions, or feedback.