Softonic is the world's largest software and app discovery destination and one of the world's most highly trafficked websites. You have probably landed on their website when you needed to download something, and you are not the only one: more than 100 million users reach Softonic every month. It’s an app guide that helps you discover the best applications for your device, offering reviews, news, articles and free downloads.
CloudAMQP provides hosted RabbitMQ clusters in all the biggest data centers around the world, and Softonic is one of our many customers. We met up with Riccardo Piccoli, a developer at Softonic, at the RabbitMQ Summit 2018 in London, where he kindly shared Softonic’s customer story with us.
This article is divided into two parts. The first part is an overview of the system, which shows a simple RabbitMQ use case in an event-based architecture. The second part is a deep-dive into the internal architecture at Softonic: the plugins they are using and examples of the events they are sending.
A simple RabbitMQ use case
Users can upload files to Softonic. Before a file is distributed to other users, it is scanned for viruses and information about it is collected. The new binary data is first persisted within a dedicated service, and a notification about the upload is sent to an event bus. Other services collect this information, which in the end is added to the website. The user is notified as soon as the upload has succeeded, and a scanning event is simply placed on the event bus for other services to handle. An event bus, also called a message queue, lets web servers respond to requests quickly instead of performing a resource-heavy process on the spot and keeping the user waiting.
The scanning process is one of those services. The virus scanning application takes a message off the event bus, such as a “ScanFile” command, and starts processing the file. Meanwhile, other users can keep uploading new files to Softonic, and the processing tasks simply pile up in the queue. Once the consuming application has handled the command, a “FileScanned” event is added back to the event bus.
An architecture like this creates two simple applications and low coupling between the sender and the receiver. Users can still upload files, even if the scanning application is busy or is under maintenance.
- Different events or commands are published to the event bus, e.g., a “ScanFile” command.
- Softonic is using RabbitMQ as an event bus; events or commands are simply added to the queue.
- The consuming application retrieves the event and starts processing it. Some data is stored in the database, and more events might be published back to another event queue (more about this in “Internal Structure of RabbitMQ”).
- The consuming application stores lots of information in a database (MySQL).
When a microservice receives an event, it can update its own business entities, which might lead to more events being published, and that is exactly the case here.
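The flow above can be sketched in a few lines. This is only an illustration under stated assumptions, not Softonic's code: two in-memory queues stand in for RabbitMQ, plain lists stand in for the file service and the MySQL database, and the names handle_upload and scan_worker are hypothetical.

```python
import queue

# In-memory stand-ins (assumptions): a real system would publish to RabbitMQ
# and write to the file service and MySQL instead of local objects.
commands = queue.Queue()   # carries commands such as "ScanFile"
events = queue.Queue()     # carries events such as "FileScanned"
storage = []               # hypothetical stand-in for the file service
database = []              # hypothetical stand-in for MySQL


def handle_upload(file_name):
    """Persist the upload, queue a ScanFile command, and return at once."""
    storage.append(file_name)
    commands.put({"type": "ScanFile", "file": file_name})
    return f"upload of {file_name} accepted"


def scan_worker():
    """Handle one queued command and publish a follow-up event."""
    command = commands.get_nowait()
    if command["type"] == "ScanFile":
        # A real worker would run the virus scan here.
        database.append({"file": command["file"], "scanned": True})
        events.put({"type": "FileScanned", "file": command["file"]})
    return command
```

The upload handler never waits for the scan: it returns as soon as the command is queued, and uploads keep succeeding even if scan_worker is not running at all.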
Internal Structure of RabbitMQ
It’s time for a deep-dive into the internal architecture of RabbitMQ, and into the Softonic Application. Two RabbitMQ concepts need to be described if you are not already familiar with them. Softonic is using the consistent hash exchange plugin and RabbitMQ sharding.
Image description external usage: The Softonic services are built upon Node.js and PHP and communicate with the RabbitMQ event bus, from which information from the services goes via a PHP application to a MySQL event store.
Image description internal usage: The first application fetches data from the MySQL event store and pushes it through consistent hash exchanges in two internal RabbitMQ event buses (where sharded queues are used). From there the information reaches the orchestration layer and an Elasticsearch cluster, where it finally becomes visible to users.
The consistent hash exchange plugin
The consistent hash exchange plugin load-balances messages between queues. Messages sent to the exchange are consistently and evenly distributed across many queues, based on the routing key of the message. The plugin creates a hash of the routing key and spreads the messages out between the queues that have a binding to that exchange. Doing this manually quickly becomes problematic, because the publisher would need to know too much about the number of queues and their bindings.
Note that it’s important to consume from all bound queues when using this plugin. Read more about the consistent hash exchange plugin here.
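To make the idea concrete, here is a minimal sketch of consistent hashing in Python. It illustrates the general technique, not the plugin's actual implementation: the ConsistentHashRing class, the use of MD5, and the number of points per queue are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit integer."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class ConsistentHashRing:
    """Hypothetical stand-in for an exchange with hashed bindings."""

    def __init__(self, points_per_queue: int = 100):
        self.points_per_queue = points_per_queue
        self._ring = []  # sorted list of (hash point, queue name)

    def bind(self, queue_name: str) -> None:
        # Each bound queue owns many points on the ring, which evens out
        # the share of routing keys that every queue receives.
        for i in range(self.points_per_queue):
            bisect.insort(self._ring, (_hash(f"{queue_name}#{i}"), queue_name))

    def route(self, routing_key: str) -> str:
        if not self._ring:
            raise LookupError("no queues bound")
        # Pick the first point at or after the key's hash, wrapping around.
        index = bisect.bisect_left(self._ring, (_hash(routing_key), ""))
        return self._ring[index % len(self._ring)][1]
```

The same routing key always maps to the same queue, and binding an extra queue only moves the keys that the new queue takes over. The publisher only needs to know the exchange, not the queues behind it.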
The RabbitMQ sharding plugin partitions queues automatically for you: once you define an exchange as sharded, the supporting queues are created on every cluster node and messages are sharded across them. RabbitMQ sharding shows one queue to the consumer, but there may be many queues running behind it in the background. The plugin gives you a centralized place to send your messages, plus load balancing across many nodes, by adding queues to the other nodes in the cluster.
Read more about RabbitMQ Sharding here.
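The "one queue in front, many queues behind" idea can be sketched as follows. This is a toy model under stated assumptions, not the plugin itself: the ShardedQueue class, the shard naming scheme, and the CRC32 hash are hypothetical, and attaching each new consumer to the shard with the fewest consumers is assumed here as the balancing rule.

```python
import zlib
from collections import deque


class ShardedQueue:
    """One logical queue name backed by several shard queues."""

    def __init__(self, name, nodes, shards_per_node=2):
        self.name = name
        # One backing queue per (node, shard index), created up front.
        self.shards = {
            f"{name}.{node}.{i}": deque()
            for node in nodes
            for i in range(shards_per_node)
        }
        self.consumers = {shard: 0 for shard in self.shards}

    def publish(self, routing_key, body):
        """Hash the routing key to choose a shard; return the shard name."""
        names = sorted(self.shards)
        shard = names[zlib.crc32(routing_key.encode()) % len(names)]
        self.shards[shard].append(body)
        return shard

    def subscribe(self):
        """Attach a consumer to the shard that has the fewest consumers."""
        shard = min(sorted(self.consumers), key=self.consumers.get)
        self.consumers[shard] += 1
        return shard
```

Publishers and consumers only ever refer to the logical queue; which shard they touch is decided behind the scenes. As with the consistent hash exchange, every shard needs at least one consumer or its messages will sit unprocessed.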
An example sequence of Softonic events and commands
Below is an example of events and commands sent via RabbitMQ. The consistent hash exchange is used: events 1 and 2 end up in the same queue (preserving order), while event 3 does not necessarily end up in that queue. Data is sharded and processed with consistent hashing F(id_program) in order to preserve order by program.
Event 0: Create category “antivirus” (name: “antivirus”)
Event 1: Create program A (name "foobar", category "antivirus", developer "softonic")
Event 2: Create review for program A
Event 3: Create program B (name "foo", category "antivirus", developer "84codes")
Event 4: Update category "antivirus" name to "Antivirus"
In this example, events 0 and 4 need to be processed synchronously, while events 1, 2, and 3 can be processed asynchronously. Event 0 is processed straight away; events 1, 2, and 3 are re-published to the queue so that other sharded consumers can process them.
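The sequence can be replayed with a small sketch. The event field names, the dispatch function, and the shard count are hypothetical, and a plain CRC32 modulo stands in for F(id_program); the real setup uses the consistent hash exchange.

```python
import zlib

# The five events from the example; field names are assumptions.
EVENTS = [
    {"id": 0, "type": "CreateCategory", "category": "antivirus"},
    {"id": 1, "type": "CreateProgram", "id_program": "A"},
    {"id": 2, "type": "CreateReview", "id_program": "A"},
    {"id": 3, "type": "CreateProgram", "id_program": "B"},
    {"id": 4, "type": "UpdateCategory", "category": "antivirus"},
]


def shard_for(id_program, shard_count):
    """Stand-in for F(id_program): same program, same shard."""
    return zlib.crc32(id_program.encode()) % shard_count


def dispatch(events, shard_count=4):
    """Handle category events inline; re-publish program events by shard."""
    handled_inline = []
    shards = {n: [] for n in range(shard_count)}
    for event in events:
        if "id_program" in event:
            shard = shard_for(event["id_program"], shard_count)
            shards[shard].append(event["id"])
        else:
            handled_inline.append(event["id"])
    return handled_inline, shards
```

Because events 1 and 2 share the same id_program, they always land in the same shard in their original order, whichever shard event 3 is assigned to.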
CloudAMQP - Message queuing as a Service
Softonic ran RabbitMQ in-house before moving to the cloud. The biggest reason for choosing CloudAMQP as a provider was the simplicity of setting up RabbitMQ without the hassle of maintaining a RabbitMQ cluster.
CloudAMQP offers many different plans for various kinds of usage, and it’s possible to try out the plan named Little Lemur for free.
The CloudAMQP team is grateful for the chat we got to have with Softonic, and we're truly impressed by your success. We wish you the best of luck. A special thanks to Riccardo for your time at the RabbitMQ Summit 2018; we hope to see you again at the next event.
Chat about RabbitMQ Summit 2018
Riccardo and 130 other people from the RabbitMQ community gathered to build and exchange knowledge and experiences about RabbitMQ, so we also spent some time talking about the RabbitMQ Summit itself. Riccardo liked the high level of the event, and he was really looking forward to Quorum Queues.
RabbitMQ Quorum Queues
Quorum Queues were a hot topic at the RabbitMQ Summit and were presented to the audience in a talk by RabbitMQ core developer Michael Klishin. Quorum Queues are based on the Raft consensus algorithm and will in many respects be an upgrade from RabbitMQ Mirrored Queues. They are being welcomed by both RabbitMQ developers and users, and we will, of course, try them out really soon and get back with a review.