Monitoring Tools

CloudAMQP offers various monitoring tools. These tools help you address performance issues promptly, before they impact your business. CloudAMQP monitoring includes diagrams for CPU and memory usage. You can activate alarms that are triggered when a part of the system is heavily used, and you can view the RabbitMQ log stream directly in CloudAMQP.

Alarms

Receive accurate alerts based on performance anomalies in your application. It is possible to activate queue alarms, consumer alarms, connection alarms, CPU alarms and memory alarms.

Alarms can be sent to email addresses or as push notifications to webhooks. When an alarm is triggered, an alert is sent to everyone specified in the notifications list. You need to set up a notification before any of the alarms will work correctly. More information about the payload sent when using a webhook can be found in the section "Notifications payload - webhooks".

NOTE: The admin user must have access to all vhosts on the server for the queue and consumer alarms to work correctly.

Queue Alarms

Queue alarms can be activated for all instances.

Queue alarms can be triggered to send notifications when the number of messages in a queue reaches a certain threshold for a given amount of time.

The alarm triggers when a queue matches the given regexp and has held more messages than the value threshold (more than x messages in the queue) for longer than the given number of seconds.

Example: The regexp .* will match all your queues. A regexp like ^myqueue$ would match exactly the queue named "myqueue". Use http://rubular.com/ to test your regex.
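As a quick local alternative to an online tester, the following minimal Python sketch checks which queue names a pattern would select. The function name matching_queues and the sample queue names are hypothetical, and the sketch assumes the alarm uses an unanchored search, which is what the ^myqueue$ example above suggests. Python and Ruby regex syntax differ in some details, so treat this as a sanity check for simple patterns only.

```python
import re


def matching_queues(pattern: str, queue_names: list[str]) -> list[str]:
    """Return the queue names that contain a match for the pattern.

    Assumption: the alarm applies the regexp as an unanchored search,
    so "myqueue" also selects "myqueue.retry" unless ^ and $ are used.
    """
    compiled = re.compile(pattern)
    return [name for name in queue_names if compiled.search(name)]


# Hypothetical queue names, for illustration only.
queues = ["myqueue", "myqueue.retry", "orders"]

print(matching_queues(".*", queues))         # ['myqueue', 'myqueue.retry', 'orders']
print(matching_queues("^myqueue$", queues))  # ['myqueue']
print(matching_queues("myqueue", queues))    # ['myqueue', 'myqueue.retry']
```

The last call shows why anchoring with ^ and $ matters when only one specific queue should trigger the alarm.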

Consumers Alarms

Consumers alarms can be activated for all instances.

Consumers alarms can be triggered to send notifications when the number of consumers for a queue is less than or equal to a given number of consumers, for a given amount of time.

Connection Alarms

Connection alarms can be triggered to send notifications when the number of connections is greater than or equal to a given number of connections, for a given amount of time.
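Queue, consumer and connection alarms share one shape: a condition on a value has to hold for a given amount of time before a notification is sent. The sketch below illustrates that idea only; it is not CloudAMQP's implementation. The class SustainedAlarm, its fields and the sample numbers are hypothetical, and it assumes that the timer resets as soon as the condition stops holding.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SustainedAlarm:
    """Fires once a condition has held continuously for time_until_fire seconds."""

    condition: Callable[[int], bool]  # e.g. lambda n: n >= 100 for connections
    time_until_fire: float            # seconds the condition must hold
    breached_since: Optional[float] = None

    def observe(self, value: int, now: float) -> bool:
        """Record one sample taken at time `now` and report whether the alarm fires."""
        if not self.condition(value):
            # Assumption: any sample that is back to normal resets the timer.
            self.breached_since = None
            return False
        if self.breached_since is None:
            self.breached_since = now
        return now - self.breached_since >= self.time_until_fire


# Connection alarm: 100 or more connections for 60 seconds (made-up numbers).
alarm = SustainedAlarm(condition=lambda n: n >= 100, time_until_fire=60)
samples = [(0, 120), (30, 130), (60, 110), (90, 80), (120, 150)]  # (seconds, connections)
fired = [alarm.observe(value, now) for now, value in samples]
print(fired)  # [False, False, True, False, False]
```

A consumer alarm would use a condition such as lambda n: n <= 0, and a queue alarm one such as lambda n: n > 1000, with the rest unchanged.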

CPU Alarms

CPU Alarms are only available for dedicated instances.

When the CPU alarm is enabled, you will receive a notification when you are using more than 80% of the available CPU for more than 15 minutes. One alarm email will be sent to you and your co-workers every 6 hours until the problem is solved. It is recommended to enable CPU alarms. They can be enabled through the Control Panel.

Memory Alarms

Memory Alarms are only available for dedicated instances.

When the memory alarm is enabled, you will receive a notification when you are using more than 90% of the available memory for more than 15 minutes. One alarm email will be sent to you and your co-workers every 6 hours until the problem is solved. It is recommended to enable memory alarms. They can be enabled through the Control Panel.

Server metrics

Server metrics help you measure the performance of your server. CloudAMQP shows CPU usage monitoring and memory usage monitoring.

CPU Usage

CPU usage refers to how much work your processor is doing.

  • I/O Wait:

    Shows the percentage of time the CPU spends waiting for an I/O (input/output) operation to complete, that is, the time the CPU has to wait on the disk.

    If this is high, consider whether more messages can be published as transient instead of persistent, or keep your queues short so that messages don't have to be written to disk. You can also contact us at support@cloudamqp.com to discuss other solutions.

  • User time:

    Shows the percentage of time your program spends executing instructions in the CPU. In this case, the time the CPU spent running RabbitMQ.

    If this is high, you are probably at the limit of what your server can handle. You should consider upgrading before lack of CPU power becomes a serious issue.

  • System time:

    Shows the percentage of time the CPU spent running OS tasks.

  • Steal time:

    Shows the percentage of CPU time "stolen" by the virtualization system, that is, time the virtual CPU spends waiting for a real CPU. If this is high, it means that you are using too much CPU power. This can seriously impact the performance of your server. You should probably upgrade to a larger instance.
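To make the four categories concrete, here is a minimal sketch of how such percentages can be derived on a Linux host from two samples of the aggregate "cpu" line in /proc/stat. This is an illustration of where the numbers come from on Linux, not a description of how CloudAMQP computes its diagrams. The helper names and the sample counter values are hypothetical.

```python
# Field order of the aggregate "cpu" line in /proc/stat on Linux.
FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line: str) -> dict[str, int]:
    """Parse a line such as 'cpu  1000 0 500 8000 300 0 0 200' into named counters."""
    parts = line.split()
    return dict(zip(FIELDS, (int(p) for p in parts[1:1 + len(FIELDS)])))


def cpu_percentages(before: dict[str, int], after: dict[str, int]) -> dict[str, float]:
    """Share of CPU time spent in each state between two samples, in percent."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Two made-up samples; on a real host, read the first line of /proc/stat twice.
before = parse_cpu_line("cpu  1000 0 500 8000 300 0 0 200")
after = parse_cpu_line("cpu  1600 0 700 8900 400 0 0 400")
pct = cpu_percentages(before, after)

print(pct["user"], pct["system"], pct["iowait"], pct["steal"])  # 30.0 10.0 5.0 10.0
```

The counters are cumulative since boot, which is why the sketch works on the difference between two samples rather than on a single reading.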

Memory Usage

  • Used: Percentage of used memory.
  • Free: Percentage of free memory.

RabbitMQ Log Stream

RabbitMQ Log Stream shows a live log from RabbitMQ.

Integrated monitoring services

Integrated monitoring services are only available for dedicated instances.

CloudAMQP is integrated with the following log and metric services: CloudWatch, DataDog, Librato, Loggly, Papertrail and Logentries. Read more about monitoring services here.

Event stream

Event stream is only available for dedicated instances.

The event stream allows you to see the latest 1000 events from your RabbitMQ cluster. New events will be added to the collection in real time.

The event stream can be useful when you need to gain insight into what is happening in your cluster. It is particularly useful for debugging high CPU usage, for instance when connections or channels are rapidly opened and closed, or when a shovel setup is not working.

This is a new feature still in active development and we welcome feedback on how to improve the experience.

Notifications payload - webhooks

Alarm notifications can be received via webhooks. This section describes the content of the payload that is sent to you in each POST.

  • type: Type of the alarm, one of: queue, consumer, cpu, memory, disk, connection, netsplit_join and netsplit_split
  • appname: Name of the instance that triggered the alarm
  • hostname: Hostname of the instance that triggered the alarm
  • threshold: Value threshold specified for the alarm
  • vhost_regexp: Regexp for the vhost
  • regexp: Regexp of your specified alarm (e.g. Queue regexp)
  • time_until_fire: Time threshold specified for the alarm
  • options: May include extra information about the alarm, such as message_type (messages_unacknowledge, messages_ready, messages)
  • account_id: Account id of the instance that triggered the alarm
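A receiving endpoint for these notifications could look like the sketch below, which uses only the Python standard library. It assumes the POST body is JSON containing the fields listed above; the encoding is not stated here, so verify it against a real notification. The names summarize_alarm, AlarmHandler and serve, the port, and the response codes are hypothetical choices for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_alarm(payload: dict) -> str:
    """Build a one-line summary from the documented payload fields."""
    kind = payload.get("type", "unknown")
    host = payload.get("hostname", "unknown host")
    threshold = payload.get("threshold")
    seconds = payload.get("time_until_fire")
    return f"{kind} alarm on {host} (threshold={threshold}, time_until_fire={seconds})"


class AlarmHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            # Assumption: the body is a JSON object.
            payload = json.loads(self.rfile.read(length))
        except ValueError:
            payload = None
        if not isinstance(payload, dict):
            self.send_response(400)
            self.end_headers()
            return
        print(summarize_alarm(payload))
        self.send_response(204)  # accepted, no response body
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Listen for alarm notifications until interrupted."""
    HTTPServer(("", port), AlarmHandler).serve_forever()


# Made-up payload, for illustration only.
example = {"type": "queue", "hostname": "example-host", "threshold": 100, "time_until_fire": 60}
print(summarize_alarm(example))  # queue alarm on example-host (threshold=100, time_until_fire=60)
```

Calling serve() starts the listener. In practice the handler would forward the summary to a chat or paging tool instead of printing it.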