Celery is a great task queue library for Python.
This guide is for Celery v4.1.0.
There are some important settings for Celery users on CloudAMQP, especially for users on shared instances with a limited number of connections and messages per month.
import os

broker_url = os.environ.get('CLOUDAMQP_URL')
broker_pool_limit = 1  # Will decrease connection usage
broker_heartbeat = None  # We're using TCP keep-alive instead
broker_connection_timeout = 30  # May require a long timeout due to Linux DNS timeouts etc.
result_backend = None  # AMQP is not recommended as a result backend as it creates thousands of queues
event_queue_expires = 60  # Will delete all celeryev. queues without consumers after 1 minute
worker_prefetch_multiplier = 1  # Disable prefetching, it causes problems and doesn't help performance
worker_concurrency = 50  # If your tasks are CPU bound, limit to the number of cores, otherwise increase substantially
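As a sketch of how these settings are wired up (the module and app names here are placeholders, not from the original), the lowercase settings above can live in a module that the Celery app loads:

```python
# Assumes the settings above are saved in a module named celeryconfig.py;
# 'tasks' is an illustrative app name.
from celery import Celery

app = Celery('tasks')
app.config_from_object('celeryconfig')
```
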
This will decrease the message rates substantially. Without these flags, Celery will send hundreds of messages per second with various diagnostic and redundant heartbeat messages. Read more about --without-gossip, --without-mingle and --without-heartbeat in the Celery Worker documentation.
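The flags are passed when starting the worker; the app name proj below is a placeholder:

```shell
# Start a worker without the gossip, mingle and heartbeat subsystems
celery -A proj worker --without-gossip --without-mingle --without-heartbeat
```
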
Set the broker_pool_limit to 1 and make sure that your webserver spawns fewer processes/workers than your current CloudAMQP plan allows in concurrent connections.
Each process creates its own connection pool, so the formula for calculating your concurrent connection need is:
broker_pool_limit * (web dynos * web workers + worker dynos * worker concurrency)
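For example, with hypothetical numbers (2 web dynos running 3 gunicorn workers each, plus 2 worker dynos with a concurrency of 10), the formula works out as follows:

```python
def concurrent_connections(broker_pool_limit, web_dynos, web_workers,
                           worker_dynos, worker_concurrency):
    """Estimate concurrent connection need per the formula above."""
    return broker_pool_limit * (web_dynos * web_workers
                                + worker_dynos * worker_concurrency)

# 1 * (2 * 3 + 2 * 10) = 26 concurrent connections
print(concurrent_connections(1, 2, 3, 2, 10))
```
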
So make sure that you limit the number of gunicorn web workers with the -w flag, and the worker concurrency with the worker_concurrency setting.
We recommend disabling heartbeats. We've enabled low TCP keep-alive intervals on all our RabbitMQ servers, so stale connections are detected on the TCP level instead of in the application layer.
broker_heartbeat = None
We recommend setting broker_connection_timeout to 30. It is the timeout in seconds before Celery gives up establishing a connection to the CloudAMQP server.
broker_connection_timeout = 30
By default, Celery is configured not to consume task results.
If you do want the task results and you want to use RabbitMQ to store them, then use result_backend = 'rpc'.
Read more about different backends here: http://docs.celeryproject.org/en/latest/userguide/configuration.html#result-backend
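A minimal sketch of the rpc backend (the task, app name and environment variable are illustrative; running it requires a reachable broker and a running worker):

```python
import os
from celery import Celery

# 'rpc://' sends results back as AMQP messages
# instead of creating one queue per task
app = Celery('tasks',
             broker=os.environ.get('CLOUDAMQP_URL'),
             backend='rpc://')

@app.task
def add(x, y):
    return x + y

# With a worker running, add.delay(2, 3).get() would fetch the result
```
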
NOTE: We highly advise against using the deprecated result_backend = 'amqp' since it might end up consuming all memory on your instance.
By default, Celery doesn't send task events, but if you want to use a monitoring tool for Celery, like Flower, they must be enabled.
Enable them with these settings:
worker_send_task_events = True
event_queue_expires = 60
With event_queue_expires set to 60 seconds, all celeryev. queues will be deleted after 1 minute if they do not have any consumers.
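Task events can also be turned on from the command line when starting the worker, with the -E flag (proj is a placeholder app name):

```shell
# Equivalent to worker_send_task_events = True
celery -A proj worker -E
```
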
Celery's way of doing prefetching (dynamically and globally) is not very efficient, and actually causes problems quite often. We recommend limiting prefetching to 1, letting each worker get only one message at a time. This only poses a problem if both the latency between the broker and the worker and the throughput are very high; only then may you want to increase it. Read more in Celery's prefetch documentation.
This is how many processes/threads/green-threads should process tasks at the same time. If your workload is CPU bound, limit it to the number of cores you have (this is the default); more will only slightly decrease performance. But if you're doing I/O, e.g. making outgoing HTTP requests or talking to a database or any other external service, then you can increase it a lot and gain a lot of performance; 200-500 is not unheard of.
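Concurrency can also be set on the command line when starting the worker; the numbers and pool choice here are illustrative:

```shell
# CPU-bound workload: match the number of cores (the default)
celery -A proj worker --concurrency=8

# I/O-bound workload: much higher concurrency with a green-thread pool
celery -A proj worker --pool=eventlet --concurrency=200
```
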
With the setting task_always_eager = True, all tasks will be executed locally by blocking until the task returns. That is, tasks will be executed locally instead of being sent to the queue.
This forces all calls to .delay()/.apply_async() that would normally get delegated to the worker to instead execute synchronously.
You can use this setting when migrating between dedicated plans:
heroku addons:add cloudamqp:bunny
Flower is a web-based tool for real-time monitoring and administration of Celery clusters (it is still under development). From Flower it is possible to get an overview of task progress and history, show task details, and see graphs and statistics about the tasks. You can view scheduled tasks, revoke or terminate tasks and much more.
Flower uses the RabbitMQ Management Plugin to get info about queues. To get the broker tab working with CloudAMQP you need to set the --broker_api flag to the URL of the RabbitMQ HTTP API, including user credentials; see the example below. The broker API port should be 443.
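A sketch of starting Flower with --broker_api pointed at the management API over HTTPS (the user, password, hostname and vhost below are placeholders for your CloudAMQP credentials):

```shell
celery -A proj flower \
  --broker_api=https://user:password@hostname.cloudamqp.com/api/vhost
```
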