[5897] 10 Dec 17:00:42.661 # Server started, Redis version 2.6.14
[5897] 10 Dec 17:00:42.661 * The server is now ready to accept connections on port 6379
# By default Redis does not run as a daemon. Use 'yes' if you need it.
# Note that Redis will write a pid file in /var/run/redis.pid when daemonized.
daemonize yes
4). Test Redis by starting a client instance:
[root@Architect redis-2.6.14]# redis-cli
redis> set name songbin
OK
redis> get name
"songbin"
Edit tasks.py in the current directory:
# vim tasks.py
#!/usr/bin/env python
# File: tasks.py
#
from time import sleep
from celery import Celery

# Redis database 0 stores task results, database 1 carries the task messages.
backend = 'redis://127.0.0.1:6379/0'
broker = 'redis://127.0.0.1:6379/1'
app = Celery('tasks', backend=backend, broker=broker)

@app.task
def add(x, y):
    sleep(10)  # simulate a slow task
    return x + y
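The two URLs above differ only in the trailing database number. As a small illustration (not part of Celery), the sketch below splits such a URL with the standard library; `describe_redis_url` is a hypothetical helper name, and treating an empty path as database 0 is an assumption made here for simplicity.

```python
from urllib.parse import urlparse

def describe_redis_url(url):
    """Split a redis:// URL into host, port and database number."""
    parts = urlparse(url)
    # The path component is "/<db>"; an empty path is treated as database 0 here.
    db = int(parts.path.lstrip('/') or 0)
    return {'host': parts.hostname, 'port': parts.port, 'db': db}

backend_info = describe_redis_url('redis://127.0.0.1:6379/0')
broker_info = describe_redis_url('redis://127.0.0.1:6379/1')
print(backend_info)
print(broker_info)
```

Keeping results and messages in separate databases of the same Redis server means the two sets of keys cannot collide.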
3. Run the Celery worker
$ celery -A tasks worker --loglevel=info
Running a worker with superuser privileges when the worker accepts messages serialized with pickle is a very bad idea!
If you really want to continue then you have to set the C_FORCE_ROOT environment variable (but please think about this before you do).
User information: uid=0 euid=0 gid=0 egid=0
This message does not mean that Redis failed to start. It means the worker refused to run as root while accepting pickle-serialized messages. Workaround:
[root]$ export C_FORCE_ROOT="true"
[root]$ celery -A tasks worker --loglevel=debug
celery -A tasks worker --loglevel=info
/usr/local/python2.7.3/lib/python2.7/site-packages/celery/platforms.py:766: RuntimeWarning: You are running the worker with superuser privileges, which is absolutely not recommended!
Please specify a different user using the -u option.
User information: uid=0 euid=0 gid=0 egid=0
uid=uid, euid=euid, gid=gid, egid=egid,
[2014-12-10 23:02:26,993: WARNING/MainProcess] /usr/local/python2.7.3/lib/python2.7/site-packages/celery/apps/worker.py:161: CDeprecationWarning: Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers the ability to execute any command. It's important to secure your broker from unauthorized access when using pickle, so we think that enabling pickle should require a deliberate action and not be the default choice.
If you depend on pickle then you should set a setting to disable this warning and to be sure that everything will continue working when you upgrade to Celery 3.2::
[2014-12-10 23:02:27,516: INFO/MainProcess] Connected to redis://127.0.0.1:6379/1
[2014-12-10 23:02:27,524: INFO/MainProcess] mingle: searching for neighbors
[2014-12-10 23:02:29,074: INFO/MainProcess] mingle: all alone
[2014-12-10 23:02:29,080: WARNING/MainProcess] celery@ltv_13 ready.
This indicates that the worker started successfully.
If the following warning appears:
[2014-12-11 16:04:08,223: WARNING/MainProcess] /usr/local/python2.7.3/lib/python2.7/site-packages/celery/apps/worker.py:161: CDeprecationWarning: Starting from version 3.2 Celery will refuse to accept pickle by default.
The pickle serializer is a security concern as it may give attackers the ability to execute any command. It's important to secure your broker from unauthorized access when using pickle, so we think that enabling pickle should require a deliberate action and not be the default choice.
If you depend on pickle then you should set a setting to disable this warning and to be sure that everything will continue working when you upgrade to Celery 3.2::
    CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml']
You must only enable the serializers that you will actually use.
warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED))
then the accepted serializers need to be configured in tasks.py:
# vim tasks.py
#!/usr/bin/env python
# File: tasks.py
#
from time import sleep
from celery import Celery

backend = 'redis://127.0.0.1:6379/0'
broker = 'redis://127.0.0.1:6379/1'
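The listing above stops before the actual configuration. As a hedged sketch of what such a configuration might contain, the dictionary below uses the Celery 3.1-style setting names (CELERY_ACCEPT_CONTENT is the one quoted in the warning). The name `SERIALIZER_SETTINGS` and the choice of JSON only are assumptions for illustration, not the author's original code.

```python
# Hypothetical settings dict; the name SERIALIZER_SETTINGS is illustrative only.
# The keys are Celery 3.1-style setting names; CELERY_ACCEPT_CONTENT is the one
# quoted in the deprecation warning.
SERIALIZER_SETTINGS = {
    'CELERY_ACCEPT_CONTENT': ['json'],   # list only the serializers actually used
    'CELERY_TASK_SERIALIZER': 'json',    # how task messages are encoded
    'CELERY_RESULT_SERIALIZER': 'json',  # how results are encoded in the backend
}

# The task and result serializers must themselves be accepted content types.
for key in ('CELERY_TASK_SERIALIZER', 'CELERY_RESULT_SERIALIZER'):
    assert SERIALIZER_SETTINGS[key] in SERIALIZER_SETTINGS['CELERY_ACCEPT_CONTENT']
print(SERIALIZER_SETTINGS)
```

In tasks.py these values could be applied after the app is created, for example with app.conf.update(SERIALIZER_SETTINGS).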
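Test code:
[root]$ cat test.py
from tasks import add

if __name__ == '__main__':
    for i in range(100):
        for j in range(100):
            kk = add.delay(i, j)  # send the task message to the broker
            kk.ready()            # returns immediately: True once the result is stored
            kk.get()              # blocks until the worker has finished the task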
[root]$ python ./test.py
The Celery worker log shows the messages being consumed:
[2014-12-11 15:43:04,136: INFO/MainProcess] Received task: tasks.add[a0d1facd-39e8-44f6-9dd9-8980dbfca41b]
[2014-12-11 15:43:14,138: INFO/MainProcess] Task tasks.add[a0d1facd-39e8-44f6-9dd9-8980dbfca41b] succeeded in 10.0008870028s: 0
[2014-12-11 15:43:14,638: INFO/MainProcess] Received task: tasks.add[6357f049-ae5a-4690-8ac7-2ff91b9d21c9]
[2014-12-11 15:43:24,639: INFO/MainProcess] Task tasks.add[6357f049-ae5a-4690-8ac7-2ff91b9d21c9] succeeded in 10.0008919984s: 1
[2014-12-11 15:43:25,140: INFO/MainProcess] Received task: tasks.add[787039c5-bf6d-49e3-980b-912c0b743351]
[2014-12-11 15:43:35,141: INFO/MainProcess] Task tasks.add[787039c5-bf6d-49e3-980b-912c0b743351] succeeded in 10.0006869994s: 2
[2014-12-11 15:43:35,642: INFO/MainProcess] Received task: tasks.add[71826656-1b25-425d-884d-423d642ad6fe]
[2014-12-11 15:43:45,643: INFO/MainProcess] Task tasks.add[71826656-1b25-425d-884d-423d642ad6fe] succeeded in 10.000723999s: 3
[2014-12-11 15:43:46,144: INFO/MainProcess] Received task: tasks.add[eea8cbb3-c526-4c27-94b2-2cb1446b78f1]
[2014-12-11 15:43:56,145: INFO/MainProcess] Task tasks.add[eea8cbb3-c526-4c27-94b2-2cb1446b78f1] succeeded in 10.0006980002s: 4
[2014-12-11 15:43:56,646: INFO/MainProcess] Received task: tasks.add[b04058d7-9ac1-4979-a4ce-eb262c9ad2a4]
[2014-12-11 15:44:06,647: INFO/MainProcess] Task tasks.add[b04058d7-9ac1-4979-a4ce-eb262c9ad2a4] succeeded in 10.0008420013s: 5
[2014-12-11 15:44:07,148: INFO/MainProcess] Received task: tasks.add[ca5ebf48-591b-43dc-b542-a36a5bdc66b5]
[2014-12-11 15:44:17,149: INFO/MainProcess] Task tasks.add[ca5ebf48-591b-43dc-b542-a36a5bdc66b5] succeeded in 10.0005079992s: 6
[2014-12-11 15:44:17,649: INFO/MainProcess] Received task: tasks.add[0ec250b1-07b5-4df6-a06e-94ad232d5e73]
[2014-12-11 15:44:27,650: INFO/MainProcess] Task tasks.add[0ec250b1-07b5-4df6-a06e-94ad232d5e73] succeeded in 10.0003799982s: 7
...
This document describes Celery’s uniform “Calling API” used by task instances and the canvas. The API defines a standard set of execution options, as well as three methods:
. apply_async(args[, kwargs[, …]])
  Sends a task message.
. delay(*args, **kwargs)
  Shortcut to send a task message, but does not support execution options.
. calling (__call__)
  Applying an object supporting the calling API (e.g. add(2, 2)) means that the task will be executed in the current process, and not by a worker (a message will not be sent).
2. Quick Cheat Sheet
. T.delay(arg, kwarg=value)
  always a shortcut to .apply_async.
. T.apply_async((arg, ), {'kwarg': value})
. T.apply_async(countdown=10)
  executes 10 seconds from now.
. T.apply_async(eta=now + timedelta(seconds=10))
  executes 10 seconds from now, specified using eta.
. T.apply_async(countdown=60, expires=120)
  executes in one minute from now, but expires after 2 minutes.
. T.apply_async(expires=now + timedelta(days=2))
  expires in 2 days, set using datetime.
3. Example

The delay() method is convenient as it looks like calling a regular function:
task.delay(arg1, arg2, kwarg1='x', kwarg2='y')
Using apply_async() instead you have to write:
task.apply_async(args=[arg1, arg2], kwargs={'kwarg1': 'x', 'kwarg2': 'y'})
So delay is clearly convenient, but if you want to set additional execution options you have to use apply_async. The rest of this document will go into the task execution options in detail. All examples use a task called add, returning the sum of two arguments:
@app.task
def add(x, y):
    return x + y
Tip: If the task is not registered in the current process you can use send_task() to call the task by name instead.
There’s another way… You will learn more about this later while reading about the Canvas, but subtasks are objects used to pass around the signature of a task invocation (for example to send it over the network), and they also support the Calling API:
task.s(arg1, arg2, kwarg1='x', kwarg2='y').apply_async()
4. Linking (callbacks/errbacks)
Celery supports linking tasks together so that one task follows another. The callback task will be applied with the result of the parent task as a partial argument:
add.apply_async((2, 2), link=add.s(16))
What is s? The add.s call used here is called a subtask. Subtasks are covered in more detail in the canvas guide, where you can also learn about chain, which is a simpler way to chain tasks together. In practice the link execution option is considered an internal primitive, and you will probably not use it directly, but rather use chains instead.
Here the result of the first task (4) will be sent to a new task that adds 16 to the previous result, forming the expression (2 + 2) + 16 = 20.
You can also cause a callback to be applied if the task raises an exception (errback), but this behaves differently from a regular callback in that it will be passed the id of the parent task, not the result. This is because it may not always be possible to serialize the exception raised, so the error callback requires a result backend to be enabled and must retrieve the result of the parent task itself.
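This is an example error callback:
from celery.result import AsyncResult

@app.task
def error_handler(uuid):
    # The errback receives the parent task id, so look the result up by id.
    result = AsyncResult(uuid)
    exc = result.get(propagate=False)
    print('Task {0} raised exception: {1!r}\n{2!r}'.format(
        uuid, exc, result.traceback))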
It can be added to the task using the link_error execution option:
add.apply_async((2, 2), link_error=error_handler.s())
In addition, both the link and link_error options can be expressed as a list:
add.apply_async((2, 2), link=[add.s(16), other_task.s()])
The callbacks/errbacks will then be called in order, and all callbacks will be called with the return value of the parent task as a partial argument.
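To make "partial argument" concrete, here is a toy model in plain Python with no Celery involved. `ToySignature` and `run_with_links` are hypothetical names invented for this sketch. It assumes the parent result is passed as the first positional argument of each callback, which matches the (2 + 2) + 16 example above.

```python
def add(x, y):
    return x + y

class ToySignature:
    """Stand-in for a task signature: a function plus pre-filled arguments."""

    def __init__(self, func, *args):
        self.func = func
        self.args = args

    def apply_with_parent_result(self, parent_result):
        # The parent's return value is prepended to the stored arguments.
        return self.func(parent_result, *self.args)

def run_with_links(func, args, links):
    """Run func locally, then feed its result to every linked signature in order."""
    result = func(*args)
    return result, [sig.apply_with_parent_result(result) for sig in links]

parent, callbacks = run_with_links(add, (2, 2), [ToySignature(add, 16)])
print(parent, callbacks)
```

Everything here runs synchronously in one process; in Celery the callback would be sent as a new task message once the parent succeeds.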
5. ETA and countdown
The ETA (estimated time of arrival) lets you set a specific date and time that is the earliest time at which your task will be executed. countdown is a shortcut to set eta by seconds into the future.
>>> result = add.apply_async((2, 2), countdown=3)
>>> result.get()    # this takes at least 3 seconds to return
4
The task is guaranteed to be executed at some time after the specified date and time, but not necessarily at that exact time. Possible reasons for broken deadlines may include many items waiting in the queue, or heavy network latency. To make sure your tasks are executed in a timely manner you should monitor the queue for congestion. Use Munin, or similar tools, to receive alerts, so appropriate action can be taken to ease the workload. See Munin.
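The relationship between countdown and eta is simple date arithmetic. The helper below is a minimal sketch of that relationship, not Celery's own code; `countdown_to_eta` is a hypothetical name, and using a UTC, timezone-aware "now" is an assumption made here.

```python
from datetime import datetime, timedelta, timezone

def countdown_to_eta(countdown, now=None):
    """Return the earliest execution time for a countdown given in seconds."""
    if now is None:
        now = datetime.now(timezone.utc)  # timezone-aware "now"
    return now + timedelta(seconds=countdown)

# A fixed reference time keeps the example reproducible.
fixed_now = datetime(2014, 12, 11, 15, 43, 0, tzinfo=timezone.utc)
eta = countdown_to_eta(3, now=fixed_now)
print(eta.isoformat())
```

The resulting datetime is only a lower bound on when the task runs, as described above.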
While countdown is an integer, eta must be a datetime object, specifying an exact date and time (including millisecond precision, and timezone information):
>>> from datetime import datetime, timedelta
>>> tomorrow = datetime.utcnow() + timedelta(days=1)
>>> add.apply_async((2, 2), eta=tomorrow)

Expiration
The expires argument defines an optional expiry time, either as seconds after task publish, or a specific date and time using datetime:
>>> # Task expires after one minute from now.
>>> add.apply_async((10, 10), expires=60)

>>> # Also supports datetime
>>> from datetime import datetime, timedelta
>>> add.apply_async((10, 10),
...                 expires=datetime.now() + timedelta(days=1))
When a worker receives an expired task it will mark the task as REVOKED (TaskRevokedError).
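The two forms of expires (seconds after publish, or an absolute datetime) can be reduced to a single deadline. The following is a toy model of that idea only; `expiry_deadline` and `is_expired` are hypothetical helpers, and the exact comparison a real worker performs is not shown in the source.

```python
from datetime import datetime, timedelta, timezone

def expiry_deadline(published_at, expires):
    """Turn `expires` (seconds after publish, or a datetime) into a datetime."""
    if isinstance(expires, (int, float)):
        return published_at + timedelta(seconds=expires)
    return expires

def is_expired(published_at, expires, received_at):
    """True if the message reaches the worker after its expiry deadline."""
    return received_at > expiry_deadline(published_at, expires)

published = datetime(2014, 12, 11, 15, 43, 0, tzinfo=timezone.utc)
on_time = published + timedelta(seconds=30)
too_late = published + timedelta(seconds=90)

print(is_expired(published, 60, on_time))
print(is_expired(published, 60, too_late))
print(is_expired(published, published + timedelta(days=1), too_late))
```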
6. Message Sending Retry
Celery will automatically retry sending messages in the event of connection failure. The retry behavior can be configured (for example how often to retry, or a maximum number of retries) or disabled altogether.
To disable retry you can set the retry execution option to False:
add.apply_async((2, 2), retry=False)

Related Settings: CELERY_TASK_PUBLISH_RETRY, CELERY_TASK_PUBLISH_RETRY_POLICY

Retry Policy
A retry policy is a mapping that controls how retries behave, and can contain the following keys:
. max_retries
  Maximum number of retries before giving up; in this case the exception that caused the retry to fail will be raised. A value of 0 or None means it will retry forever. The default is to retry 3 times.
. interval_start
  Defines the number of seconds (float or integer) to wait between retries. Default is 0, which means the first retry will be instantaneous.
. interval_step
  On each consecutive retry this number will be added to the retry delay (float or integer). Default is 0.2.
. interval_max
  Maximum number of seconds (float or integer) to wait between retries. Default is 0.2.
For example, the default policy correlates to:
add.apply_async((2, 2), retry=True, retry_policy={
    'max_retries': 3,
    'interval_start': 0,
    'interval_step': 0.2,
    'interval_max': 0.2,
})
The maximum time spent retrying will be 0.4 seconds. It is set relatively short by default because a connection failure could lead to a retry pile effect if the broker connection is down: e.g. many web server processes waiting to retry, blocking other incoming requests.
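The 0.4 second figure follows from the policy keys described above. The sketch below reproduces that arithmetic under the assumption that each delay is interval_start plus a multiple of interval_step, capped at interval_max. `retry_intervals` is a hypothetical helper written for this illustration, not the library's implementation.

```python
def retry_intervals(max_retries, interval_start, interval_step, interval_max):
    """Delay before each retry: start, start+step, ... capped at interval_max."""
    return [
        min(interval_start + attempt * interval_step, interval_max)
        for attempt in range(max_retries)
    ]

# Default policy: 3 retries, starting at 0s, growing by 0.2s, capped at 0.2s.
default_delays = retry_intervals(3, 0, 0.2, 0.2)
total_wait = sum(default_delays)
print(default_delays, total_wait)
```

With the defaults, the first retry is immediate and the next two each wait 0.2 seconds, which gives the 0.4 second total quoted above.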
7. Serializers
Security: The pickle module allows for execution of arbitrary functions; please see the security guide. Celery also comes with a special serializer that uses cryptography to sign your messages.
Data transferred between clients and workers needs to be serialized, so every message in Celery has a content_type header that describes the serialization method used to encode it. The default serializer is pickle, but you can change this using the CELERY_TASK_SERIALIZER setting, or for each individual task, or even per message. There’s built-in support for pickle, JSON, YAML and msgpack, and you can also add your own custom serializers by registering them into the Kombu serializer registry (see the Kombu serialization guide).
Each option has its advantages and disadvantages.
. json
  JSON is supported in many programming languages, is now a standard part of Python (since 2.6), and is fairly fast to decode using the modern Python libraries such as cjson or simplejson. The primary disadvantage to JSON is that it limits you to the following data types: strings, Unicode, floats, boolean, dictionaries, and lists. Decimals and dates are notably missing. Also, binary data will be transferred using Base64 encoding, which will cause the transferred data to be about a third larger than an encoding which supports native binary types. However, if your data fits inside the above constraints and you need cross-language support, JSON is probably your best choice. See http://json.org for more information.
. pickle
  If you have no desire to support any language other than Python, then using the pickle encoding will gain you the support of all built-in Python data types (except class instances), smaller messages when sending binary files, and a slight speedup over JSON processing. See http://docs.python.org/library/pickle.html for more information.
. yaml
  YAML has many of the same characteristics as json, except that it natively supports more data types (including dates, recursive references, etc.). However, the Python libraries for YAML are a good bit slower than the libraries for JSON. If you need a more expressive set of data types and need to maintain cross-language compatibility, then YAML may be a better fit than the above. See http://yaml.org/ for more information.
. msgpack
  msgpack is a binary serialization format that is closer to JSON in features. It is very young however, and support should be considered experimental at this point. See http://msgpack.org/ for more information.
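The JSON limitations mentioned above can be checked with the standard library alone. This is a small sketch, independent of Celery; `json_encodable` is a hypothetical helper, and the sample payload is arbitrary.

```python
import base64
import json
from datetime import datetime
from decimal import Decimal

def json_encodable(value):
    """Return True if the standard json module can encode value as-is."""
    try:
        json.dumps(value)
        return True
    except TypeError:
        return False

print(json_encodable({'x': 1.5, 'tags': ['a', 'b'], 'ok': True}))
print(json_encodable(Decimal('1.10')))
print(json_encodable(datetime(2014, 12, 11)))

# Binary data has to be wrapped in Base64 before it fits into JSON.
payload = bytes(range(256)) * 12  # 3072 bytes of binary data
encoded = base64.b64encode(payload)
overhead = len(encoded) / len(payload) - 1
print(len(payload), len(encoded), round(overhead, 3))
```

Base64 maps every 3 input bytes to 4 output characters, which is where the size increase for binary payloads comes from.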
The encoding used is available as a message header, so the worker knows how to deserialize any task. If you use a custom serializer, this serializer must be available for the worker.
The following order is used to decide which serializer to use when sending a task:
1) The serializer execution option.
2) The Task.serializer attribute.
3) The CELERY_TASK_SERIALIZER setting.
Example setting a custom serializer for a single task invocation:
>>> add.apply_async((10, 10), serializer='json')
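The three-level lookup order for the serializer amounts to "first configured value wins". A minimal sketch of that rule in plain Python follows; `first_configured` and `resolve_serializer` are hypothetical names, and treating None as "not set" is an assumption of this illustration.

```python
def first_configured(*candidates):
    """Return the first candidate that is not None, or None if all are unset."""
    for value in candidates:
        if value is not None:
            return value
    return None

def resolve_serializer(option=None, task_attribute=None, setting=None):
    # Precedence: per-call option, then the task attribute, then the global setting.
    return first_configured(option, task_attribute, setting)

print(resolve_serializer(option='json', task_attribute='yaml', setting='pickle'))
print(resolve_serializer(task_attribute='yaml', setting='pickle'))
print(resolve_serializer(setting='pickle'))
```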
8. Compression
Celery can compress the messages using either gzip or bzip2. You can also create your own compression schemes and register them in the kombu compression registry.
The following order is used to decide which compression scheme to use when sending a task:
1). The compression execution option.
2). The Task.compression attribute.
3). The CELERY_MESSAGE_COMPRESSION setting.
Example specifying the compression used when calling a task:
>>> add.apply_async((2, 2), compression='zlib')
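To get a feel for what message compression buys, the sketch below compresses a sample body with Python's zlib and bz2 modules. It assumes these standard-library codecs are representative of the schemes named above, which the source does not state; the message body is made up for illustration.

```python
import bz2
import zlib

# Repetitive sample body, standing in for a large task message.
message = b'{"task": "tasks.add", "args": [2, 2]}' * 200

for name, compress, decompress in (
    ('zlib', zlib.compress, zlib.decompress),
    ('bz2', bz2.compress, bz2.decompress),
):
    packed = compress(message)
    assert decompress(packed) == message  # round trip must be lossless
    print(name, len(message), '->', len(packed))
```

Compression pays off mainly for large, repetitive bodies; a message as small as add(2, 2) is unlikely to benefit.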
9. Connections
Automatic Pool Support
Since version 2.3 there is support for automatic connection pools, so you don’t have to manually handle connections and publishers to reuse connections. The connection pool is enabled by default since version 2.5. See the BROKER_POOL_LIMIT setting for more information.
You can handle the connection manually by creating a publisher:
numbers = [(2, 2), (4, 4), (8, 8), (16, 16)]
results = []
with add.app.pool.acquire(block=True) as connection:
    with add.get_publisher(connection) as publisher:
        # Reuse one publisher for every message sent in the loop.
        for args in numbers:
            res = add.apply_async(args, publisher=publisher)
            results.append(res)
print([res.get() for res in results])
Though this particular example is much better expressed as a group:
>>> from celery import group
>>> numbers = [(2, 2), (4, 4), (8, 8), (16, 16)]
>>> res = group(add.subtask(n) for n in numbers).apply_async()

>>> res.get()
[4, 8, 16, 32]
10. Routing options
Celery can route tasks to different queues. Simple routing (name <-> name) is accomplished using the queue option:
add.apply_async(queue='priority.high')
You can then assign workers to the priority.high queue by using the worker's -Q argument:
$ celery worker -l info -Q celery,priority.high
See also: Hard-coding queue names in code is not recommended; the best practice is to use configuration routers (CELERY_ROUTES).
To find out more about routing, please see Routing Tasks.
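As a hedged illustration of the CELERY_ROUTES approach, the fragment below maps the tasks.add task from the earlier example to the priority.high queue. The `queue_for` helper is hypothetical and only mimics what the routing table expresses. Falling back to a queue named celery is an assumption based on the -Q celery,priority.high command above.

```python
# Hypothetical routing table; 'tasks.add' is the task from the earlier tasks.py
# example and 'priority.high' is the queue name used above.
CELERY_ROUTES = {
    'tasks.add': {'queue': 'priority.high'},
}

def queue_for(task_name, routes=CELERY_ROUTES, default='celery'):
    """Toy lookup mirroring what the routing table expresses."""
    return routes.get(task_name, {}).get('queue', default)

print(queue_for('tasks.add'))
print(queue_for('tasks.other'))
```

With the routes kept in configuration, callers can use plain add.delay(...) and the queue can be changed without touching the calling code.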
11. Advanced Options
These options are for advanced users who want to make use of AMQP’s full routing capabilities. Interested parties may read the routing guide.
. exchange
  Name of exchange (or a kombu.entity.Exchange) to send the message to.
. routing_key
  Routing key used to determine.
. priority
  A number between 0 and 9, where 0 is the highest priority.