Custom Celery log format: automatically adding the task name and task ID to log output

Latest recommended article published on 2024-12-25 19:02:54

Author: 蘑菇猎手

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/DongGeGe214/article/details/85686479

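This article shows how to improve logging in Celery by adding the task ID and task name to every log line, which makes logs easier to trace and problems easier to diagnose. It walks through setting up Celery and Redis on Debian and then customizing the log format so that each line can be tied back to the task that produced it.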

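Celery runs tasks concurrently, so when all workers write to the same log file, lines from different processes end up interleaved. Tracing a single problem back through such a log is always painful.
If every log line carried the ID of the task that produced it, this would be much easier.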

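Core code:

from celery._state import get_current_task

task = get_current_task()  # the task this process is currently executing, or None
if task and task.request:
    task_id = task.request.id  # unique ID of this task invocation
    task_name = task.name      # registered task name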
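
To see the mechanism in isolation, here is a minimal sketch that runs without Celery or a broker. It is not the configuration used later in this article: `FakeTask`, `FakeRequest`, `fake_get_current_task` and `render` are hypothetical stand-ins invented for the demo, with `fake_get_current_task` playing the role of `get_current_task`. The only point is to show how a `logging.Formatter` subclass can attach the two fields to each record.

```python
import logging


class FakeRequest:
    """Hypothetical stand-in for task.request; only `id` is modelled."""

    def __init__(self, task_id):
        self.id = task_id


class FakeTask:
    """Hypothetical stand-in for a Celery task; only `name` and `request` are modelled."""

    def __init__(self, name, task_id):
        self.name = name
        self.request = FakeRequest(task_id)


_current = None  # what the fake lookup returns; None means "not inside a task"


def fake_get_current_task():
    """Assumption for this demo: replaces celery._state.get_current_task."""
    return _current


class TaskContextFormatter(logging.Formatter):
    """Adds task_name and task_id to every record before formatting it."""

    def format(self, record):
        task = fake_get_current_task()
        if task and task.request:
            record.task_id = task.request.id
            record.task_name = task.name
        else:
            # Outside a task the fields must still exist, or the format string fails
            record.task_id = ""
            record.task_name = ""
        return super().format(record)


def render(message, task=None):
    """Format one INFO record as if it were logged while `task` was running."""
    global _current
    _current = task
    try:
        record = logging.LogRecord("demo", logging.INFO, "demo.py", 0, message, None, None)
        fmt = TaskContextFormatter("[%(task_name)s %(task_id)s] %(levelname)s: %(message)s")
        return fmt.format(record)
    finally:
        _current = None


print(render("hello", FakeTask("tasks.heavy_task", "abc-123")))
print(render("hello"))
```

The `else` branch matters: a log call made outside any task (for example during worker startup) would otherwise hit a format string that references fields the record does not have.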

1. Prepare the environment

Working as root is recommended, so that installing, configuring, and running do not fail for lack of permissions.

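- Operating system: a Debian-family distribution is recommended (or another Linux; Celery no longer supports Windows as of 4.0)
- redis version: 4.0.8 (sudo apt-get install redis-server, sudo service redis-server start)
- python version: 3.6.5
  - celery: 4.2.0 (installed with pip3 install https://github.com/celery/celery/tarball/v4.2.0-136-gc1d0bfe, because the package installed by pip install celery contains a module named async, which conflicts with the Python keyword)
  - redis: 3.0.1 (pip install redis)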
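
As a side note on the `async` conflict mentioned in the list above: `async` is a reserved keyword from Python 3.7 onward, so a module with that name cannot be imported with a plain `import` statement on those versions. The small check below is only an illustration for verifying your own interpreter; the helper name `async_is_reserved` is made up for this sketch.

```python
import keyword
import sys


def async_is_reserved():
    """Return True if this interpreter treats `async` as a reserved keyword."""
    return keyword.iskeyword("async")


print(sys.version_info[:3], "async reserved:", async_is_reserved())
```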

2. Write the program

celeryconfig.py

import logging

from celery._state import get_current_task

class Formatter(logging.Formatter):
    """Formatter for tasks, adding the task name and id."""

    def format(self, record):
        task = get_current_task()
        if task and task.request:
            record.__dict__.update(task_id='%s ' % task.request.id,
                                   task_name='%s ' % task.name)
        else:
            record.__dict__.setdefault('task_name', '')
            record.__dict__.setdefault('task_id', '')
        return logging.Formatter.format(self, record)


root_logger = logging.getLogger()  # returns logging.root
root_logger.setLevel(logging.DEBUG)

# Both handlers share one format; task_name and task_id are filled in by Formatter
LOG_FORMAT = ('[%(task_name)s%(task_id)s%(process)s %(thread)s %(asctime)s '
              '%(pathname)s:%(lineno)s] %(levelname)s: %(message)s')
DATE_FORMAT = '%Y-%m-%d %H:%M:%S'

# Write logs to a file.
# Note: do not use TimedRotatingFileHandler here. Every Celery process would
# rotate the file on its own, and log lines would be lost.
fh = logging.FileHandler('celery_worker.log')
fh.setFormatter(Formatter(LOG_FORMAT, datefmt=DATE_FORMAT))
fh.setLevel(logging.DEBUG)
root_logger.addHandler(fh)

# Write logs to the console
sh = logging.StreamHandler()
sh.setFormatter(Formatter(LOG_FORMAT, datefmt=DATE_FORMAT))
sh.setLevel(logging.INFO)
root_logger.addHandler(sh)

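class CeleryConfig(object):
    BROKER_URL = 'redis://localhost:6379/0'
    CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'
    CELERY_TASK_SERIALIZER = 'pickle'  # the default is json since 4.0; earlier versions defaulted to pickle (which can carry binary objects)
    CELERY_RESULT_SERIALIZER = 'pickle'
    CELERY_ACCEPT_CONTENT = ['json', 'pickle']
    CELERY_ENABLE_UTC = True  # enable UTC
    CELERY_TIMEZONE = 'Asia/Shanghai'  # Shanghai time zone
    CELERYD_HIJACK_ROOT_LOGGER = False  # do not let Celery hijack the root logger, so the handlers configured above stay in place
    CELERYD_MAX_TASKS_PER_CHILD = 1  # each worker process is released after running one task (a new process is created for the next task; works around memory leaks)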
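
The uppercase names above are the older setting names. Celery 4.0 introduced lowercase equivalents, and the documentation advises against mixing the two styles in one configuration. As a sketch only, the same settings in the lowercase style might look like this; the class name `CeleryConfigLowercase` is made up here so that it does not clash with the class above.

```python
class CeleryConfigLowercase:
    """Same values as CeleryConfig, written with the lowercase setting names."""

    broker_url = 'redis://localhost:6379/0'
    result_backend = 'redis://localhost:6379/0'
    task_serializer = 'pickle'
    result_serializer = 'pickle'
    accept_content = ['json', 'pickle']
    enable_utc = True
    timezone = 'Asia/Shanghai'
    worker_hijack_root_logger = False  # keep the root logger handlers as configured
    worker_max_tasks_per_child = 1     # recycle each worker process after one task
```

If you try this variant, pass it to `app.config_from_object` in place of `CeleryConfig`, not in addition to it.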

tasks.py

import logging

from celery import Celery, platforms

import celeryconfig
import detail

platforms.C_FORCE_ROOT = True  # the config sets the serializer to pickle; allow the worker to run as root anyway
app = Celery(__name__)
app.config_from_object(celeryconfig.CeleryConfig)


@app.task(bind=True)
def heavy_task(self, seconds=1):
    logging.info("I'm heavy_task")  # module-level logging calls use logging.root
    return detail.process_heavy_task(seconds)

detail.py

import logging
import time

def process_heavy_task(seconds=1):
    logging.info("I'm process_heavy_task")  # module-level logging calls use logging.root
    time.sleep(seconds)
    return True
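
Note that detail.py knows nothing about Celery: it just logs, and the handlers on the root logger add the task context. The same holds for named loggers, because by default their records propagate up to the root logger's handlers. The sketch below demonstrates that propagation with the standard library only; `capture_via_root` and the logger name `detail_demo` are hypothetical names used for the demonstration.

```python
import io
import logging


def capture_via_root(logger_name, message):
    """Log through a named logger and return what a root-level handler received."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(levelname)s: %(message)s"))

    root = logging.getLogger()
    old_level = root.level
    root.addHandler(handler)
    root.setLevel(logging.DEBUG)
    try:
        # The named logger has no handlers of its own; the record propagates to root
        logging.getLogger(logger_name).info(message)
    finally:
        # Restore the root logger so the demo leaves no trace
        root.removeHandler(handler)
        root.setLevel(old_level)
    return stream.getvalue()


print(capture_via_root("detail_demo", "I'm process_heavy_task"), end="")
```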

3. Start and test

Open a new shell window and start the Celery worker:

# ls
celeryconfig.py	detail.py	tasks.py
# celery worker -A tasks -l info

Open a new shell window and watch the log file:

# ls
celeryconfig.py	celery_worker.log	detail.py	tasks.py
# tail -f celery_worker.log

Open a new shell window and call the Celery task:

# ls
celeryconfig.py	celery_worker.log	detail.py	tasks.py
# python3
>>> import tasks
>>> t = tasks.heavy_task.delay(3)
>>> t.result
True
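
Once every line carries the task ID, following a single task through an interleaved log is a simple filter. The helper below is a rough sketch of that idea, assuming the bracketed header layout produced by the format string in celeryconfig.py. The function name `lines_for_task` is hypothetical, and the sample lines are invented for illustration rather than taken from a real run.

```python
def lines_for_task(lines, task_id):
    """Return the log lines whose bracketed header contains task_id as a token."""
    matched = []
    for line in lines:
        if not line.startswith("["):
            continue  # continuation lines (e.g. tracebacks) carry no header
        header = line[1:].split("]", 1)[0]
        if task_id in header.split():
            matched.append(line)
    return matched


# Invented sample lines in the layout: [name id pid thread date time path:line] LEVEL: msg
sample = [
    "[tasks.heavy_task id-1 101 1400 2019-01-03 10:00:00 /root/tasks.py:16] INFO: I'm heavy_task",
    "[tasks.heavy_task id-2 102 1400 2019-01-03 10:00:00 /root/tasks.py:16] INFO: I'm heavy_task",
    "[tasks.heavy_task id-1 101 1400 2019-01-03 10:00:00 /root/detail.py:5] INFO: I'm process_heavy_task",
    "[100 1400 2019-01-03 10:00:01 /root/other.py:1] INFO: logged outside any task",
]

for line in lines_for_task(sample, "id-1"):
    print(line)
```

In practice the ID to search for is the one shown by `t.id` in the Python session above.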

4. Result screenshots

[Screenshot: starting the worker]
[Screenshot: watching the log]
[Screenshot: calling the task]