Python 3: Logging from Multiple Processes with a Queue

This article describes a way to unify logging in a Python multiprocessing environment. By routing records through a message queue to a dedicated listener process, it ensures safe, coordinated logging across processes. Concrete code examples are provided, along with a demonstration of how QueueListener simplifies the listener design.


In short, logging is thread-safe but not process-safe. To write logs from multiple processes into a single file, route the records through a message queue or a socket, and have a single process listen on that queue or socket and write everything to the file.

Although logging is thread-safe, and logging to a single file from multiple threads in a single process is supported, logging to a single file from multiple processes is not supported, because there is no standard way to serialize access to a single file across multiple processes in Python. If you need to log to a single file from multiple processes, one way of doing this is to have all the processes log to a SocketHandler, and have a separate process which implements a socket server which reads from the socket and logs to file. (If you prefer, you can dedicate one thread in one of the existing processes to perform this function.) This section documents this approach in more detail and includes a working socket receiver which can be used as a starting point for you to adapt in your own applications.
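
As a rough illustration of the socket approach, the sender side only needs a SocketHandler pointed at the receiver; the receiver is the working socket server from the cookbook linked below. A sketch, assuming the receiver listens on the default logging port:

import logging
import logging.handlers

# Each worker process attaches a SocketHandler; records are pickled
# and sent over TCP to the separate receiver process.
root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(logging.handlers.SocketHandler(
    'localhost', logging.handlers.DEFAULT_TCP_LOGGING_PORT))

root.info('this record travels over TCP to the receiver')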

You could also write your own handler which uses the Lock class from the multiprocessing module to serialize access to the file from your processes. The existing FileHandler and subclasses do not make use of multiprocessing at present, though they may do so in the future. Note that at present, the multiprocessing module does not provide working lock functionality on all platforms (see https://bugs.python.org/issue3770).
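
For completeness, a minimal sketch of what such a lock-based handler could look like; the class name and constructor are my own, not stdlib API:

import logging
import multiprocessing

class LockedFileHandler(logging.FileHandler):
    """Hypothetical FileHandler that serializes writes across processes.

    The multiprocessing.Lock must be created in the parent process and
    passed to every child (e.g. via Process args), subject to the
    platform caveats mentioned above.
    """
    def __init__(self, filename, mp_lock, mode='a'):
        super().__init__(filename, mode=mode)
        self.mp_lock = mp_lock

    def emit(self, record):
        with self.mp_lock:       # only one process writes at a time
            super().emit(record)
            self.flush()         # push the buffered line out immediately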

Alternatively, you can use a Queue and a QueueHandler to send all logging events to one of the processes in your multi-process application. The following example script demonstrates how you can do this; in the example a separate listener process listens for events sent by other processes and logs them according to its own logging configuration. Although the example only demonstrates one way of doing it (for example, you may want to use a listener thread rather than a separate listener process – the implementation would be analogous) it does allow for completely different logging configurations for the listener and the other processes in your application, and can be used as the basis for code meeting your own specific requirements:

I made a few modifications to the official example code:

import multiprocessing
import logging
import logging.handlers
import time
from queue import Empty

logger_pool = {}


class Loggers:
    # Loggers are cached in the module-level logger_pool dict.

    def get_listener_logger(self, id):
        """Logger used by the listener: writes records to their final destination."""
        formatter = logging.Formatter("%(message)s")
        logger = logging.getLogger(id)
        logger.setLevel(logging.INFO)

        # handler = logging.handlers.RotatingFileHandler('log.log')  # log to a file instead
        handler = logging.StreamHandler()
        handler.setFormatter(formatter)
        logger.addHandler(handler)
        return logger

    def get_worker_logger(self, id, queue):
        """
        如果池中存在则取出
        如果不存在则创建
        """
        if logger_pool.get(id):
            return logger_pool.get(id)
        else:
            """
            创建日志实例
            """
            formatter = logging.Formatter("[%(asctime)s] %(name)s:%(levelname)s: %(message)s")
            logger = logging.getLogger(id)
            logger.setLevel(logging.INFO)

            handler = logging.handlers.QueueHandler(queue)
            handler.setFormatter(formatter)
            logger.addHandler(handler)
            logger_pool[id] = logger
            return logger

logger_class = Loggers()

def listener_process(queue, flag_queue):
    listener_logger = logger_class.get_listener_logger('listener')
    while True:
        # Stop once every worker has checked out of flag_queue and
        # there are no records left to handle.
        if queue.empty() and flag_queue.empty():
            print('listener stop!')
            break
        else:
            try:
                record = queue.get(timeout=2)
            except Empty:
                continue
            listener_logger.handle(record)

def worker_process(id, queue, flag_queue):
    try:
        logger = logger_class.get_worker_logger(id, queue)
        for _ in range(10):
            logger.info(time.time())
            time.sleep(1)
    finally:
        print('worker stop!')
        flag_queue.get()  # check this worker out of the flag queue
    
if __name__ == "__main__":
    queue = multiprocessing.Queue(-1)
    flag_queue = multiprocessing.Queue(-1)
    
    id_list = ['00', '01', '02', '03']
    process_pool = []
    for id in id_list:
        flag_queue.put(id)

        p = multiprocessing.Process(target=worker_process, args=(id, queue, flag_queue,))
        p.start()
        process_pool.append(p)

    listener = multiprocessing.Process(
        target=listener_process, args=(queue, flag_queue,))
    listener.start()
    
    for p in process_pool:
        p.join()
    listener.join()
Sample output:

[2021-02-28 17:37:50,567] 00:INFO: 1614505070.567073
[2021-02-28 17:37:50,567] 01:INFO: 1614505070.567591
[2021-02-28 17:37:50,568] 02:INFO: 1614505070.568983
[2021-02-28 17:37:50,569] 03:INFO: 1614505070.569498
[2021-02-28 17:37:51,570] 02:INFO: 1614505071.570774
[2021-02-28 17:37:51,570] 01:INFO: 1614505071.570773
[2021-02-28 17:37:51,570] 00:INFO: 1614505071.570755
[2021-02-28 17:37:51,570] 03:INFO: 1614505071.570774
[2021-02-28 17:37:52,573] 03:INFO: 1614505072.572994
[2021-02-28 17:37:52,573] 00:INFO: 1614505072.572965
[2021-02-28 17:37:52,573] 01:INFO: 1614505072.572991
[2021-02-28 17:37:52,573] 02:INFO: 1614505072.572993
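
A design note: the flag_queue here replaces the shutdown mechanism in the official cookbook example, which instead puts a None sentinel on the record queue after the workers finish and lets the listener exit when it sees it. A minimal sketch of that variant:

def listener_process(queue):
    listener_logger = logger_class.get_listener_logger('listener')
    while True:
        record = queue.get()   # block until a record (or the sentinel) arrives
        if record is None:     # sentinel from the main process: time to stop
            print('listener stop!')
            break
        listener_logger.handle(record)

# In __main__, after joining all the workers:
#     queue.put_nowait(None)
#     listener.join()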

References:
https://docs.python.org/3/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes
https://fanchenbao.medium.com/python3-logging-with-multiprocessing-f51f460b8778


Update 2021-08-20

The listener above can be streamlined with logging.handlers.QueueListener.
The optimized code follows:

import multiprocessing
import logging
import logging.handlers
import time

logger_pool = {}


class Loggers:
    # Loggers are cached in the module-level logger_pool dict.

    def get_listener_logger(self, id):
        """Kept for comparison; the QueueListener below builds its own handler."""
        formatter = logging.Formatter("%(message)s")
        logger = logging.getLogger(id)
        logger.setLevel(logging.INFO)

        # handler = logging.handlers.RotatingFileHandler('log.log')  # log to a file instead
        handler = logging.StreamHandler()
        handler.setFormatter(formatter)
        logger.addHandler(handler)
        return logger

    def get_worker_logger(self, id, queue):
        """
        如果池中存在则取出
        如果不存在则创建
        """
        if logger_pool.get(id):
            return logger_pool.get(id)
        else:
            """
            创建日志实例
            """
            formatter = logging.Formatter("[%(asctime)s] %(name)s:%(levelname)s: %(message)s")
            logger = logging.getLogger(id)
            logger.setLevel(logging.INFO)

            handler = logging.handlers.QueueHandler(queue)
            handler.setFormatter(formatter)
            logger.addHandler(handler)
            logger_pool[id] = logger
            return logger

logger_class = Loggers()

def listener_process(queue):
    # No longer a separate process: this builds a QueueListener, which
    # runs a background thread in the main process once start() is called.
    formatter = logging.Formatter("%(message)s")
    handler = logging.StreamHandler()
    handler.setFormatter(formatter)
    listener = logging.handlers.QueueListener(queue, handler)
    return listener

def worker_process(id, queue):
    try:
        logger = logger_class.get_worker_logger(id, queue)
        for _ in range(10):
            logger.info(time.time())
            time.sleep(1)
    finally:
        print('worker stop!')
    
if __name__ == "__main__":
    queue = multiprocessing.Queue(-1)
    
    id_list = ['00', '01', '02', '03']
    process_pool = []
    for id in id_list:
        p = multiprocessing.Process(target=worker_process, args=(id, queue,))
        p.start()
        process_pool.append(p)

    listener = listener_process(queue)
    listener.start()
    
    for p in process_pool:
        p.join()
    listener.stop()

Comparing this with the original code, the dedicated listener process has been replaced by a child thread of the main process (started internally by QueueListener), which cuts the amount of code considerably.
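
Since QueueListener accepts any number of handlers, the commented-out RotatingFileHandler could be attached alongside the StreamHandler; a sketch (the maxBytes/backupCount values are arbitrary):

stream_handler = logging.StreamHandler()
stream_handler.setFormatter(logging.Formatter("%(message)s"))
file_handler = logging.handlers.RotatingFileHandler(
    'log.log', maxBytes=10 * 1024 * 1024, backupCount=3)
file_handler.setFormatter(logging.Formatter("%(message)s"))

# respect_handler_level=True makes each handler honor its own level
# instead of handling every record unconditionally.
listener = logging.handlers.QueueListener(
    queue, stream_handler, file_handler, respect_handler_level=True)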
Code diff: https://www.diffchecker.com/aHZtLRo8

References:
https://blog.youkuaiyun.com/nilnaijgnid/article/details/107498878
https://docs.python.org/3/library/logging.handlers.html#queuelistener
