[Notes] Using Python concurrency to speed up release testing

Background

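In our work, a distributed production environment demands more efficient testing. Our release jobs used to run serially, which severely limited how quickly we could ship.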

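We tackled the problem at two levels. At the Jenkins job level, we checked "Execute concurrent builds if necessary" in the job configuration so that multiple jobs can run in parallel. At the script level, we introduced Python concurrency to remove the serial execution inside the script.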

Trade-offs

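Should we use multiple threads or multiple processes?
For I/O-bound work, either threads or processes will do; threads are somewhat more complex by comparison.
For CPU-bound work, multiple processes are the more reasonable choice.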
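
To make the I/O-bound case concrete, here is a minimal sketch that compares a serial loop with a small thread pool. It is an illustration under assumptions, not the code from our release scripts: fake_io_task is a hypothetical stand-in that just sleeps, and it uses the standard-library concurrent.futures.ThreadPoolExecutor rather than a hand-written pool.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_task(n):
    """Hypothetical stand-in for an I/O-bound step, such as waiting on a network call."""
    time.sleep(0.2)  # sleeping releases the GIL, much like real blocking I/O
    return n * 2


def run_serial(items):
    # One task after another: total time is roughly len(items) * 0.2s
    return [fake_io_task(n) for n in items]


def run_threaded(items, workers=4):
    # Tasks overlap while they wait, so total time is closer to a single task
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map returns results in the same order as the inputs
        return list(pool.map(fake_io_task, items))


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    items = [1, 2, 3, 4]
    _, serial_cost = timed(run_serial, items)
    _, threaded_cost = timed(run_threaded, items)
    print("serial:   {:.2f}s".format(serial_cost))
    print("threaded: {:.2f}s".format(threaded_cost))
```

Because each task spends its time waiting rather than computing, the threaded run should finish in a fraction of the serial time. The same experiment with a CPU-bound function would not show that gain under the GIL.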

Threads

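The thread model is as follows:
Build a thread pool whose size is the number of CPUs plus 1, where each worker (a Thread subclass) runs an infinite loop. The sample code below hard-codes 4 workers.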

# coding=utf-8
from queue import Queue  # Python 3 name; in Python 2 the module was called Queue
from threading import Thread
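from single import *  # the author's module (not shown); presumably provides getNums and processNum
from time import time  # main() calls time(), so import it explicitly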


class ProcessWorker(Thread):
    def __init__(self, queue):
        Thread.__init__(self)
        self.queue = queue

    def run(self):
        while True:
            # Get the work from the queue
            num = self.queue.get()
            processNum(num)
            self.queue.task_done()


def main():
    ts = time()
    nums = getNums(4)
    # Create a queue to communicate with the worker threads
    queue = Queue()
    # Create 4 worker threads
    for x in range(4):
        worker = ProcessWorker(queue)
        # Setting daemon to True will let the main thread exit even though the workers are blocking
        worker.daemon = True
        worker.start()
    # Put the tasks into the queue
    for num in nums:
        queue.put(num)
    # Causes the main thread to wait for the queue to finish processing all the tasks
    queue.join()
    print("cost time is: {:.2f}s".format(time() - ts))


if __name__ == "__main__":
    main()
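
The code above starts a fixed 4 workers and relies on daemon threads to exit. As a rough sketch of the "number of CPUs plus 1" sizing described earlier, the variant below derives the worker count from os.cpu_count() and shuts the workers down with a sentinel value instead. process_num is a hypothetical stand-in for processNum from the single module, which is not shown here, and run_pool is a name introduced only for this illustration.

```python
import os
from queue import Queue
from threading import Thread


def process_num(num):
    """Hypothetical stand-in for processNum from the author's `single` module."""
    return num * num


def worker(tasks, results):
    while True:
        num = tasks.get()
        try:
            if num is None:  # sentinel: no more work for this worker
                return
            results.put((num, process_num(num)))
        finally:
            # Runs for real tasks and for the sentinel, so tasks.join() can return
            tasks.task_done()


def run_pool(nums):
    # Pool size follows the "CPU count plus 1" rule; cpu_count() may return None
    worker_count = (os.cpu_count() or 1) + 1
    tasks, results = Queue(), Queue()
    threads = [Thread(target=worker, args=(tasks, results))
               for _ in range(worker_count)]
    for t in threads:
        t.start()
    for num in nums:
        tasks.put(num)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for t in threads:
        t.join()
    # Exactly one result was queued per input number
    return dict(results.get() for _ in range(len(nums)))


if __name__ == "__main__":
    print(run_pool([1, 2, 3, 4]))
```

With sentinels the workers end cleanly, so the threads no longer need to be daemons and the main thread can join them before it exits.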

Processes

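The process can keep switching between threads whenever one of them is ready to do some work. For CPU-bound code, however, using the threading module in Python, or in any other interpreted language with a GIL, can actually reduce performance. If your code performs a CPU-bound task, such as decompressing gzip files, using the threading module can lead to a longer execution time. For CPU-bound tasks and true parallel execution, we can use the multiprocessing module.
CPython, the official Python implementation, has a GIL (global interpreter lock).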

To use multiple processes, we create a process pool and use its map method to hand the list of work items to the pool (nums in the code below).

# coding=utf-8
from multiprocessing import Pool
from single import *
from time import time


def main():
    ts = time()
    nums = getNums(4)
    # The with block cleans up the pool's worker processes once map has returned
    with Pool(4) as p:
        p.map(processNum, nums)
    print("cost time is: {:.2f}s".format(time() - ts))


if __name__ == "__main__":
    main()
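
Since the single module is not shown, here is a self-contained sketch of the same Pool.map pattern that can be run as is. count_primes is a hypothetical CPU-bound stand-in for processNum, and the input sizes are arbitrary values chosen only to keep the workers busy for a moment.

```python
from multiprocessing import Pool
from time import time


def count_primes(limit):
    """Hypothetical CPU-bound stand-in for processNum: count the primes below limit."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in [2, sqrt(n)] divides it evenly
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def main():
    limits = [20000, 20000, 20000, 20000]  # arbitrary workload, one item per worker
    ts = time()
    with Pool(4) as pool:
        # Each item is handed to a separate process, so the GIL is not shared
        results = pool.map(count_primes, limits)
    print("results:", results)
    print("cost time is: {:.2f}s".format(time() - ts))


if __name__ == "__main__":
    main()
```

The worker function is defined at module level so that it can be pickled and sent to the child processes, and the `if __name__ == "__main__"` guard keeps those children from re-running main() on platforms that start them by re-importing the script.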