Processes and Threads in Python

In crawler development, the concepts of processes and threads are very important. Below are the study materials I collected, together with my notes.

1. Multiprocessing: creating child processes with the multiprocessing module

The multiprocessing module provides a Process class to represent a process. To create a child process, simply pass a target function and its arguments to construct a Process instance, then call start() to launch the process and join() to wait for it to finish (synchronizing parent and child).

import os
from multiprocessing import Process

# code executed in each child process
def run_proc(name):
    print(f'Child process {name}: {os.getpid()}')

if __name__ == '__main__':
    print(f'Parent process {os.getpid()}')
    processes = []
    for i in range(5):
        p = Process(target=run_proc, args=(str(i),))
        print('Process will start')
        p.start()
        processes.append(p)
        print(f'The child process {i} is running')
    # join every child process, not just the last one created
    for p in processes:
        p.join()
    print('Process end')
2. The multiprocessing module provides a Pool class to represent a process pool

A Pool offers a fixed number of worker processes for the caller to use; the default size is the number of CPU cores, but it can also be specified explicitly. When a new task is submitted to the Pool, a new process is created to run it if the pool is not yet full; if the pool has already reached its maximum size, the task waits until one of the worker processes becomes free.

from multiprocessing import Pool
import os, time, random

def run_task(name):
    print(f'Task {name}, {os.getpid()} is running...')
    time.sleep(random.random() * 3)
    print(f'Task {name} end.')

if __name__ == '__main__':
    print(f'The main process is {os.getpid()}')
    run_pool = Pool(processes=2)
    for i in range(5):
        run_pool.apply_async(run_task, args=(i,))
        print(f'Task {i} submitted to the pool')

    run_pool.close()
    run_pool.join()

PS: calling join() on a Pool waits for all worker processes to finish. close() must be called before join(); once close() has been called, no new tasks can be submitted to the pool.
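As a usage note (my own addition, not from the original material): apply_async returns an AsyncResult object whose get() method blocks until the task has finished, which is a convenient way to collect return values after close() and join(). The run_square function below is a hypothetical example task:

from multiprocessing import Pool

def run_square(x):
    # hypothetical task that simply returns its input squared
    return x * x

if __name__ == '__main__':
    pool = Pool(processes=2)
    # apply_async returns AsyncResult objects immediately
    results = [pool.apply_async(run_square, args=(i,)) for i in range(5)]
    pool.close()   # no new tasks may be submitted after this point
    pool.join()    # wait for all workers to finish
    print([r.get() for r in results])   # -> [0, 1, 4, 9, 16]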

3. Inter-process communication: Queue and Pipe

Queue is a multiprocess-safe queue with two main methods: put() inserts an item into the queue, and get() reads and removes an item from the queue.

from multiprocessing import Queue, Process
import random, time, os

def process_write(q, urls):
    '''write urls into the queue'''
    print(f'The writer process id is {os.getpid()}')
    for url in urls:
        q.put(url)
        print(f'Put url: {url}')
        time.sleep(random.random())

def process_read(q):
    '''read urls from the queue'''
    print(f'The reader process id is {os.getpid()}')
    while True:
        url = q.get(True)
        print(f'Got url: {url}')

if __name__ == '__main__':
    q = Queue()
    write_process1 = Process(target=process_write, args=(q, ['url1', 'url2', 'url3']))
    write_process2 = Process(target=process_write, args=(q, ['url4', 'url5', 'url6']))
    read_process = Process(target=process_read, args=(q,))
    write_process1.start()
    write_process2.start()
    read_process.start()
    write_process1.join()
    write_process2.join()
    # the reader loops forever, so kill it once both writers are done
    read_process.terminate()
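A design note (my own addition, not from the original material): instead of calling terminate(), the main process can put a sentinel value such as None into the queue so the reader exits on its own. A minimal sketch of the reader side under that assumption:

def process_read(q):
    '''read urls until a None sentinel arrives'''
    while True:
        url = q.get(True)
        if url is None:        # sentinel: no more data will arrive
            break
        print(f'Got url: {url}')

# after both writers have been joined, the main process signals the reader:
#     q.put(None)
#     read_process.join()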
4. Pipe-based communication

Pipe is commonly used for communication between two processes, which sit at the two ends of the pipe.

Pipe() returns a pair (conn1, conn2) representing the two ends of the pipe. It takes a duplex parameter: if duplex is True (the default), the pipe is full-duplex and both ends can send and receive; if duplex is False, conn1 can only receive and conn2 can only send. The send() and recv() methods send and receive data respectively.

import multiprocessing
import random, os, time

def proc_send(conn, urls):
    for url in urls:
        conn.send(url)
        print(f'Process {os.getpid()} sent url: {url}')
        time.sleep(random.random())

def proc_recv(conn):
    while True:
        print(f'Process {os.getpid()} received: {conn.recv()}')
        time.sleep(random.random())

if __name__ == '__main__':
    print(f'The main process is {os.getpid()}')
    conn1, conn2 = multiprocessing.Pipe()
    process_send = multiprocessing.Process(target=proc_send, args=(conn1, ['url_' + str(i) for i in range(10)]))
    process_recv = multiprocessing.Process(target=proc_recv, args=(conn2,))
    process_send.start()
    process_recv.start()
    process_send.join()
    # the receiver loops forever, so terminate it once the sender is done
    # (joining it directly would block forever)
    process_recv.terminate()
    print('The main process is over')
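To illustrate the duplex=False case described above (my own sketch, not from the original material): the first connection returned can only receive and the second can only send:

from multiprocessing import Pipe

# a one-way pipe: recv_conn can only receive, send_conn can only send
recv_conn, send_conn = Pipe(duplex=False)
send_conn.send('hello')
print(recv_conn.recv())   # -> hello
# calling recv_conn.send(...) or send_conn.recv() would raise an error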
5. Multithreading

Typical use cases: putting long-running tasks in the background, and tasks that spend most of their time waiting, such as sending and receiving data over the network.

There are two ways to create threads. The first is to pass a function into a Thread instance and then call start().

The second is to subclass threading.Thread directly, overriding the __init__ and run methods.

import threading
import os, time, random

def threading_run(urls):
    print(f'The thread name is {threading.current_thread().name} --- {os.getpid()}')
    for url in urls:
        print(f'{threading.current_thread().name} is handling {url}')
        time.sleep(random.random())
    print(f'The thread {threading.current_thread().name} ended')

t1 = threading.Thread(target=threading_run, name='t1', args=(['url_1', 'url_2', 'url_3'],))
t2 = threading.Thread(target=threading_run, name='t2', args=(['url_4', 'url_5', 'url_6'],))
t1.start()
t2.start()
t1.join()
t2.join()

The second way: subclassing threading.Thread

import threading
import time, random

class MyThread(threading.Thread):
    def __init__(self, name, urls):
        threading.Thread.__init__(self, name=name)
        self.urls = urls

    def run(self):
        print(f'The thread name is {threading.current_thread().name}')
        for url in self.urls:
            print(f'The thread {threading.current_thread().name} is handling {url}')
            time.sleep(random.random())
        print(f'The thread {threading.current_thread().name} ended')

t1 = MyThread(name='t1', urls=['url_1', 'url_2', 'url_3'])
t2 = MyThread(name='t2', urls=['url_4', 'url_5', 'url_6'])
t1.start()
t2.start()
t1.join()
t2.join()

References:

1. Python 的 Gevent —— 高性能的 Python 并发框架 (CSDN blog)

2. Python 高级编程之并发与多线程(三), 大数据老司机的博客 (CSDN blog)

3. 后端编程 Python3 —— 多进程与多线程
