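These notes draw on the Python course taught by teacher 大拿. Today I am writing a bit about multiprocessing and multithreading so that we can learn together and improve together. Thanks, teacher 大拿!
# Multithreading vs. multiprocessing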
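1. Process:
A process is one dynamic execution of a program over a data set. A process generally consists of three parts: the program, the data set, and the process control block.
- A running state of a program
- Includes an address space, memory, a data stack, and so on
- Each process has its own completely independent runtime environment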
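2. Thread:
A thread is the smallest unit that the operating system can schedule for execution. It lives inside a process and is the unit that actually does the work in that process. A thread is a single sequential flow of control within a process. One process can run several threads concurrently, each carrying out a different task.
A thread is also called a lightweight process. It is a basic unit of CPU execution and the smallest unit of program execution, made up of a thread ID, a program counter, a register set, and a stack. Introducing threads reduces the overhead of running a program concurrently and improves the concurrency of the operating system. A thread owns essentially no system resources of its own; it uses the resources of its process.
- An independently running part of a process; one process can have multiple threads
- A lightweight process
- The threads of one process share data and the execution context
- Sharing leads to mutual-exclusion problems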
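How processes and threads relate:
(1) A thread belongs to exactly one process, while a process can have multiple threads and always has at least one.
(2) Resources are allocated to the process, and all threads of the same process share all of its resources.
(3) The CPU is allocated to threads; what actually runs on the CPU is a thread.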
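3. Global Interpreter Lock (GIL):
- Execution of Python code is controlled by the Python virtual machine
- Only one thread of control executes in the main loop at any moment
- Under Python multithreading, each thread runs as follows:
  - Acquire the GIL
  - Run code until it sleeps or the Python virtual machine suspends it
  - Release the GIL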
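As a rough illustration of the acquire, run, release cycle above, the following sketch starts two threads that each sleep. The helper names `nap` and `run_two_naps` are made up for this example. Because a sleeping thread gives up the GIL, the two sleeps overlap and the total time should be close to one sleep rather than two; exact timings depend on the machine.

```python
import threading
import time

def nap(seconds):
    # time.sleep() releases the GIL, so other threads can run in the meantime
    time.sleep(seconds)

def run_two_naps(seconds=0.5):
    """Run two sleeping threads and return the elapsed wall-clock time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=nap, args=(seconds,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print("elapsed: {0:.2f}s".format(run_two_naps()))
```

The same experiment with CPU-bound work instead of `time.sleep()` would generally not overlap in this way, since only one thread at a time can hold the GIL.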
4. Python packages:
- threading: the general-purpose package

Sequential execution, which takes longer:

    import time

    def loop1():
        # ctime() returns the current time as a string
        print('start loop 1 at:', time.ctime())
        time.sleep(4)
        print('End loop 1 at:', time.ctime())

    def loop2():
        # ctime() returns the current time as a string
        print('start loop 2 at:', time.ctime())
        time.sleep(2)
        print('End loop 2 at:', time.ctime())

    def main():
        print('starting at:', time.ctime())
        loop1()
        loop2()
        print('all done at:', time.ctime())

    if __name__ == '__main__':
        main()
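Switching to multithreading with _thread to shorten the total time:

    import time
    import _thread as thread

    def loop1():
        # ctime() returns the current time as a string
        print('start loop 1 at:', time.ctime())
        time.sleep(4)
        print('End loop 1 at:', time.ctime())

    def loop2():
        # ctime() returns the current time as a string
        print('start loop 2 at:', time.ctime())
        time.sleep(2)
        print('End loop 2 at:', time.ctime())

    def main():
        print('starting at:', time.ctime())
        # loop1()
        # loop2()
        thread.start_new_thread(loop1, ())
        thread.start_new_thread(loop2, ())
        # Printed immediately: the main thread does not wait for the two loops
        print('all done at:', time.ctime())

    if __name__ == '__main__':
        main()
        # _thread does not wait for child threads, so keep the main thread
        # alive (stop the program with Ctrl+C)
        while True:
            time.sleep(3)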
Multithreading with arguments:

    import time
    import _thread as thread

    def loop1(in1):
        # ctime() returns the current time as a string
        print('start loop1 at:', time.ctime())
        print("argument:", in1)
        time.sleep(4)
        print('End loop1 at:', time.ctime())

    def loop2(in1, in2):
        # ctime() returns the current time as a string
        print('start loop2 at:', time.ctime())
        print("argument 1:", in1, "argument 2:", in2)
        time.sleep(2)
        print('End loop2 at:', time.ctime())

    def main():
        print('all loop at:', time.ctime())
        # loop1(1)
        # loop2(1, 2)
        # A one-element tuple needs the trailing comma
        thread.start_new_thread(loop1, ('with a comma', ))
        thread.start_new_thread(loop2, ('two', 'two arguments'))
        print('end loop at:', time.ctime())

    if __name__ == '__main__':
        main()
        # Keep the main thread alive (stop the program with Ctrl+C)
        while True:
            time.sleep(2)
5. Using threading:
- Create a Thread instance directly with threading.Thread
- t = threading.Thread(target=xxx, args=(xxx,))
- t.start(): start the thread
- t.join(): wait for the thread to finish; the parent thread that calls join() stays blocked until that child thread has finished running

Starting threads with t.start():

    import time
    import threading

    def loop1(in1):
        # ctime() returns the current time as a string
        print('start loop1 at:', time.ctime())
        print("argument:", in1)
        time.sleep(4)
        print('End loop1 at:', time.ctime())

    def loop2(in1, in2):
        # ctime() returns the current time as a string
        print('start loop2 at:', time.ctime())
        print("argument 1:", in1, "argument 2:", in2)
        time.sleep(2)
        print('End loop2 at:', time.ctime())

    def main():
        print('all loop at:', time.ctime())
        # loop1(1)
        # loop2(1, 2)
        t1 = threading.Thread(target=loop1, args=('with a comma', ))
        t1.start()
        t2 = threading.Thread(target=loop2, args=('two', 'two arguments'))
        t2.start()
        print('end loop at:', time.ctime())

    if __name__ == '__main__':
        main()
After adding join, compare the output with the previous example: 'end loop at' is now printed only after both threads have finished.

    import time
    import threading

    def loop1(in1):
        # ctime() returns the current time as a string
        print('start loop1 at:', time.ctime())
        print("argument:", in1)
        time.sleep(4)
        print('End loop1 at:', time.ctime())

    def loop2(in1, in2):
        # ctime() returns the current time as a string
        print('start loop2 at:', time.ctime())
        print("argument 1:", in1, "argument 2:", in2)
        time.sleep(2)
        print('End loop2 at:', time.ctime())

    def main():
        print('all loop at:', time.ctime())
        # loop1(1)
        # loop2(1, 2)
        t1 = threading.Thread(target=loop1, args=('with a comma', ))
        t1.start()
        t2 = threading.Thread(target=loop2, args=('two', 'two arguments'))
        t2.start()
        # Block until both threads have finished
        t1.join()
        t2.join()
        print('end loop at:', time.ctime())

    if __name__ == '__main__':
        # join() already waits for the threads, so no keep-alive loop is needed
        main()
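Daemon threads (daemon)
- If a child thread is marked as a daemon thread, it is terminated automatically when the main thread ends (more precisely, once no non-daemon threads remain)
- Daemon threads are generally treated as unimportant, or as not allowed to keep running independently of the main thread
- Whether the daemon example shows its effect depends on the environment it runs in

Non-daemon thread: every print runs normally:

    import time
    import threading

    def fun():
        print("start fun")
        time.sleep(2)
        print("end fun")

    print("main thread")
    t1 = threading.Thread(target=fun, args=())
    t1.start()
    time.sleep(1)
    print("main thread end")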
Daemon thread: once the main thread ends, "end fun" is not printed:

    import time
    import threading

    def fun():
        print("start fun")
        time.sleep(2)
        print("end fun")

    print("start thread")
    t1 = threading.Thread(target=fun, args=())
    # The daemon flag must be set before start();
    # setDaemon(True) is the older, deprecated spelling
    t1.daemon = True
    t1.start()
    time.sleep(1)
    print("end thread")
Commonly used thread functions and attributes
- threading.current_thread() (older alias: currentThread): returns the current Thread object
- threading.enumerate(): returns a list of the running threads, where "running" means started and not yet finished
- threading.active_count() (older alias: activeCount): returns the number of running threads, the same as len(threading.enumerate())
- Thread.name = "..." (older form: setName()): gives a thread a name
- Thread.name (older form: getName()): reads a thread's name

Example:

    import time
    import threading

    def loop1():
        print('start loop1 at:', time.ctime())
        time.sleep(4)
        print('End loop1 at:', time.ctime())

    def loop2():
        print('start loop2 at:', time.ctime())
        time.sleep(2)
        print('End loop2 at:', time.ctime())

    def loop3():
        print('start loop3 at:', time.ctime())
        time.sleep(6)
        print('End loop3 at:', time.ctime())

    def main():
        print("starting at:", time.ctime())
        t1 = threading.Thread(target=loop1, args=())
        t1.name = "tre1"
        t1.start()
        t2 = threading.Thread(target=loop2, args=())
        t2.name = "tre2"
        t2.start()
        t3 = threading.Thread(target=loop3, args=())
        t3.name = "tre3"
        t3.start()
        # After 3 seconds loop2 has finished; loop1 and loop3 are still running
        time.sleep(3)
        for thr in threading.enumerate():
            print("Name of running thread: {0}".format(thr.name))
        # The count includes the main thread
        print("Number of running threads: {0}".format(threading.active_count()))
        print("all done at: ", time.ctime())

    if __name__ == "__main__":
        # Non-daemon threads keep the program alive until they finish
        main()
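Subclassing threading.Thread directly
- Inherit from Thread
- Override the run() method
- An instance of the class can be started directly

Example:

    import time
    import threading

    # The class must inherit from threading.Thread
    class MyThread(threading.Thread):
        def __init__(self, arg):
            super(MyThread, self).__init__()
            self.arg = arg

        # run() must be overridden; it holds the work the thread actually does
        def run(self):
            time.sleep(2)
            print("The args for this is {0}".format(self.arg))

    for i in range(5):
        t = MyThread(i)
        t.start()
        # Joining inside the loop makes the threads run one after another
        t.join()

    print("main thread is done!!!!")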
A production-style example (my own view: wrap the parameters that need to be passed in a class, then pass them in directly, which tidies up the code):

    import time
    import threading

    class ThreadFunc(object):
        def __init__(self, name):
            self.name = name

        def loop(self, nloop, nsec):
            '''
            :param nloop: name used to label the loop
            :param nsec: number of seconds to sleep
            :return:
            '''
            print("Start loop", nloop, "at", time.ctime())
            time.sleep(nsec)
            print("Done loop", nloop, "at", time.ctime())

    def main():
        print("Starting at:", time.ctime())
        # ThreadFunc("loop").loop is equivalent to these two statements:
        #     t = ThreadFunc("loop")
        #     t.loop
        # so t1 and t2 below are defined in the same way
        t = ThreadFunc("loop")
        t1 = threading.Thread(target=t.loop, args=("loop1", 4))
        # This one-line form is the style more often seen in production code
        t2 = threading.Thread(target=ThreadFunc("loop").loop, args=("loop2", 2))
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        print("all done at:", time.ctime())

    if __name__ == "__main__":
        main()
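6. Shared variables:
- Shared variables: when several threads access the same variable at the same time, shared-variable problems (race conditions) arise

Example:

    import threading

    loopSum = 1000
    sum = 0

    def myAdd():
        global sum, loopSum
        for i in range(1, loopSum):
            # sum += 1 is a read-modify-write, not a single atomic step
            sum += 1

    def myMinu():
        global sum, loopSum
        for i in range(1, loopSum):
            sum -= 1

    if __name__ == '__main__':
        print("starting ....{0}".format(sum))
        t1 = threading.Thread(target=myAdd, args=())
        t2 = threading.Thread(target=myMinu, args=())
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        # Without a lock the result is not guaranteed to be 0; a larger
        # loopSum makes a wrong result more likely
        print("Done...{0}".format(sum))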
7. Lock:
- A flag showing that a thread is currently using some resource
- How to use it
  - Acquire the lock
  - Use the shared resource freely
  - Release the lock
- What to lock: whichever resource has to be shared by multiple threads
- Understanding locks: a lock does not actually lock anything; it is a token

Example:

    import threading

    loopSum = 1000
    sum = 0
    lock = threading.Lock()

    def myAdd():
        global sum, loopSum
        for i in range(1, loopSum):
            # Acquire the lock
            lock.acquire()
            sum += 1
            # Release the lock
            lock.release()

    def myMinu():
        global sum, loopSum
        for i in range(1, loopSum):
            lock.acquire()
            sum -= 1
            lock.release()

    if __name__ == '__main__':
        print("starting ....{0}".format(sum))
        t1 = threading.Thread(target=myAdd, args=())
        t2 = threading.Thread(target=myMinu, args=())
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        print("Done...{0}".format(sum))
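The acquire/release pair can also be written with a `with` statement, which releases the lock even if the guarded code raises. The sketch below is one possible variant, not the course's version; the function name `run_counter` and the variable `total` are made up for this example.

```python
import threading

def run_counter(loops=10000):
    """Add and subtract `loops` times in two threads; return the final total."""
    total = 0
    lock = threading.Lock()

    def add():
        nonlocal total
        for _ in range(loops):
            # "with lock" acquires on entry and releases on exit, even on error
            with lock:
                total += 1

    def subtract():
        nonlocal total
        for _ in range(loops):
            with lock:
                total -= 1

    t1 = threading.Thread(target=add)
    t2 = threading.Thread(target=subtract)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return total

if __name__ == "__main__":
    print("Done...{0}".format(run_counter()))
```

Because every update happens while the lock is held, the additions and subtractions cancel out exactly.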
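8. Thread safety:
- If a resource or variable can be used by multiple threads without a lock and without causing any problems, it is called thread-safe
- Types that are not thread-safe: list, set, dict (in CPython a single operation such as list.append is atomic, but a read-modify-write sequence still needs a lock)
- Thread-safe type: queue.Queue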
9. Producer-consumer problem:
- A model that can be used to build a message queue
- A queue is a data structure for holding items; it is first in, first out, with the elements lined up inside, and can be thought of as a special kind of list

Example:

    import threading
    import time
    import queue

    class Producer(threading.Thread):
        def run(self):
            count = 0
            while True:
                # Produce a batch of 100 whenever fewer than 1000 items are queued
                if q.qsize() < 1000:
                    for i in range(100):
                        count += 1
                        msg = 'produced item ' + str(count)
                        q.put(msg)
                        print(msg)
                time.sleep(0.5)

    class Consumer(threading.Thread):
        def run(self):
            count = 0
            while True:
                # Consume 3 items whenever more than 100 items are queued
                if q.qsize() > 100:
                    for i in range(3):
                        count += 1
                        msg = self.name + ' consumed ' + q.get()
                        print(msg)
                time.sleep(1)

    if __name__ == '__main__':
        # Named q so that it does not shadow the queue module
        q = queue.Queue()
        for i in range(500):
            q.put('initial item ' + str(i))
        for i in range(2):
            p = Producer()
            p.start()
        for i in range(5):
            c = Consumer()
            c.start()
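The example above runs forever. For comparison, here is a minimal sketch of a producer-consumer run that terminates, assuming a fixed list of items and a stop marker. The names `SENTINEL`, `producer`, `consumer`, and `run_pipeline` are invented for this illustration.

```python
import queue
import threading

SENTINEL = None  # hypothetical marker meaning "no more items"

def producer(q, items, n_consumers):
    for item in items:
        q.put(item)
    # One sentinel per consumer so that every consumer gets the stop signal
    for _ in range(n_consumers):
        q.put(SENTINEL)

def consumer(q, results, lock):
    while True:
        item = q.get()
        if item is SENTINEL:
            break
        # The "work" here is just doubling the item
        with lock:
            results.append(item * 2)

def run_pipeline(items, n_consumers=3):
    q = queue.Queue(maxsize=10)  # bounded: put() blocks while the queue is full
    results = []
    lock = threading.Lock()
    workers = [threading.Thread(target=consumer, args=(q, results, lock))
               for _ in range(n_consumers)]
    for w in workers:
        w.start()
    producer(q, items, n_consumers)
    for w in workers:
        w.join()
    return sorted(results)

if __name__ == "__main__":
    print(run_pipeline(range(5)))
```

The bounded queue gives back-pressure for free: the producer simply waits whenever the consumers fall behind.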
- Deadlock

The following program deadlocks: each thread holds one lock and waits forever for the other.

    import threading
    import time

    lock_1 = threading.Lock()
    lock_2 = threading.Lock()

    def func_1():
        print("func_1 starting.........")
        lock_1.acquire()
        print("func_1 acquired lock_1....")
        time.sleep(2)
        print("func_1 waiting for lock_2.......")
        lock_2.acquire()
        print("func_1 acquired lock_2.......")
        lock_2.release()
        print("func_1 released lock_2")
        lock_1.release()
        print("func_1 released lock_1")
        print("func_1 done..........")

    def func_2():
        print("func_2 starting.........")
        lock_2.acquire()
        print("func_2 acquired lock_2....")
        time.sleep(4)
        print("func_2 waiting for lock_1.......")
        lock_1.acquire()
        print("func_2 acquired lock_1.......")
        lock_1.release()
        print("func_2 released lock_1")
        lock_2.release()
        print("func_2 released lock_2")
        print("func_2 done..........")

    if __name__ == "__main__":
        print("main program starting..............")
        t1 = threading.Thread(target=func_1, args=())
        t2 = threading.Thread(target=func_2, args=())
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        # Never reached, because the two threads are deadlocked
        print("main program finished..............")
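One common way to avoid this kind of deadlock is to make every thread take the locks in the same order. The sketch below assumes that approach; the helper names `worker` and `run_ordered` are invented, and the sleep times are shortened so the demo finishes quickly.

```python
import threading
import time

lock_1 = threading.Lock()
lock_2 = threading.Lock()

def worker(name, hold_seconds, log):
    # Both workers take lock_1 first and lock_2 second, so neither can end up
    # holding one lock while waiting for a lock the other already holds
    with lock_1:
        time.sleep(hold_seconds)
        with lock_2:
            log.append(name)

def run_ordered():
    log = []
    t1 = threading.Thread(target=worker, args=("func_1", 0.1, log))
    t2 = threading.Thread(target=worker, args=("func_2", 0.2, log))
    t1.start()
    t2.start()
    # A timeout on join() keeps the demo from hanging if something goes wrong
    t1.join(timeout=5)
    t2.join(timeout=5)
    return sorted(log)

if __name__ == "__main__":
    print(run_ordered())
```

The price is less concurrency: the second worker waits for lock_1 until the first worker has finished with both locks.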
- Lock wait timeouts

With a timeout, func_1 gives up on lock_2 instead of waiting forever, so the program can finish:

    import time
    import threading

    lock_1 = threading.Lock()
    lock_2 = threading.Lock()

    def func_1():
        print("func_1 starting...")
        lock_1.acquire(timeout=4)
        print("func_1 acquired lock_1....")
        time.sleep(2)
        print("func_1 waiting for lock_2")
        # acquire() returns False if the lock is not obtained within 2 seconds
        rst = lock_2.acquire(timeout=2)
        if rst:
            print("func_1 got lock_2")
            lock_2.release()
            print("func_1 released lock_2")
        else:
            print("func_1 failed to get lock_2")
        lock_1.release()
        print("func_1 released lock_1")
        print("func_1 done....")

    def func_2():
        print("func_2 starting.........")
        lock_2.acquire()
        print("func_2 acquired lock_2....")
        time.sleep(4)
        print("func_2 waiting for lock_1.......")
        lock_1.acquire()
        print("func_2 acquired lock_1.......")
        lock_1.release()
        print("func_2 released lock_1")
        lock_2.release()
        print("func_2 released lock_2")
        print("func_2 done..........")

    if __name__ == '__main__':
        print('main thread starting')
        t1 = threading.Thread(target=func_1, args=())
        t2 = threading.Thread(target=func_2, args=())
        t1.start()
        t2.start()
        t1.join()
        t2.join()
        print("all thread end")
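- Semaphore
  - Allows a resource to be used by at most a given number of threads at the same time

Example:

    import threading
    import time

    # The argument sets how many threads may use the resource at once
    semaphore = threading.Semaphore(3)

    def func():
        # acquire() blocks until a slot is free. Its first positional argument
        # is `blocking`, so a timeout has to be passed by keyword,
        # e.g. acquire(timeout=5)
        if semaphore.acquire():
            print(threading.current_thread().name + ' get semaphore ' + time.ctime())
            time.sleep(15)
            semaphore.release()
            print(threading.current_thread().name + ' release semaphore ' + time.ctime())

    for i in range(6):
        t1 = threading.Thread(target=func)
        t1.start()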
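To check that a semaphore really caps concurrency, the sketch below records the largest number of threads that were inside the guarded section at once. The function name `run_limited` and the bookkeeping dictionary `state` are assumptions made for this illustration, and the sleep is shortened to keep the run brief.

```python
import threading
import time

def run_limited(n_threads=6, limit=3):
    """Return the highest number of threads seen inside the guarded section."""
    semaphore = threading.Semaphore(limit)
    state_lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def func():
        with semaphore:  # acquire on entry, release on exit
            with state_lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.05)  # stand-in for using the limited resource
            with state_lock:
                state["active"] -= 1

    threads = [threading.Thread(target=func) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]

if __name__ == "__main__":
    print("peak concurrency:", run_limited())
```

The peak can never exceed `limit`, however many threads are started.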
10. threading.Timer:
- Timer uses a thread to run a function after a specified delay

Example:

    import threading
    import time

    def func():
        print("I am running...")
        time.sleep(4)
        print("I am done...")

    if __name__ == '__main__':
        # Run func in a separate thread, 2 seconds after start()
        t = threading.Timer(2, func)
        t.start()
        for i in range(5):
            print("{0}****".format(i))
            time.sleep(3)
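A Timer that has not fired yet can be stopped with `cancel()`. The following sketch is an illustration with an invented helper name, `run_timers`; it starts one timer that fires and one that is cancelled before its delay elapses.

```python
import threading

def run_timers():
    fired = []
    # This timer calls fired.append("ran") about 0.1 seconds after start()
    t_run = threading.Timer(0.1, fired.append, args=("ran",))
    # This one is cancelled before its 5-second delay elapses, so it never fires
    t_cancel = threading.Timer(5, fired.append, args=("cancelled",))
    t_run.start()
    t_cancel.start()
    t_cancel.cancel()
    # Timer is a Thread subclass, so join() waits for it to finish
    t_run.join()
    t_cancel.join()
    return fired

if __name__ == "__main__":
    print(run_timers())
```

Note that `cancel()` only helps while the timer is still waiting; once the function has started running, it is not interrupted.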
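11. Reentrant lock:
- A lock that the same thread can acquire more than once
- Mainly for cases where a recursive call needs to acquire the lock again

Example:

    import threading
    import time

    class MyThread(threading.Thread):
        def run(self):
            global num
            time.sleep(1)
            # The timeout is passed by keyword; the first positional
            # argument of acquire() is `blocking`
            if mutex.acquire(timeout=3):
                num = num + 1
                msg = self.name + ' set num to ' + str(num)
                print(msg)
                # The same thread acquires the RLock again without blocking
                mutex.acquire()
                # Release once for each acquire
                mutex.release()
                mutex.release()

    num = 0
    mutex = threading.RLock()

    def func_1():
        for i in range(5):
            t = MyThread()
            t.start()

    if __name__ == '__main__':
        func_1()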
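The recursive case mentioned above can be sketched as follows. The `Counter` class and the helper `run_recursive` are hypothetical names for this illustration; with a plain `threading.Lock` in place of the `RLock`, the first recursive call would block forever.

```python
import threading

class Counter:
    def __init__(self):
        self._lock = threading.RLock()
        self.value = 0

    def add(self, n):
        # Each recursive call re-acquires the RLock already held by this thread
        with self._lock:
            if n == 0:
                return
            self.value += 1
            self.add(n - 1)

def run_recursive(n_threads=4, depth=50):
    counter = Counter()
    threads = [threading.Thread(target=counter.add, args=(depth,))
               for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value

if __name__ == "__main__":
    print(run_recursive())
```

Each thread adds `depth` to the counter, so the result is `n_threads * depth`.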
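# Alternatives to threads
1. subprocess
   - Skips threads entirely and uses processes
   - The main alternative for spawning child processes
2. multiprocessing
   - Spawns work through a threading-like interface, but uses child processes
   - Can spawn processes across multiple cores or CPUs; its interface is very similar to threading
3. concurrent.futures
   - A newer module for asynchronous execution
   - Operates at the task level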
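As a small illustration of the task-level style of concurrent.futures, the sketch below hands a list of inputs to a thread pool and collects the results. The names `square` and `run_pool` are made up for this example. `ProcessPoolExecutor` offers the same interface backed by child processes, which is usually the better fit for CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

def run_pool(values, max_workers=4):
    # map() schedules one task per value and yields results in input order
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(square, values))

if __name__ == "__main__":
    print(run_pool(range(5)))
```

Leaving the `with` block waits for all submitted tasks, so no explicit start() or join() calls are needed.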
# Multiprocessing
- Inter-process communication (IPC)
- Processes share no state with one another
- Creating processes

Creating a Process instance directly:

    import time
    import multiprocessing

    def clock(interval):
        while True:
            print("The time is %s" % time.ctime())
            time.sleep(interval)

    if __name__ == '__main__':
        p = multiprocessing.Process(target=clock, args=(5, ))
        print("sleep...")
        p.start()
Deriving a subclass:

    import multiprocessing
    from time import sleep, ctime

    class ClockProcess(multiprocessing.Process):
        '''
        Compare the two methods:
            __init__
            run
        '''
        def __init__(self, interval):
            super().__init__()
            self.interval = interval

        def run(self):
            while True:
                print("The time is %s" % ctime())
                sleep(self.interval)

    if __name__ == '__main__':
        p = ClockProcess(3)
        p.start()
        while True:
            print('sleep...')
            sleep(1)
Using os to look at the pid and ppid and how they relate:

    import multiprocessing
    import os

    def info(title):
        print(title)
        print("module name:", __name__)
        # ID of the parent process
        print("parent process:", os.getppid())
        # ID of the current process
        print("process id:", os.getpid())

    def f(name):
        info('function f')
        print('hello', name)

    if __name__ == '__main__':
        info('main line')
        p = multiprocessing.Process(target=f, args=('bob', ))
        p.start()
        p.join()
Producer-consumer model with JoinableQueue:

    import multiprocessing
    from time import ctime

    def consumer(input_q):
        print("Into consumer:", ctime())
        while True:
            item = input_q.get()
            print("pull", item, "out of q")
            # Tell the queue that this item has been fully processed
            input_q.task_done()
        # Never reached: the loop above has no exit
        print("Out of consumer:", ctime())

    def producer(sequence, output_q):
        print("Into producer:", ctime())
        for item in sequence:
            output_q.put(item)
            print("put", item, "into q")
        print("Out of producer:", ctime())

    if __name__ == '__main__':
        q = multiprocessing.JoinableQueue()
        cons_p = multiprocessing.Process(target=consumer, args=(q, ))
        cons_p.daemon = True
        cons_p.start()
        sequence = [1, 2, 3, 4]
        producer(sequence, q)
        # Wait until every item has been marked done. The daemon consumer
        # never returns, so it is not joined; it is terminated when the
        # main process exits
        q.join()
Using a sentinel in the queue:

    import multiprocessing
    from time import ctime

    def consumer(input_q):
        print("Into consumer:", ctime())
        while True:
            item = input_q.get()
            if item is None:
                # Mark the sentinel as done too, otherwise q.join() never returns
                input_q.task_done()
                break
            print("pull", item, "out of q")
            input_q.task_done()
        print("Out of consumer:", ctime())

    def producer(sequence, output_q):
        print("Into producer:", ctime())
        for item in sequence:
            output_q.put(item)
            print("put", item, "into q")
        print("Out of producer:", ctime())

    if __name__ == '__main__':
        q = multiprocessing.JoinableQueue()
        cons_p = multiprocessing.Process(target=consumer, args=(q, ))
        # cons_p.daemon = True
        cons_p.start()
        sequence = [1, 2, 3, 4]
        producer(sequence, q)
        # None is the sentinel that tells the consumer to stop
        q.put(None)
        q.join()
        cons_p.join()
Improving the sentinel (one sentinel per consumer):

    import multiprocessing
    from time import ctime

    def consumer(input_q):
        print("Into consumer:", ctime())
        while True:
            item = input_q.get()
            if item is None:
                # Mark the sentinel as done too, otherwise q.join() never returns
                input_q.task_done()
                break
            print("pull", item, "out of q")
            input_q.task_done()
        print("Out of consumer:", ctime())

    def producer(sequence, output_q):
        print("Into producer:", ctime())
        for item in sequence:
            output_q.put(item)
            print("put", item, "into q")
        print("Out of producer:", ctime())

    if __name__ == '__main__':
        q = multiprocessing.JoinableQueue()
        # Two sentinels are queued below, so two consumers are started;
        # each consumer stops after taking one sentinel
        consumers = [multiprocessing.Process(target=consumer, args=(q, ))
                     for _ in range(2)]
        for cons_p in consumers:
            cons_p.daemon = True
            cons_p.start()
        sequence = [1, 2, 3, 4]
        producer(sequence, q)
        q.put(None)
        q.put(None)
        q.join()
        for cons_p in consumers:
            cons_p.join()