IO does not occupy the CPU; computation does.
Python's multithreading is not suited to CPU-bound tasks; it is suited to IO-bound tasks (e.g. network sockets).
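A rough sketch of why that is (my own example, not from the original notes; timings are approximate): an IO-bound task releases the GIL while it waits, so the threads overlap, while a CPU-bound task holds the GIL and gains nothing from extra threads.
import threading, time
def io_task():
    time.sleep(1)                 # simulated IO: the GIL is released while sleeping
def cpu_task():
    sum(range(10_000_000))        # pure computation: the GIL is held the whole time
for func in (io_task, cpu_task):
    start = time.time()
    threads = [threading.Thread(target=func) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(func.__name__, "with 4 threads:", round(time.time() - start, 2), "s")
# io_task finishes in roughly 1 second because the sleeps overlap;
# cpu_task takes roughly as long as running it 4 times in a row.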
- What is a thread?
Threads share memory, so when a thread modifies shared data it must take a lock (a mutex).
A thread is the smallest unit of execution that the operating system can schedule. It lives inside a process and is the actual unit of work within that process.
A thread is a single sequential flow of control within a process; one process can run multiple threads concurrently, each thread carrying out a different task.
- What is a process?
A process is a collection of resources. A process does not execute anything by itself; it executes through its threads, and it always has at least one thread.
A program cannot run on its own. Only when it is loaded into memory and the system allocates resources to it can it run, and a program executing in this way is called a process.
1. What is the difference between a process and a thread?
- 1. Threads share the address space of the process that created them; processes have their own address space. In other words, threads share memory, while each process's memory is independent (see the sketch after this list).
- 2. Threads have direct access to the data segment of their process; processes get their own copy of the parent process's data segment.
- 3. Threads can communicate directly with the other threads of their process; processes must use inter-process communication to talk to sibling processes.
- 4. New threads are easily created; creating a new process requires duplicating the parent process.
- 5. Threads can exercise considerable control over other threads of the same process; processes can only exercise control over their child processes.
- 6. Changes to the main thread (cancellation, priority change, etc.) may affect the behavior of the other threads of the process; changes to the parent process do not affect child processes.
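A minimal sketch (an assumed example, not part of the original notes) of point 1: a thread mutates the parent's list in place because it shares the address space, while a child process only modifies its own copy, so the parent's list is unchanged afterwards.
import threading, multiprocessing
data = []
def worker():
    data.append("changed")
if __name__ == '__main__':
    t = threading.Thread(target=worker)
    t.start(); t.join()
    print("after the thread:", data)     # ['changed']: the thread shares the parent's memory
    p = multiprocessing.Process(target=worker)
    p.start(); p.join()
    print("after the process:", data)    # still just ['changed']: the child modified only its own copy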
I. Multithreading
2. The simplest multithreaded examples
- 1. Calling a function directly as the thread target
import threading,time
def run(n):
    print("task",n)
    time.sleep(2)
t1 = threading.Thread(target=run,args=("t1",))
t2 = threading.Thread(target=run,args=("t2",))
t1.start()
t2.start()
--->
task t1    #both lines appear at essentially the same time
task t2    #both lines appear at essentially the same time
- 2. Creating threads by subclassing threading.Thread
#-*- coding:utf-8 -*-
# Author: li Shang
import threading
class MyThread(threading.Thread):
    def __init__(self,n):
        super(MyThread,self).__init__()
        self.n = n
    def run(self):
        print("task",self.n)
t1 = MyThread("t1")
t2 = MyThread("t2")
t1.start()
t2.start()
--->
task t1
task t2
- 3. Starting threads directly and measuring how long it takes for all of them to finish
#threading.current_thread()    #information about the currently executing thread
#threading.active_count()      #number of currently active threads
#-*- coding:utf-8 -*-
# Author: li Shang
import threading,time
def run(n):
    print("task", n)
    time.sleep(2)
    print("%s has finished" % n)
t_objs = []
date = time.time()
for i in range(50):
    t = threading.Thread(target=run, args=("t-%s" % i,))
    t.start()
    t_objs.append(t)
for t in t_objs:    # loop over the thread objects and wait for all of them to finish
    t.join()
print("the main thread can continue")
print("cost:", time.time() - date)
--->
task t-0
task t-1
...
task t-49
t-0 has finished
t-1 has finished
...
t-49 has finished
the main thread can continue
cost: 2.009114980697632
- 4. Making the worker threads daemon threads: when the main thread finishes, they are killed immediately instead of being waited for
#-*- coding:utf-8 -*-
# Author: li Shang
import threading,time
def run(n):
    print("task", n)
    time.sleep(2)
    print("%s has finished" % n, threading.current_thread(), threading.active_count())
t_objs = []
date = time.time()
for i in range(50):
    t = threading.Thread(target=run, args=("t-%s" % i,))
    t.daemon = True    # make this thread a daemon thread (same effect as the deprecated setDaemon(True))
    t.start()
    t_objs.append(t)
print("the main thread can continue", threading.current_thread())
print("cost:", time.time() - date)
- 5. Using a mutex (threading.Lock, a user-level lock, not to be confused with the GIL) so that while one thread is modifying the data, the other threads cannot modify it
import threading,time
def run(n):
    lock.acquire()    # acquire the lock; while it is held, no other thread can modify num
    global num
    num += 1
    time.sleep(0.5)
    print(num)
    lock.release()    # release the lock so that other threads can modify num again
lock = threading.Lock()
num = 0
t_objs = []
date = time.time()
for i in range(50):
    t = threading.Thread(target=run, args=("t-%s" % i,))
    t.start()
    t_objs.append(t)
for t in t_objs:    # loop over the thread objects and wait for all of them to finish
    t.join()
print("the main thread can continue", threading.current_thread())
print("num:", num)
- 6. Recursive locks (RLock)
import threading, time
def run1():
    print("grab the first part data")
    lock.acquire()
    global num
    num += 1
    lock.release()
    return num
def run2():
    print("grab the second part data")
    lock.acquire()
    global num2
    num2 += 1
    lock.release()
    return num2
def run3():
    lock.acquire()
    res = run1()
    print('--------between run1 and run2-----')
    res2 = run2()
    lock.release()
    print(res, res2)
if __name__ == '__main__':
    num, num2 = 0, 0
    lock = threading.RLock()    # reentrant lock: the same thread can acquire it again (inside run1/run2) without deadlocking
    for i in range(10):
        t = threading.Thread(target=run3)
        t.start()
    while threading.active_count() != 1:
        print(threading.active_count())
    else:
        print('----all threads done---')
        print(num, num2)
- 7. Semaphores (limit how many threads run a section at the same time)
import threading, time
def run(n):
    semaphore.acquire()
    time.sleep(1)
    print("run the thread: %s\n" % n)
    semaphore.release()
if __name__ == '__main__':
    semaphore = threading.BoundedSemaphore(5)    # allow at most 5 threads to run this section at the same time
    for i in range(20):
        t = threading.Thread(target=run, args=(i,))
        t.start()
    while threading.active_count() != 1:
        pass    # print(threading.active_count())
    else:
        print('----all threads done---')
--->
###the threads finish five at a time
- 8. Events
An event is a simple synchronization object;
the event represents an internal flag, and threads can wait for the flag to be set, or set or clear the flag themselves.
event = threading.Event()
#a client thread can wait for the flag to be set
event.wait()     # waits until the flag is set
#a server thread can set or reset it
event.set()      # set the flag
event.clear()    # clear the flag
If the flag is set, the wait method doesn't do anything: wait() returns immediately without blocking (think of a green light).
If the flag is cleared, wait will block until it becomes set again (a red light: wait() waits for the light to turn green).
Any number of threads may wait for the same event.
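A minimal sketch of this behaviour (an assumed example, not from the original notes): three threads block in wait() while the flag is cleared, and all of them are released at once when the main thread calls set().
import threading, time
event = threading.Event()        # the internal flag starts out cleared
def waiter(n):
    print("worker %s is waiting for the flag" % n)
    event.wait()                 # blocks because the flag is cleared
    print("worker %s released" % n)
for i in range(3):
    threading.Thread(target=waiter, args=(i,)).start()
time.sleep(1)
event.set()                      # set the flag: every waiting thread wakes up at once
The fuller traffic-light example from the notes: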
import threading,time
import random
event = threading.Event()
def lighter():
    count = 0
    event.set()    # start with a green light
    while True:
        if count > 5 and count <= 10:    # switch to red
            event.clear()    # clear the flag
            print("\033[41;1mred light is on...\033[0m %s" % count)
        elif count > 10:
            event.set()    # turn green again
            count = 0
        else:
            print("\033[42;1mgreen light is on...\033[0m %s" % count)
        time.sleep(1)
        count += 1
def car(name):
    print("car started")
    while True:
        if event.is_set():    # the flag being set means green light
            print("[%s] is running..." % name)
            time.sleep(1)
        else:
            print("[%s] sees a red light, waiting..." % name)
            event.wait()
            print("\033[34;1mgreen light is on, go...\033[0m")
light = threading.Thread(target=lighter,)
light.start()
car1 = threading.Thread(target=car,args=("Tesla",))
car1.start()
- 9. Queues (the queue module)
queue is especially useful in threaded programming when information must be exchanged safely between multiple threads.
class queue.Queue(maxsize=0)            #first in, first out (FIFO)
class queue.LifoQueue(maxsize=0)        #last in, first out (LIFO)
class queue.PriorityQueue(maxsize=0)    #a queue whose items can be given a priority
Constructor for a priority queue. maxsize is an integer that sets the upperbound limit on the number of items that can be placed in the queue. Insertion will block once this size has been reached, until queue items are consumed. If maxsize is less than or equal to zero, the queue size is infinite.
The lowest valued entries are retrieved first (the lowest valued entry is the one returned by sorted(list(entries))[0]). A typical pattern for entries is a tuple in the form: (priority_number, data).
exception queue.Empty
Exception raised when non-blocking get() (or get_nowait()) is called on a Queue object which is empty.
exception queue.Full
Exception raised when non-blocking put() (or put_nowait()) is called on a Queue object which is full.
Queue.qsize()
Queue.empty() #return True if empty
Queue.full() # return True if full
Queue.put(item, block=True, timeout=None)
Put item into the queue. If optional args block is true and timeout is None (the default), block if necessary until a free slot is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Full exception if no free slot was available within that time. Otherwise (block is false), put an item on the queue if a free slot is immediately available, else raise the Full exception (timeout is ignored in that case).
Queue.put_nowait(item)
Equivalent to put(item, False).
Queue.get(block=True, timeout=None)
Remove and return an item from the queue. If optional args block is true and timeout is None (the default), block if necessary until an item is available. If timeout is a positive number, it blocks at most timeout seconds and raises the Empty exception if no item was available within that time. Otherwise (block is false), return an item if one is immediately available, else raise the Empty exception (timeout is ignored in that case).
Queue.get_nowait()
Equivalent to get(False).
Two methods are offered to support tracking whether enqueued tasks have been fully processed by daemon consumer threads.
Queue.task_done()
Indicate that a formerly enqueued task is complete. Used by queue consumer threads. For each get() used to fetch a task, a subsequent call to task_done() tells the queue that the processing on the task is complete.
If a join() is currently blocking, it will resume when all items have been processed (meaning that a task_done() call was received for every item that had been put() into the queue).
Raises a ValueError if called more times than there were items placed in the queue.
Queue.join()    #blocks until the queue has been fully consumed, i.e. task_done() has been called for every item that was put()
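A short sketch (an assumed example) of the task_done()/join() pattern described above: a daemon worker thread drains the queue, and q.join() in the main thread blocks until task_done() has been called for every item that was put().
import threading, queue
q = queue.Queue()
def worker():
    while True:
        item = q.get()           # blocks until an item is available
        print("processing", item)
        q.task_done()            # tell the queue this item has been fully processed
threading.Thread(target=worker, daemon=True).start()
for i in range(5):
    q.put(i)
q.join()                         # returns once task_done() has been called for every put()
print("all items processed")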
###priority queue
import queue
q = queue.PriorityQueue()
q.put((10,"lishang"))
q.put((3,"xiaomeng"))
q.put((6,"kaixin"))
q.put((-1,"keke"))
print(q.get())
print(q.get())
print(q.get())
print(q.get())
--->
(-1, 'keke')
(3, 'xiaomeng')
(6, 'kaixin')
(10, 'lishang')
#last-in, first-out queue
import queue
q = queue.LifoQueue()
q.put(1)
q.put(2)
q.put(3)
print(q.get())
print(q.get())
print(q.get())
--->
3
2
1
###first-in, first-out queue
import queue
q = queue.Queue()
q.put(1)
q.put(2)
q.put(3)
print(q.get())
print(q.get())
print(q.get())
--->
1
2
3
- 10. The producer-consumer model
#-*- coding:utf-8 -*-
# Author: li Shang
import threading,time
import queue
q = queue.Queue(maxsize=10)    # the queue holds at most 10 items
def Product(name):
    print("%s starts making bones" % name)
    count = 1
    while True:
        q.put("bone %s" % count)
        print("bones produced so far:", count)
        count += 1
        time.sleep(0.5)
def Consumer(name):
    while True:
        print("%s got [%s] and ate it..." % (name, q.get()))
        time.sleep(1)
p = threading.Thread(target=Product, args=("LiShang",))
c = threading.Thread(target=Consumer, args=("Zhao Xiaomeng",))
d = threading.Thread(target=Consumer, args=("Sun Kaixin",))
p.start()
c.start()
d.start()
II. Multiprocessing
- 1. A simple multiprocessing example
#-*- coding:utf-8 -*-
# Author: li Shang
import multiprocessing,threading
import time
def fun():
    print(threading.get_ident())
def run(name):
    time.sleep(1)
    print('hello', name)
    t = threading.Thread(target=fun)
    t.start()
if __name__ == '__main__':
    for i in range(10):
        p = multiprocessing.Process(target=run, args=('lishang %s' % i,))
        p.start()
        # p.join()
- 2. Every child process is started by a parent process
#-*- coding:utf-8 -*-
# Author: li Shang
from multiprocessing import Process
import os,time
def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())
    print("\n")
def f(name):
    info('\033[31;1mfunction f\033[0m')
    print('hello', name)
if __name__ == '__main__':
    info('\033[32;1mmain process line\033[0m')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
--->
main process line
module name: __main__
parent process: 8292
process id: 9848
function f
module name: __mp_main__
parent process: 9848    #the child process's parent is the main process of the script above
process id: 7244
hello bob
III. Inter-process communication (passing data)
- Memory is not shared between processes. To exchange data between two processes, the following mechanisms can be used:
1. Queues
###this lets different processes exchange data through a queue (each get() returns a copy of what was put)
#-*- coding:utf-8 -*-
# Author: li Shang
from multiprocessing import Process, Queue
def f(qq):
    qq.put([42, None, 'hello'])    # the child process puts data onto the queue created in the parent
if __name__ == '__main__':
    q = Queue()
    p = Process(target=f, args=(q,))
    p.start()
    print(q.get())    # the parent process can then read that data
    p.join()
--->
[42, None, 'hello']
2. Pipes, which work like two-way socket communication between the two processes
#-*- coding:utf-8 -*-
# Author: li Shang
from multiprocessing import Process, Pipe
def f(conn):
    conn.send([42, None, 'hello from child'])     # the child process sends data to the parent
    conn.send([42, None, 'hello from child2'])
    print(conn.recv(), "I'm fine, thank you, and you?")    # the child receives data sent by the parent
    conn.close()
if __name__ == '__main__':
    parent_conn, child_conn = Pipe()
    p = Process(target=f, args=(child_conn,))
    p.start()
    print(parent_conn.recv())    # the parent receives data sent by the child
    print(parent_conn.recv())
    parent_conn.send("How are you, son?")    # the parent sends data to the child
    p.join()
--->
[42, None, 'hello from child']
[42, None, 'hello from child2']
How are you, son? I'm fine, thank you, and you?
IV. Inter-process communication (sharing data: multiple processes can modify the same data)
- 1. Sharing data between processes with a Manager
#-*- coding:utf-8 -*-
# Author: li Shang
from multiprocessing import Process, Manager
import os,time
def f(d, l):
    d[os.getpid()] = os.getpid()
    l.append(os.getpid())
    print(l)
if __name__ == '__main__':
    with Manager() as manager:
        d = manager.dict()               # a dict that can be shared and modified across processes
        l = manager.list(range(5))       # a list that can be shared and modified across processes
        p_list = []
        for i in range(10):
            p = Process(target=f, args=(d, l))
            p.start()
            p_list.append(p)
        for res in p_list:
            res.join()
        print(d)
        print(l)
--->
[0, 1, 2, 3, 4, 9928]
[0, 1, 2, 3, 4, 9928, 11116]
[0, 1, 2, 3, 4, 9928, 11116, 2492]
[0, 1, 2, 3, 4, 9928, 11116, 2492, 9360]
[0, 1, 2, 3, 4, 9928, 11116, 2492, 9360, 8624]
[0, 1, 2, 3, 4, 9928, 11116, 2492, 9360, 8624, 348]
[0, 1, 2, 3, 4, 9928, 11116, 2492, 9360, 8624, 348, 9572]
[0, 1, 2, 3, 4, 9928, 11116, 2492, 9360, 8624, 348, 9572, 10428]
[0, 1, 2, 3, 4, 9928, 11116, 2492, 9360, 8624, 348, 9572, 10428, 6340]
[0, 1, 2, 3, 4, 9928, 11116, 2492, 9360, 8624, 348, 9572, 10428, 6340, 7296]
{9928: 9928, 11116: 11116, 2492: 2492, 9360: 9360, 8624: 8624, 348: 348, 9572: 9572, 10428: 10428, 6340: 6340, 7296: 7296}
[0, 1, 2, 3, 4, 9928, 11116, 2492, 9360, 8624, 348, 9572, 10428, 6340, 7296]
- 2. Process locks
#-*- coding:utf-8 -*-
# Author: li Shang
from multiprocessing import Process, Lock
def f(l, i):
    l.acquire()
    print('hello world', i)
    l.release()
if __name__ == '__main__':
    lock = Lock()
    for num in range(10):
        Process(target=f, args=(lock, num)).start()
--->
hello world 0
hello world 1
hello world 2
hello world 4
hello world 7
hello world 5
hello world 6
hello world 8
hello world 9
hello world 3
The process lock exists so that each process can print its output without being interrupted, preventing interleaved output such as
"hello wohello world 5".
- 3. Process pools
A process pool maintains a set of processes internally. When the pool is used, a process is taken from it; if no process in the pool is available, the program waits until one becomes free.
A process pool has two main methods:
- apply (serial)
- apply_async (parallel)
#-*- coding:utf-8 -*-
# Author: li Shang
from multiprocessing import Process, Pool, freeze_support
import time,os
def Foo(i):
    time.sleep(1)
    print("the process id is :", os.getpid())
    return i + 100
def Bar(arg):
    print('-->exec done:', arg)
if __name__ == '__main__':    # this block only runs when the script is executed directly
    pool = Pool(processes=5)    # allow at most 5 processes in the pool at the same time
    for i in range(10):
        pool.apply(func=Foo, args=(i,))    # apply runs the tasks serially, one after another
    print('end')
    pool.close()
--->###the output appears one line at a time
the process id is : 10092
...
the process id is : 7996
end
- apply_async (parallel)
#-*- coding:utf-8 -*-
# Author: li Shang
from multiprocessing import Process, Pool, freeze_support
import time,os
def Foo(i):
    time.sleep(1)
    print("the process id is :", os.getpid())
    return i + 100
def Bar(arg):
    print('-->exec done:', arg)
if __name__ == '__main__':    # this block only runs when the script is executed directly
    pool = Pool(processes=5)    # allow at most 5 processes in the pool at the same time
    for i in range(10):
        pool.apply_async(func=Foo, args=(i,), callback=Bar)    # callback: Bar is called in the main process with Foo's return value
    print('end')
    pool.close()
    pool.join()    # wait for the pooled processes to finish before exiting; if this is commented out, the program exits right away
--->##the output appears five lines at a time (5 processes run concurrently)