Based on the following reposted articles:
https://blog.youkuaiyun.com/zzhongcy/article/details/19810553
https://www.iteye.com/blog/iyuan-974040
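Diagram of the PUSH/PULL pattern:
Model description:
- Upstream (task publisher)
- Workers (middle tier, do the actual work)
- Downstream (signal collection or collection of work results)
- The task distributor (ventilator) generates a large number of tasks that can be computed in parallel;
- A group of workers processes these tasks;
- The result collector (sink) sits at the end of the pipeline, receives the results from all workers, and aggregates them. In practice, the workers may be spread across different machines.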
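The three roles above can be sketched in a single process before any sockets are involved. The following is a minimal model that uses only the standard library (`threading` and `queue`), not ZeroMQ: the two queues stand in for the PUSH/PULL links, and the name `run_pipeline`, the `STOP` sentinel, and the "double the input" work step are all assumptions made for illustration.

```python
import queue
import threading


def run_pipeline(workloads, n_workers):
    """Model the ventilator -> workers -> sink stages with threads and queues."""
    tasks = queue.Queue()    # stands in for the ventilator -> worker link
    results = queue.Queue()  # stands in for the worker -> sink link
    STOP = None              # sentinel telling a worker to exit

    def worker(worker_id):
        while True:
            task = tasks.get()
            if task is STOP:
                break
            # "Process" the task (here: double it) and forward the result.
            results.put((worker_id, task * 2))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_workers)]
    for t in threads:
        t.start()

    # Ventilator: publish every task, then one stop marker per worker.
    for w in workloads:
        tasks.put(w)
    for _ in threads:
        tasks.put(STOP)
    for t in threads:
        t.join()

    # Sink: collect every result once all workers have finished.
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected


if __name__ == "__main__":
    out = run_pipeline(list(range(10)), n_workers=3)
    print(len(out), sum(value for _, value in out))
```

Each task is taken by exactly one worker, and the sink sees one result per task regardless of which worker produced it, which is the same shape as the socket-based experiment.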
Experiment code:
Upstream code (Ventilator.py):
# Ventilator.py (Python 3)
import random
import time

import zmq

context = zmq.Context()

# Socket to send messages on
sender = context.socket(zmq.PUSH)
sender.bind("tcp://*:5557")

input("Press Enter when the workers are ready: ")
print("Sending tasks to workers...")

# The first message is "0" and signals start of batch.
# It goes out on the same PUSH socket, so it reaches the sink
# through a worker as a task with 0 msec of work.
sender.send(b"0")

# Initialize random number generator
random.seed()

# Send 100 tasks
total_msec = 0
for task_nbr in range(100):
    # Random workload from 1 to 100 msecs
    workload = random.randint(1, 100)
    total_msec += workload
    sender.send_string(str(workload))

print("Total expected cost: %s msec" % total_msec)

# Give 0MQ time to deliver before the process exits
time.sleep(1)
Worker code (worker.py):
# worker.py (Python 3)
import sys
import time

import zmq

context = zmq.Context()

# Socket to receive messages on
receiver = context.socket(zmq.PULL)
receiver.connect("tcp://localhost:5557")

# Socket to send messages to
sender = context.socket(zmq.PUSH)
sender.connect("tcp://localhost:5558")

# Process tasks forever
while True:
    s = receiver.recv()

    # Simple progress indicator for the viewer
    sys.stdout.write('.')
    sys.stdout.flush()

    # Do the work: the payload is the workload in msec
    time.sleep(int(s) * 0.001)

    # Send results to sink (an empty message is enough as confirmation)
    sender.send(b'')
Downstream code (sink.py):
# sink.py (Python 3)
import sys
import time

import zmq

context = zmq.Context()

# Socket to receive messages on
receiver = context.socket(zmq.PULL)
receiver.bind("tcp://*:5558")

# Wait for start of batch
s = receiver.recv()

# Start our clock now
tstart = time.time()

# Process 100 confirmations
for task_nbr in range(100):
    s = receiver.recv()
    # Print ':' for every tenth confirmation, '.' otherwise
    if task_nbr % 10 == 0:
        sys.stdout.write(':')
    else:
        sys.stdout.write('.')
    sys.stdout.flush()

# Calculate and report duration of batch
tend = time.time()
print("Total elapsed time: %d msec" % ((tend - tstart) * 1000))
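Notes:
Like the pub/sub pattern, this pattern is one-way. There are two differences:
1. When no consumer is connected, the producer's messages are not discarded. They are held by the producer process: a PUSH socket blocks on send until a consumer is available.
2. Multiple consumers share a single stream of messages: if A receives a message, B will not receive it.
This pattern mainly targets the case where a single consumer cannot keep up, by letting several consumers work in parallel (it can also be seen as one way of dealing with the "blocking problem" of the pub/sub pattern discussed earlier).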
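The second difference can be made concrete with a small pure-Python model. This is only a sketch, not ZeroMQ code: `fan_out` and `distribute` are hypothetical helper names, and the model assumes all consumers are connected before the first message is sent and ignores subscription filters and queue limits. It relies on the fact that a PUSH socket deals messages round-robin to its connected peers.

```python
from itertools import cycle


def fan_out(messages, consumers):
    """PUB/SUB-like delivery: every consumer gets a copy of every message."""
    return {c: list(messages) for c in consumers}


def distribute(messages, consumers):
    """PUSH/PULL-like delivery: each message goes to exactly one consumer."""
    inbox = {c: [] for c in consumers}
    # Deal the messages out in turn, one consumer per message.
    for message, consumer in zip(messages, cycle(consumers)):
        inbox[consumer].append(message)
    return inbox


if __name__ == "__main__":
    msgs = [1, 2, 3, 4, 5]
    print(fan_out(msgs, ["A", "B"]))     # both A and B see all five messages
    print(distribute(msgs, ["A", "B"]))  # A and B split the five messages
```

With two consumers, `distribute` gives A the messages 1, 3, 5 and B the messages 2, 4, so adding consumers reduces the load on each one, whereas `fan_out` leaves every consumer with the full stream.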
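As the model diagram above shows, this is an N:N pattern. In the 1:N case the consumers do not necessarily receive equal shares (for example, a worker that connects before the others can be handed a larger share of the tasks). The N:1 case behaves differently, as shown in the figure below:
The main point of this pattern is that the middle tier of workers can be scaled out to achieve concurrency.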
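To get a feel for what scaling out the workers buys, the batch time can be estimated with a few lines of arithmetic. This is a rough sketch under stated assumptions: tasks are dealt round-robin to workers that are all connected before the batch starts, network and scheduling overhead are ignored, and both `estimate_batch_msec` and the uniform 50 msec workload are hypothetical choices for illustration, not measurements.

```python
def estimate_batch_msec(workloads, n_workers):
    """Ideal elapsed time when tasks are dealt round-robin to n_workers."""
    busy = [0] * n_workers
    for i, msec in enumerate(workloads):
        busy[i % n_workers] += msec
    # The batch is done when the busiest worker finishes.
    return max(busy)


if __name__ == "__main__":
    workloads = [50] * 100  # hypothetical: 100 tasks of 50 msec each
    for n in (1, 2, 4):
        print(n, estimate_batch_msec(workloads, n))
```

For this uniform workload the estimate is 5000 msec with one worker, 2500 msec with two, and 1250 msec with four. With the random workloads used by Ventilator.py the split is less even, so the "Total elapsed time" reported by sink.py should be compared against the "Total expected cost" divided by the number of workers only as a rough guide.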