Practical Python Project: Generators in the Producer-Consumer Pattern and in Pipelines

Article link: https://blog.youkuaiyun.com/gitblog_00585/article/details/148416323

practical-python: Practical Python Programming (course by @dabeaz). Project address: https://gitcode.com/gh_mirrors/pr/practical-python

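Producer-Consumer Basics

In Python, generators are a natural fit for the producer-consumer pattern. The core idea is to separate data generation (production) from data processing (consumption), which makes code more modular and easier to maintain.

A generator produces data with the yield statement, and a for loop consumes it: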

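import time

# Producer: yields each new line read from an open file object
def follow(f):
    while True:
        line = f.readline()
        if not line:
            time.sleep(0.1)  # no new data yet; wait briefly and retry
            continue
        yield line  # produce data

# Consumer loop
with open('stocklog.csv') as f:
    for line in follow(f):  # consume data
        print(line, end='')  # each line already ends with a newline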

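This pattern has several advantages:

The producer and the consumer can be developed and tested independently
It is memory efficient, because the data never has to be loaded all at once
It supports processing data in real time, as it arrives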
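
The memory and real-time points above follow from how generators are driven: the consumer pulls one item at a time, so production and consumption interleave. The sketch below illustrates this with a hypothetical `count_up` producer and an `events` list (both invented for this illustration, not part of the course code).

```python
events = []

def count_up(n):
    # Hypothetical producer: records the moment each value is generated
    for i in range(n):
        events.append(f"produce {i}")
        yield i

# Consumer: the for loop pulls one value at a time from the generator
for value in count_up(3):
    events.append(f"consume {value}")

print(events)
```

Each value is consumed before the next one is produced, so only one item needs to exist at any moment.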

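Building Data-Processing Pipelines

Generators can be chained together like Unix pipes to form more elaborate data-processing flows:

producer → processing stage 1 → processing stage 2 → consumer

Each processing stage is a generator function that receives data from upstream, processes it, and yields the result downstream. For example: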

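def producer():
    # Produce raw data (placeholder values for illustration)
    for item in range(3):
        yield item

def processing_stage1(s):
    for item in s:
        # Process each item (placeholder transformation)
        new_item = item * 2
        yield new_item

def consumer(s):
    for item in s:
        # Use the final data
        print(item)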

The pipeline is assembled like this:

a = producer()
b = processing_stage1(a)
consumer(b)
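
One detail worth seeing in code is that assembling a pipeline does no work by itself; the stages only run once something iterates over the last one. The following is a minimal sketch under that reading, using a hypothetical `double` stage and a `log` list that are not part of the original article.

```python
log = []

def producer():
    log.append("producer started")  # runs only when the first item is requested
    yield from [1, 2, 3]

def double(s):
    # Hypothetical processing stage: doubles each upstream item
    for item in s:
        yield item * 2

a = producer()
b = double(a)
started_before = list(log)  # still empty: wiring the stages runs no code
result = list(b)            # consuming the last stage drives the whole pipeline

print(started_before, result, log)
```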

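Worked Example: Processing Stock Data

A stock-data example shows how generator pipelines are used in practice.

A Basic Filtering Pipeline

Start with a simple filter that keeps only the lines containing a given substring: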

def filematch(lines, substr):
    for line in lines:
        if substr in line:
            yield line

Usage:

lines = follow(open('stocklog.csv'))  # producer (follow expects an open file)
ibm_lines = filematch(lines, 'IBM')  # processing stage
for line in ibm_lines:  # consumer
    print(line, end='')
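
Because `filematch` accepts any iterable of strings, it can be tried out without a live log file. The sketch below feeds it a short list of made-up lines; the sample values are invented for illustration and are not taken from the real `stocklog.csv`.

```python
def filematch(lines, substr):
    for line in lines:
        if substr in line:
            yield line

# Made-up sample lines standing in for the live log
sample = [
    '"IBM",91.10,-0.20\n',
    '"AA",20.50,0.10\n',
    '"IBM",91.30,0.00\n',
]

matches = list(filematch(sample, 'IBM'))
print(matches)
```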

A CSV-Processing Pipeline

The pipeline can be extended with CSV parsing:

import csv

lines = follow(open('stocklog.csv'))  # follow expects an open file
rows = csv.reader(lines)  # parse each line into a list of CSV fields
for row in rows:
    print(row)
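
`csv.reader` fits into the pipeline because it accepts any iterable that yields strings, not only file objects. A small sketch with made-up sample lines shows what it produces; note that every field comes back as a string, which is why a later stage converts types.

```python
import csv

# Made-up sample lines; any iterable of strings works as input
sample = ['"IBM",91.10,-0.20\n', '"AA",20.50,0.10\n']

rows = list(csv.reader(sample))
print(rows)
```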

A More Advanced Pipeline

A more elaborate processing flow can be built from a few small stages:

def select_columns(rows, indices):
    for row in rows:
        yield [row[index] for index in indices]

def convert_types(rows, types):
    for row in rows:
        yield [func(val) for func, val in zip(types, row)]

def make_dicts(rows, headers):
    for row in rows:
        yield dict(zip(headers, row))

# Complete pipeline
lines = follow(open('stocklog.csv'))  # follow expects an open file
rows = csv.reader(lines)
rows = select_columns(rows, [0, 1, 4])  # pick specific columns
rows = convert_types(rows, [str, float, float])  # convert field types
rows = make_dicts(rows, ['name', 'price', 'change'])  # turn rows into dicts
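
To see what each record looks like at the end of this pipeline, the stages can be run on a finite in-memory sample instead of the live log. The sample lines below are made up, with a five-column layout assumed only so that indices 0, 1, and 4 line up with name, price, and change.

```python
import csv

def select_columns(rows, indices):
    for row in rows:
        yield [row[index] for index in indices]

def convert_types(rows, types):
    for row in rows:
        yield [func(val) for func, val in zip(types, row)]

def make_dicts(rows, headers):
    for row in rows:
        yield dict(zip(headers, row))

# Made-up sample lines (assumed layout: name, price, date, time, change)
sample = [
    '"IBM",91.10,"6/11/2007","09:30.00",-0.20\n',
    '"AA",20.50,"6/11/2007","09:30.05",0.10\n',
]

rows = csv.reader(sample)
rows = select_columns(rows, [0, 1, 4])
rows = convert_types(rows, [str, float, float])
rows = make_dicts(rows, ['name', 'price', 'change'])
records = list(rows)
print(records)
```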

Filtering Data

Next, add filtering based on the holdings in a portfolio:

def filter_symbols(rows, names):
    for row in rows:
        if row['name'] in names:
            yield row

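# Filter using the names in the portfolio file.
# read_portfolio is not shown here; it is assumed to return a list of
# dicts that each have a 'name' key.
portfolio = read_portfolio('portfolio.csv')
symbols = [s['name'] for s in portfolio]
rows = filter_symbols(rows, symbols)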
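
The filter stage can be checked in isolation with a few hand-written records. In this sketch the rows and the `held` set are made-up values; a set is used for the names as one reasonable choice, since the `in` test only needs a container that supports membership checks.

```python
def filter_symbols(rows, names):
    for row in rows:
        if row['name'] in names:
            yield row

# Made-up records in the shape produced by make_dicts
rows = [
    {'name': 'IBM', 'price': 91.1, 'change': -0.2},
    {'name': 'AA', 'price': 20.5, 'change': 0.1},
    {'name': 'MSFT', 'price': 30.0, 'change': 0.05},
]

held = {'IBM', 'MSFT'}  # hypothetical portfolio symbols
kept = list(filter_symbols(rows, held))
print([r['name'] for r in kept])
```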

Building the Complete Solution

All of these components can be combined into a stock ticker display:

def ticker(portfile, logfile, fmt):
    # Read the portfolio
    portfolio = read_portfolio(portfile)
    symbols = [s['name'] for s in portfolio]
    
    # Build the processing pipeline
    lines = follow(open(logfile))  # follow expects an open file
    rows = csv.reader(lines)
    rows = select_columns(rows, [0, 1, 4])
    rows = convert_types(rows, [str, float, float])
    rows = make_dicts(rows, ['name', 'price', 'change'])
    rows = filter_symbols(rows, symbols)
    
    # Format the output
    if fmt == 'txt':
        print("%10s %10s %10s" % ('Name', 'Price', 'Change'))
        print("%10s %10s %10s" % ('-'*10, '-'*10, '-'*10))
        for row in rows:
            print("%10s %10.2f %10.2f" % (row['name'], row['price'], row['change']))
    elif fmt == 'csv':
        print('Name,Price,Change')
        for row in rows:
            print(f"{row['name']},{row['price']},{row['change']}")
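
Since `ticker` runs until interrupted, its two output formats are easier to inspect on a single record. The sketch below pulls the formatting expressions into a hypothetical `format_row` helper (not part of the original code) and applies them to a made-up row. Raising `ValueError` for an unknown format is an assumption of this sketch; the original function simply prints nothing in that case.

```python
def format_row(row, fmt):
    # Hypothetical helper mirroring the two print formats used in ticker()
    if fmt == 'txt':
        return "%10s %10.2f %10.2f" % (row['name'], row['price'], row['change'])
    elif fmt == 'csv':
        return f"{row['name']},{row['price']},{row['change']}"
    raise ValueError(f"unknown format: {fmt}")

row = {'name': 'IBM', 'price': 91.1, 'change': -0.2}  # made-up sample row
txt_line = format_row(row, 'txt')
csv_line = format_row(row, 'csv')
print(txt_line)
print(csv_line)
```

The text format right-aligns each field in a 10-character column and rounds to two decimals, while the CSV format writes the float values as Python prints them.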