How Much Do You Know About Python Generators

Generators are a simple and powerful tool for creating iterators. They are written like regular functions but use the yield statement whenever they want to return data. Each time next() is called on it, the generator resumes where it left off (it remembers all the data values and which statement was last executed).

Anything that can be done with generators can also be done with class-based iterators as described in the previous section. What makes generators so compact is that the __iter__() and __next__() methods are created automatically.
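
As a rough illustration of that equivalence, the sketch below writes the same countdown twice: once as a class-based iterator and once as a generator. The names `CountdownIterator` and `countdown` are made up for this example and do not come from the quoted documentation.

```python
class CountdownIterator:
    """Class-based iterator: state is kept in an instance attribute."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown(start):
    """Generator version: state is kept in an ordinary local variable."""
    current = start
    while current > 0:
        yield current
        current -= 1


class_result = list(CountdownIterator(3))
generator_result = list(countdown(3))
print(class_result)      # [3, 2, 1]
print(generator_result)  # [3, 2, 1]
```

The generator needs no explicit `__iter__()`, `__next__()`, or `raise StopIteration`; Python supplies all three.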

Another key feature is that the local variables and execution state are automatically saved between calls. This made the function easier to write and much more clear than an approach using instance variables like self.index and self.data.

In addition to automatic method creation and saving program state, when generators terminate, they automatically raise StopIteration. In combination, these features make it easy to create iterators with no more effort than writing a regular function.

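Understanding

A function (or method) that contains yield is a generator function (or generator method). Calling it returns a generator object; the values passed to yield are the items that object produces, not the generator itself.
A generator is always an iterator, because every generator object automatically gets __iter__() and __next__() methods.
A generator function (or method) automatically saves its execution state (local variables and current position) at each yield. Each call to next() runs the body until the next yield is reached; the generator then pauses and returns the value given to yield.
One point the documentation above does not spell out: calling a generator function (or method) does not run its body right away; it only returns a generator object. The body starts running on the first next() call, and each later next() call resumes it to produce the next value.
When a generator has no more values, a further next() call raises StopIteration, which is exactly what the iterator protocol requires.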
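
The points above can be checked with a small experiment. This is only a sketch: the generator `two_steps` and the `events` list are hypothetical names introduced here to record when each part of the body actually runs.

```python
events = []  # records what the generator body has done so far


def two_steps():
    events.append("started")
    yield "first"
    events.append("resumed")
    yield "second"
    events.append("finished")


gen = two_steps()
events_after_call = list(events)    # still empty: the body has not run yet

first = next(gen)                   # runs the body up to the first yield
events_after_first = list(events)   # ["started"]

second = next(gen)                  # resumes right after the first yield

try:
    next(gen)                       # body runs to its end, then StopIteration
    exhausted = False
except StopIteration:
    exhausted = True

print(events_after_call, first, events_after_first, second, exhausted)
print(iter(gen) is gen)  # True: a generator is its own iterator
```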

Generator expressions

Official documentation

Some simple generators can be coded succinctly as expressions using a syntax similar to list comprehensions but with parentheses instead of square brackets. These expressions are designed for situations where the generator is used right away by an enclosing function. Generator expressions are more compact but less versatile than full generator definitions and tend to be more memory friendly than equivalent list comprehensions.
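
A minimal sketch of the "used right away by an enclosing function" case follows. The `words` list is sample data invented for this example.

```python
# Sum of squares: the generator expression is the sole argument,
# so it needs no extra pair of parentheses.
sum_of_squares = sum(i * i for i in range(10))
print(sum_of_squares)  # 285

# Longest word length, computed without building an intermediate list.
words = ["generator", "yield", "next"]  # sample data for this sketch
longest = max(len(word) for word in words)
print(longest)  # 9
```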

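Understanding

A generator expression is an expression wrapped in parentheses (), with a syntax similar to a list comprehension.
When a generator expression is passed to an enclosing function, the values are produced one at a time as the function consumes them; no complete collection is built up front, unlike a list comprehension, which creates every element at once.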
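
One way to observe the difference is to compare container sizes with `sys.getsizeof`. This is a rough sketch, not a benchmark: `sys.getsizeof` reports only the size of the list or generator object itself (not the integers it refers to), and the exact numbers depend on the Python version and platform.

```python
import sys

numbers = range(100_000)

squares_list = [n * n for n in numbers]   # all 100,000 results exist now
squares_gen = (n * n for n in numbers)    # nothing has been computed yet

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(list_size, gen_size)  # exact values vary by version and platform

# Both give the same total once they are consumed.
total_from_list = sum(squares_list)
total_from_gen = sum(squares_gen)
print(total_from_list == total_from_gen)  # True
```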

Practice

Creating a generator function

Iterate over student records with a generator.

def get_student_info(stu_list):
    # Printed only once, when the first next() call starts running the body.
    print("The next() method was executed.")
    # This variable shows that yield preserves the function's execution state
    # (local variables and current position) between next() calls.
    index = 0
    for stu in stu_list:
        yield index, stu
        index += 1

if __name__ == '__main__':
    stu_list = [
        {"name": "张三", "age": 12, "sex": True},
        {"name": "李四", "age": 13, "sex": True},
        {"name": "王五", "age": 14, "sex": False}
    ]
    gen_obj = get_student_info(stu_list)
    # Output: <class 'generator'>
    print(type(gen_obj))

    item = next(gen_obj)
    # Output: <class 'tuple'>; if only the student were yielded, it would be <class 'dict'>
    print(type(item))
    # Output: (0, {'name': '张三', 'age': 12, 'sex': True})
    print(item)
    # Output: (1, {'name': '李四', 'age': 13, 'sex': True})
    print(next(gen_obj))
    # Output: (2, {'name': '王五', 'age': 14, 'sex': False})
    print(next(gen_obj))
    # A fourth call would raise StopIteration
    # print(next(gen_obj))
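
The manual `index` counter in `get_student_info` yields the same (index, item) pairs as the built-in `enumerate`. The sketch below checks this on a shortened, made-up student list; the generator is repeated without its `print` call so that the block runs on its own.

```python
def get_student_info(stu_list):
    index = 0
    for stu in stu_list:
        yield index, stu
        index += 1


students = [{"name": "A"}, {"name": "B"}]  # sample data for this sketch

manual_pairs = list(get_student_info(students))
builtin_pairs = list(enumerate(students))
print(manual_pairs)                    # [(0, {'name': 'A'}), (1, {'name': 'B'})]
print(manual_pairs == builtin_pairs)   # True
```

Writing the counter by hand is still useful here, since it makes the saved local state visible.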

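Iterating over a generator gracefully

Improvement 1: print all student records with a loop, without handling StopIteration by hand to find out whether another element exists.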

def get_student_info(stu_list):
    # Printed only once, when the for loop requests the first element.
    print("The next() method was executed.")
    index = 0
    for stu in stu_list:
        yield index, stu
        index += 1

if __name__ == '__main__':
    stu_list = [
        {"name": "张三", "age": 12, "sex": True},
        {"name": "李四", "age": 13, "sex": True},
        {"name": "王五", "age": 14, "sex": False}
    ]
    # Create the generator object
    gen_obj = get_student_info(stu_list)
    # The for loop calls next() for us and stops cleanly on StopIteration
    for index, student in gen_obj:
        print(f"Student {index}: {student}")
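
For reference, a `for` loop over a generator behaves roughly like the hand-written loop below: it keeps calling `next()` and stops when StopIteration is raised. The generator `count_up_to` is a hypothetical stand-in used only for this sketch.

```python
def count_up_to(limit):
    n = 1
    while n <= limit:
        yield n
        n += 1


# Roughly what `for value in count_up_to(3)` does behind the scenes.
iterator = iter(count_up_to(3))
collected = []
while True:
    try:
        value = next(iterator)
    except StopIteration:
        break
    collected.append(value)
print(collected)  # [1, 2, 3]

# next() also accepts a default, returned instead of raising StopIteration.
leftover = next(iterator, None)
print(leftover)  # None
```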

Generator expressions

This part shows how to use a generator expression and compares it with a list comprehension.

def add_num(gen):
    print("Function started running...")
    # sum() pulls the values from the generator one at a time
    result = sum(gen)
    return result

if __name__ == '__main__':
    my_list = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    # Generator expression: the square of each element
    gen = (x * x for x in my_list)
    # Output: <class 'generator'>
    print(type(gen))
    result = add_num(gen)
    # Output: 285
    print(result)

    # This is a list comprehension
    squared_list = [x * x for x in my_list]
    # Output: <class 'list'>
    print(type(squared_list))
    # Output: [1, 4, 9, 16, 25, 36, 49, 64, 81]
    print(squared_list)
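
One further difference worth keeping in mind when comparing the two forms: a generator can be consumed only once, while a list can be traversed repeatedly. A short sketch with made-up data:

```python
my_list = [1, 2, 3]

gen = (x * x for x in my_list)
first_total = sum(gen)    # consumes every value
second_total = sum(gen)   # the generator is already exhausted
print(first_total, second_total)  # 14 0

squares = [x * x for x in my_list]
list_first_total = sum(squares)
list_second_total = sum(squares)  # a list can be summed again
print(list_first_total, list_second_total)  # 14 14
```

If the values are needed more than once, either build a list or create a fresh generator for each pass.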