Python Cookbook 20160513

FreesailX10i

License: CC 4.0 BY-SA

Category: Python Cookbook. Tags: python

Article link: https://blog.youkuaiyun.com/FreesailX10i/article/details/51396426

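This article covers practical data-handling techniques: unpacking, collecting the largest and smallest items, and priority queues. These techniques help programmers work more efficiently with data structures such as lists, tuples, and dictionaries.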

Unpacking an Iterable into Multiple Variables

Works with any iterable object: sequences (list, tuple), strings, files, and so on.
The only requirement is that the number of variables matches the number of items in the iterable; otherwise Python raises a ValueError.

p = (4, 5)
x, y = p

data = [ 'ACME', 50, 91.1, (2012, 12, 21) ]
name, shares, price, date = data

s = 'Hello'
a, b, c, d, e = s

To discard some of the values, assign them to a throwaway variable name such as _.
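
A minimal sketch of discarding values, reusing the data list from above. The name _ is only a convention, not special syntax, so make sure it is not already in use for something else.

```python
data = ['ACME', 50, 91.1, (2012, 12, 21)]

# Keep only shares and price; `_` receives the values we do not need
_, shares, price, _ = data
print(shares, price)  # 50 91.1
```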

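Unpacking an Iterable of Arbitrary Length

Problem: the length of the iterable is variable, unknown, or too long to unpack by hand.
Solution: a star expression, *var.
Note: in an unpacking assignment, *var collects any number of elements into a list.
In a function's parameter list, *var collects any number of positional arguments into a tuple.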

# *var in the middle
grades = [55, 72, 88, 91, 100]  # example data
first, *middle, last = grades
# *var at the end
record = ('Dave', 'dave@example.com', '773-555-1212', '847-555-1212')
name, email, *phone_numbers = record
# *var at the front
*trailing, current = [10, 8, 7, 1, 9, 5, 10, 3]
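
The list-versus-tuple distinction noted above can be checked directly. This is a small sketch; collect is a hypothetical helper introduced only for illustration.

```python
# Star unpacking always produces a list, even when the source is a tuple
record = ('Dave', 'dave@example.com', '773-555-1212', '847-555-1212')
name, email, *phone_numbers = record
print(type(phone_numbers).__name__)  # list

# The starred variable may also end up empty
first, *rest = [1]
print(rest)  # []

# In a function definition, *args collects extra positional arguments into a tuple
def collect(*args):  # hypothetical helper
    return args

print(collect(1, 2, 3))  # (1, 2, 3)
```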

A good example:

records = [
('foo', 1, 2),
('bar', 'hello'),
('foo', 3, 4),
]

def do_foo(x, y):
    print('foo', x, y)

def do_bar(s):
    print('bar', s)

for tag, *args in records:
    # step 1: tag, *args = ('foo', 1, 2)
    # step 2: tag, *args = ('bar', 'hello')
    # ...
    if tag == 'foo':
        do_foo(*args)
    elif tag == 'bar':
        do_bar(*args)

# Here *args appears in a function call, which is different from *var in a function's parameter list as described above.
# In a call, *args unpacks the list into separate positional arguments, roughly like x, y = args; the principle is the same as unpacking an iterable into multiple variables.

def sum(items):
    head, *tail = items
    return head + sum(tail) if tail else head

# The last line is equivalent to:
    if tail != []:
        return head + sum(tail)
    else:
        return head

Keeping the Last N Items

collections.deque: a queue that can maintain a fixed maximum length automatically

from collections import deque
queue_len = 5  # example size
a_fix_length_queue = deque(maxlen=queue_len)
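
A short sketch of what maxlen does in practice: once the deque is full, appending a new item silently drops the oldest one from the other end. The name last_three is made up for this example.

```python
from collections import deque

last_three = deque(maxlen=3)  # hypothetical name; holds at most 3 items
for n in range(1, 6):
    last_three.append(n)      # 1 and 2 are pushed out as 4 and 5 arrive

print(last_three)        # deque([3, 4, 5], maxlen=3)
print(list(last_three))  # [3, 4, 5]
```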

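A brief look at functions that use yield for now; more on them later:

http://www.ibm.com/developerworks/cn/opensource/os-cn-python-yield/
yield turns a function into a generator. A function that contains yield is no longer an ordinary function; the Python interpreter treats it as a generator. Calling fab(5) does not execute the body of fab; it returns an iterable object. In a for loop, each iteration runs the code inside fab until it reaches yield b, at which point fab returns one value. On the next iteration, execution resumes from the statement after yield b, with the function's local variables exactly as they were before the pause, and continues until it reaches yield again.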
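
The behaviour described above can be tried with a small generator. The body of fab below is an assumption (a Fibonacci generator in the spirit of the linked article), since these notes only mention its name and the yield b statement.

```python
def fab(limit):
    """Yield the first `limit` Fibonacci numbers (assumed behaviour of fab)."""
    n, a, b = 0, 0, 1
    while n < limit:
        yield b          # pause here and hand b to the caller
        a, b = b, a + b  # execution resumes from this line on the next iteration
        n += 1

gen = fab(5)             # nothing inside fab has run yet
values = list(gen)       # iterating drives the generator to completion
print(values)            # [1, 1, 2, 3, 5]
```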

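# An unbounded deque; items can be appended or popped at either end
q = deque()
# At the tail (right end)
q.append(1)
q.pop()
# At the head (left end)
q.appendleft(2)
q.popleft()
# Inserting or removing at the head is faster than the same operation on a list: O(1) versus O(N)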
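
Putting the two ideas together, here is a sketch of keeping the last N items while scanning a sequence of lines, modeled on the Cookbook recipe. The function name search, its parameters, and the sample text are illustrative assumptions.

```python
from collections import deque

def search(lines, pattern, history=2):
    """Yield (matching line, previous lines), keeping only `history` lines of context."""
    previous_lines = deque(maxlen=history)
    for line in lines:
        if pattern in line:
            yield line, list(previous_lines)
        previous_lines.append(line)

text = ['alpha', 'beta', 'python one', 'gamma', 'delta', 'epsilon', 'python two']
results = list(search(text, 'python'))
for line, context in results:
    print(context, '->', line)
```

Using a generator keeps the search logic separate from whatever the caller does with each match.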

Finding the Largest or Smallest N Items

The heapq module

To get a list of the largest or smallest N items in a collection:
- heapq.nlargest(N, a_collection)
- heapq.nsmallest(N, a_collection)

import heapq
nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
print(heapq.nlargest(3, nums)) # Prints [42, 37, 23]
print(heapq.nsmallest(3, nums)) # Prints [-4, 1, 2]

For collections of more complex structures, use the key parameter, which usually takes a function (for example, a lambda).

portfolio = [
{'name': 'IBM', 'shares': 100, 'price': 91.1},
{'name': 'AAPL', 'shares': 50, 'price': 543.22},
{'name': 'FB', 'shares': 200, 'price': 21.09},
{'name': 'HPQ', 'shares': 35, 'price': 31.75},
{'name': 'YHOO', 'shares': 45, 'price': 16.35},
{'name': 'ACME', 'shares': 75, 'price': 115.65}
]
cheap = heapq.nsmallest(3, portfolio, key=lambda s: s['price'])
expensive = heapq.nlargest(3, portfolio, key=lambda s: s['price'])

How nlargest() and nsmallest() work internally:

heapq.heapify(a_list) reorders a list in place as a heap. The most important feature of a heap is that heap[0] is always the smallest item; note that only the first element is guaranteed to be the smallest, and the order of the remaining elements is not guaranteed.
Subsequent items can be found easily with heapq.heappop(a_list), which pops off the first item and replaces it with the next smallest item.
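
A small sketch of the heap operations just described, using the nums list from earlier. The list is copied first because heapify() reorders it in place.

```python
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
heap = list(nums)      # copy so the original list is untouched
heapq.heapify(heap)    # reorder in place as a heap
print(heap[0])         # -4, the smallest item

# Popping repeatedly yields items in ascending order
smallest_three = [heapq.heappop(heap) for _ in range(3)]
print(smallest_three)  # [-4, 1, 2]
```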

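Summary of ways to find the max/min items:

When N (the number of items wanted) is about the same as the size of the collection, sort first and then take a slice.
When N is much smaller than the size of the collection, use nlargest() and nsmallest().
When N = 1, min() and max() are faster than nlargest() and nsmallest().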
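
The three cases side by side, as a sketch on the same sample list. The value n = 9 is an arbitrary choice to stand for "N close to the collection size".

```python
import heapq

nums = [1, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]

# N == 1: min() and max()
print(min(nums), max(nums))      # -4 42

# N small relative to len(nums): heapq
print(heapq.nsmallest(3, nums))  # [-4, 1, 2]

# N close to len(nums): sort first, then slice
n = 9
smallest_n = sorted(nums)[:n]
print(smallest_n)                # [-4, 1, 2, 2, 7, 8, 18, 23, 23]
```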

Implementing a Priority Queue

Built with functions from the heapq module

    import heapq

    class PriorityQueue:
        def __init__(self):
            self._queue = []
            self._index = 0

        def push(self, item, priority):
            heapq.heappush(self._queue, (-priority, self._index, item))
            # _index breaks ties between equal priorities: the item with the smaller index (pushed earlier) is popped first
            self._index += 1

        def pop(self):
            return heapq.heappop(self._queue)[-1] #the item
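
A usage sketch for the class above, repeated here so the block runs on its own. The item names and priorities are made up for illustration. Negating the priority makes the highest priority come out first, and the index keeps insertion order among equal priorities.

```python
import heapq

class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0

    def push(self, item, priority):
        # -priority: heapq is a min-heap, so negate to pop the highest priority first
        heapq.heappush(self._queue, (-priority, self._index, item))
        self._index += 1

    def pop(self):
        return heapq.heappop(self._queue)[-1]  # the item

q = PriorityQueue()
q.push('foo', 1)
q.push('bar', 5)
q.push('spam', 4)
q.push('grok', 1)

order = [q.pop() for _ in range(4)]
print(order)  # ['bar', 'spam', 'foo', 'grok']
```

'foo' comes out before 'grok' because both have priority 1 and 'foo' was pushed first.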