The Debate over Avoiding Degraded Python Code

Latest recommended article published on 2021-07-12 17:18:36

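This article covers efficient use of iterators in Python: the difference between xrange and enumerate, the different use cases for xrange and zip, choosing between filter and itertools.ifilter, imap versus map, using generators, and applications of groupby.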

[size=large][color=red][b]1.xrange and enumerate[/b][/color][/size]

[b]enumerate[/b]：enumerate is useful for obtaining an indexed list
[b]xrange[/b]： generates the numbers in the range on demand. For looping, this is slightly faster than range() and more memory efficient.

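Judging by the timings below, xrange is somewhat faster. If the data set is small, either one works, so use whichever fits your needs. Both enumerate and xrange are driven by next(); they differ only in how they package the values they return.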

>>> import datetime
>>> seq = [i for i in xrange(1000000)]
... n = datetime.datetime.now()
... for i in xrange(len(seq)):
...     #print item, i
...     pass
... print datetime.datetime.now() - n
0:00:00.066000
>>> seq = [i for i in xrange(1000000)]
... n = datetime.datetime.now()
... for i in xrange(len(seq)):  
...     #print item, i
...     pass
... print datetime.datetime.now() - n
0:00:00.067000

>>> seq = [i for i in xrange(1000000)]
... n = datetime.datetime.now()
... for i,item in enumerate(seq):  
...     #print item, i
...     pass
... print datetime.datetime.now() - n
0:00:00.142000
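
The timings above compare an index-only loop with an enumerate loop that also delivers each item, so the two loops are not doing the same work. The following is a minimal Python 3 sketch of a more even comparison, where range is the lazy counterpart of Python 2's xrange. The function names and the list size are assumptions chosen for illustration, and the numbers will vary by machine and interpreter.

```python
import timeit

seq = list(range(100_000))

def loop_by_index():
    # Index only, like `for i in xrange(len(seq))` in the session above.
    for i in range(len(seq)):
        pass

def loop_by_index_with_lookup():
    # Also fetch the item, which is what enumerate hands over for free.
    for i in range(len(seq)):
        item = seq[i]

def loop_by_enumerate():
    # enumerate yields (index, item) pairs directly.
    for i, item in enumerate(seq):
        pass

# Take the best of three runs for each variant to reduce noise.
timings = {
    func.__name__: min(timeit.repeat(func, number=5, repeat=3))
    for func in (loop_by_index, loop_by_index_with_lookup, loop_by_enumerate)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f}s")
```

If the loop body needs the item as well as the index, the second and third variants are the ones worth comparing.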

[size=large][color=red][b]2.xrange and zip[/b][/color][/size]

[b]Note that zip and xrange do different things here, so they cannot be compared on functionality.[/b]


Degraded code:
for i in xrange(len(seq1)):
  foo(seq1[i], seq2[i])
Recommended code:
for i, j in zip(seq1, seq2):
  foo(i, j)
More efficient:
for i, j in itertools.izip(seq1, seq2):
  foo(i, j)

Here is the definition of zip:
[b]zip(...)[/b]
zip(seq1 [, seq2 [...]]) -> [(seq1[0], seq2[0] ...), (...)]
Return a list of tuples, where each tuple contains the i-th element
from each of the argument sequences. The[b][color=red] returned list is truncated
in length to the length of the shortest argument sequence[/color][/b].
[color=red][b]Substituting zip can therefore produce wrong results: in the code below, the 6 from the first list is missing from the output.[/b][/color]


>>> zip([4,5,6],[2,3])
[(4, 2), (5, 3)]
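
The truncation above is silent, whereas the index loop would raise an IndexError on the shorter sequence. The sketch below, written for Python 3, shows two ways to make the mismatch visible: padding with itertools.zip_longest (named izip_longest in Python 2) or checking lengths first. The helper strict_pairs is a hypothetical name introduced only for this example.

```python
from itertools import zip_longest

seq1 = [4, 5, 6]
seq2 = [2, 3]

# zip stops at the shortest input, so the 6 is silently dropped.
truncated = list(zip(seq1, seq2))

# zip_longest pads the shorter input instead (with None by default).
padded = list(zip_longest(seq1, seq2))

def strict_pairs(a, b):
    # Hypothetical helper: fail loudly when the lengths differ.
    if len(a) != len(b):
        raise ValueError("sequences differ in length")
    return list(zip(a, b))

print(truncated)  # [(4, 2), (5, 3)]
print(padded)     # [(4, 2), (5, 3), (6, None)]
```

Which behavior is right depends on whether unequal lengths are expected data or a bug in the caller.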

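[color=red]zip's performance is also less than ideal, because in Python 2 it builds the whole list of tuples before the loop starts, while itertools.izip yields them one at a time:[/color]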

>>> import datetime
... seq1 = [i for i in xrange(1000000)]
... seq2 = [j for j in xrange(1000000)]
... n = datetime.datetime.now()
... for i, j in zip(seq1, seq2):
...     pass
... print datetime.datetime.now() - n
0:00:01.713000

>>> import itertools
>>> import datetime
... seq1 = [i for i in xrange(1000000)]
... seq2 = [j for j in xrange(1000000)]
... n = datetime.datetime.now()
... for i, j in itertools.izip(seq1, seq2):
...     pass
... print datetime.datetime.now() - n
0:00:00.159000
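
The gap between the two timings most likely comes from Python 2's zip materializing a million tuples up front. In Python 3 the built-in zip is already lazy, like izip. The following sketch, assuming Python 3, illustrates that laziness by pairing an infinite counter with a short list; the variable names are illustrative only.

```python
import itertools

labels = ["a", "b", "c"]

# Pairing with an infinite iterator only works because zip is lazy:
# nothing is computed until the pairs are requested.
pairs = zip(itertools.count(1), labels)

# Consume the first two pairs, then let the same iterator resume.
first_two = list(itertools.islice(pairs, 2))
rest = list(pairs)

print(first_two)  # [(1, 'a'), (2, 'b')]
print(rest)       # [(3, 'c')]
```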

[size=large][color=red][b]3.filter[/b][/color][/size]
That said, the style below is quite good and can improve reusability.

Degraded code:
for i in seq:
  if pred(i):
    foo(i)
Recommended code:
for i in itertools.ifilter(pred, seq):
  foo(i)
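
As a rough illustration of the reusability point, the Python 3 sketch below passes the predicate in as a parameter. In Python 3 the built-in filter is lazy, like Python 2's itertools.ifilter. The names is_even, squares_where, and squares_where_genexp are hypothetical and stand in for pred and foo above.

```python
def is_even(n):
    return n % 2 == 0

def squares_where(pred, seq):
    # filter() yields matching items one at a time; no intermediate list is built.
    return [n * n for n in filter(pred, seq)]

def squares_where_genexp(pred, seq):
    # A generator expression is an equivalent, equally lazy spelling.
    return [n * n for n in (x for x in seq if pred(x))]

print(squares_where(is_even, range(10)))         # [0, 4, 16, 36, 64]
print(squares_where_genexp(is_even, range(10)))  # [0, 4, 16, 36, 64]
```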

[size=large][color=red][b]4.imap[/b][/color][/size]
map and itertools.imap are also highly reusable,
but imap is defined differently from map:

itertools.imap(function, *iterables)
Make an iterator that computes the function using arguments from each of the iterables. If function is set to None, then imap() returns the arguments as a tuple. Like map() [b][color=red]but stops when the shortest iterable is exhausted[/color][/b] instead of filling in None for shorter iterables. The reason for the difference is that infinite iterator arguments are typically an error for map() (because the output is fully evaluated) but represent a common and useful way of supplying arguments to imap().

The difference between the two:


>>> for i in map(pow, (2,3,10), (5,2)):
...     print i
Traceback (most recent call last):
  File "<pyshell#30>", line 1, in <module>
    for i in map(pow, (2,3,10), (5,2)):
TypeError: unsupported operand type(s) for ** or pow(): 'int' and 'NoneType'


>>> for i in itertools.imap(pow, (2,3,10), (5,2)):
...     print i
32
9
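
For readers on Python 3, the built-in map behaves like imap: it is lazy and stops at the shortest input, so the TypeError shown above does not occur there. The sketch below assumes Python 3 and also shows the infinite-iterator case that the imap documentation mentions.

```python
import itertools

# Python 3's map stops when the shortest input runs out, like Python 2's imap.
powers = list(map(pow, (2, 3, 10), (5, 2)))

# An infinite iterator is a safe argument because map is lazy:
# repeat(2) supplies the exponent for as long as the bases last.
squares = list(map(pow, [1, 2, 3], itertools.repeat(2)))

print(powers)   # [32, 9]
print(squares)  # [1, 4, 9]
```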

[size=large][color=red][b]5.Generators[/b][/color][/size]
Since Python 2.2, generators provide an elegant way to write simple and efficient
code for functions that return a list of elements. Based on the yield directive, they
allow you to pause a function and return an intermediate result. The function saves
its execution context and can be resumed later if necessary.
For example (this is the example provided in the PEP about iterators), the Fibonacci
series can be written with an iterator:

>>> def fibonacci():
...     a, b = 0, 1
...     while True:
...         yield b
...         a, b = b, a + b
...
>>> fib = fibonacci()
>>> fib.next()
1
>>> fib.next()
1
>>> fib.next()
2
>>> [fib.next() for i in range(10)]
[3, 5, 8, 13, 21, 34, 55, 89, 144, 233]
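
In Python 3 the generator's next() method became __next__(), and the usual way to advance a generator is the built-in next(). A minimal sketch of the same session under that assumption, using itertools.islice to take a bounded slice of the infinite series:

```python
from itertools import islice

def fibonacci():
    a, b = 0, 1
    while True:
        yield b          # pause here and hand back the current value
        a, b = b, a + b  # resume from this line on the next request

fib = fibonacci()
first_three = [next(fib) for _ in range(3)]
next_ten = list(islice(fib, 10))  # the generator remembers where it stopped

print(first_three)  # [1, 1, 2]
print(next_ten)     # [3, 5, 8, 13, 21, 34, 55, 89, 144, 233]
```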


>>> def my_generator():
...     try:
...         yield 'something'
...     except ValueError:
...         yield 'dealing with the exception'
...     finally:
...         print "ok let's clean"
...
>>> m = my_generator()
>>> m.next()
'something'
>>> m.throw(ValueError('haha'))
'dealing with the exception'
>>> m.close()
ok let's clean
>>> m.next
<method-wrapper 'next' of generator object at 0x01E20D50>
>>> m.next()  # the generator is closed, so this raises StopIteration
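
The session above exercises throw(), which raises an exception inside the generator at the paused yield, and close(), which ends the generator and runs its finally block. Below is a Python 3 sketch of the same flow. It records the cleanup message in a list named log (an assumption made here so the result can be checked) instead of printing it.

```python
log = []

def my_generator():
    try:
        yield 'something'
    except ValueError:
        # Reached when the caller throws ValueError into the paused generator.
        yield 'dealing with the exception'
    finally:
        # Runs when the generator is closed or finishes.
        log.append("ok let's clean")

m = my_generator()
first = next(m)
second = m.throw(ValueError('haha'))
m.close()

# A closed generator is exhausted: advancing it raises StopIteration.
try:
    next(m)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, log, exhausted)
```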

[size=large][color=red][b]6.groupby[/b][/color][/size]

>>> from itertools import groupby
>>> def compress(data):
...     return ((len(list(group)), name)
...             for name, group in groupby(data))
...
>>> def decompress(data):
...     return (car * size for size, car in data)
...
>>> list(compress('get uuuuuuuuuuuuuuuuuup'))
[(1, 'g'), (1, 'e'), (1, 't'), (1, ' '),
(18, 'u'), (1, 'p')]
>>> compressed = compress('get uuuuuuuuuuuuuuuuuup')
>>> ''.join(decompress(compressed))
'get uuuuuuuuuuuuuuuuuup'
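
The compress and decompress pair above is a run-length encoding built on groupby, which groups only consecutive equal items. The following Python 3 sketch repeats the round trip on a shorter string chosen for illustration and shows that a repeated but non-adjacent character produces separate groups.

```python
from itertools import groupby

def compress(data):
    # Each group is a run of identical consecutive items; store (run length, item).
    return ((len(list(group)), name) for name, group in groupby(data))

def decompress(data):
    # Repeat each item by its run length.
    return (char * size for size, char in data)

text = 'aaabccdddd'
encoded = list(compress(text))
decoded = ''.join(decompress(encoded))

# groupby does not sort: the two separate runs of 'a' stay separate.
split_runs = list(compress('aba'))

print(encoded)     # [(3, 'a'), (1, 'b'), (2, 'c'), (4, 'd')]
print(decoded)     # aaabccdddd
print(split_runs)  # [(1, 'a'), (1, 'b'), (1, 'a')]
```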