Python Lists in Detail

More on Lists

All Methods of the List Type

list.append(x)

Add an item to the end of the list. Equivalent to a[len(a):] = [x].

list.extend(iterable)

Extend the list by appending all the items from the iterable. Equivalent to a[len(a):] = iterable.

list.insert(i, x)

Insert an item at a given position. The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x).

list.remove(x)

Remove the first item from the list whose value is equal to x. It raises a ValueError if there is no such item.

list.pop([i])

Remove the item at the given position in the list, and return it. If no index is specified, a.pop() removes and returns the last item in the list. (The square brackets around the i in the method signature denote that the parameter is optional, not that you should type square brackets at that position. You will see this notation frequently in the Python Library Reference.)

list.clear()

Remove all items from the list. Equivalent to del a[:].

list.index(x[, start[, end]])

Return zero-based index in the list of the first item whose value is equal to x. Raises a ValueError if there is no such item.

The optional arguments start and end are interpreted as in the slice notation and are used to limit the search to a particular subsequence of the list. The returned index is computed relative to the beginning of the full sequence rather than the start argument.

list.count(x)

Return the number of times x appears in the list.

list.sort(*, key=None, reverse=False)

Sort the items of the list in place (the arguments can be used for sort customization, see sorted() for their explanation).

list.reverse()

Reverse the elements of the list in place.

list.copy()

Return a shallow copy of the list. Equivalent to a[:].
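
The equivalences stated in the method descriptions can be checked directly. The following is a minimal sketch; the variable names (a, b, letters, nested, shallow) are chosen only for illustration.

```python
# Slice-assignment equivalents of append() and extend().
a = [1, 2, 3]
b = [1, 2, 3]
a.append(4)
b[len(b):] = [4]
print(a == b)  # True

a.extend('xy')       # any iterable works; a string yields its characters
b[len(b):] = 'xy'
print(a)       # [1, 2, 3, 4, 'x', 'y']
print(a == b)  # True

# index() with start and end searches only that slice,
# but the result is an index into the whole list.
letters = ['a', 'b', 'c', 'a', 'b', 'c']
pos = letters.index('a', 2, 5)
print(pos)  # 3

# copy() is shallow: the outer list is new, the inner lists are shared.
nested = [[1], [2]]
shallow = nested.copy()
shallow.append([3])     # affects only the copy
shallow[0].append(99)   # visible through both lists
print(nested)   # [[1, 99], [2]]
print(shallow)  # [[1, 99], [2], [3]]
```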

An example that uses most of the list methods:

>>> fruits = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana']
>>> fruits.count('apple')
2
>>> fruits.count('tangerine')
0
>>> fruits.index('banana')
3
>>> fruits.index('banana', 4)  # Find next banana starting at position 4
6
>>> fruits.reverse()
>>> fruits
['banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange']
>>> fruits.append('grape')
>>> fruits
['banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange', 'grape']
>>> fruits.sort()
>>> fruits
['apple', 'apple', 'banana', 'banana', 'grape', 'kiwi', 'orange', 'pear']
>>> fruits.pop()
'pear'

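Methods such as insert, remove, or sort() that only modify the list print no return value; they return the default None. This is a design principle for all mutable data structures in Python. Some other languages return the mutated object instead so that calls can be chained, as in d->insert("a")->remove("b")->sort();.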
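
A short sketch of what the None return value means in practice. The names (numbers, original, top_two) are illustrative; sorted() is the built-in that returns a new list and can therefore be used inside a larger expression.

```python
numbers = [3, 1, 2]

# sort() reorders the list in place and returns None.
result = numbers.sort()
print(result)   # None
print(numbers)  # [1, 2, 3]

# sorted() leaves its argument alone and returns a new list,
# so its result can be sliced or passed on directly.
original = [3, 1, 2]
top_two = sorted(original, reverse=True)[:2]
print(top_two)   # [3, 2]
print(original)  # [3, 1, 2]
```

A common mistake is writing numbers = numbers.sort(), which rebinds the name to None.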

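Not all objects can be sorted or compared. For example, [None, 'hello', 10] cannot be sorted, because integers cannot be compared with strings and None cannot be compared with other types.

Some types also have no defined ordering relation. For example, 3+4j < 5+7j is not a valid comparison.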
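
The sketch below shows both failures as TypeError in Python 3. The key functions used afterwards (str and abs) are only one possible workaround, chosen here for illustration; whether such an ordering is meaningful depends on the data.

```python
mixed = [None, 'hello', 10]

# Sorting values of unrelated types fails.
try:
    sorted(mixed)
except TypeError:
    mixed_sortable = False
else:
    mixed_sortable = True
print(mixed_sortable)  # False

# Complex numbers support == but not <.
try:
    (3+4j) < (5+7j)
except TypeError:
    complex_orderable = False
else:
    complex_orderable = True
print(complex_orderable)  # False

# Illustrative workaround: sort by a derived key that is comparable.
by_text = sorted(mixed, key=str)                     # compares '10', 'None', 'hello'
by_magnitude = sorted([3+4j, 1+1j, 5+7j], key=abs)   # compares absolute values
print(by_text)       # [10, None, 'hello']
print(by_magnitude)  # [(1+1j), (3+4j), (5+7j)]
```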

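Using Lists as Stacks

The list methods make it very easy to use a list as a stack, a "last-in, first-out" data structure. To add an item to the top of the stack, use append(). To retrieve an item from the top of the stack, use pop() without an explicit index.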

>>> stack = [3, 4, 5]
>>> stack.append(6)
>>> stack.append(7)
>>> stack
[3, 4, 5, 6, 7]
>>> stack.pop()
7
>>> stack
[3, 4, 5, 6]
>>> stack.pop()
6
>>> stack.pop()
5
>>> stack
[3, 4]

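Using Lists as Queues

A list can also be used as a queue, a "first-in, first-out" data structure. However, lists are not efficient for this purpose: while append() and pop() at the end of a list are fast, insert() and pop() at the beginning are slow, because all of the other elements have to be shifted by one.

To implement a queue, use collections.deque, which supports fast append() and pop() operations at both ends.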

>>> from collections import deque
>>> queue = deque(["Eric", "John", "Michael"])
>>> queue.append("Terry")           # Terry arrives
>>> queue.append("Graham")          # Graham arrives
>>> queue.popleft()                 # The first to arrive now leaves
'Eric'
>>> queue.popleft()                 # The second to arrive now leaves
'John'
>>> queue                           # Remaining queue in order of arrival
deque(['Michael', 'Terry', 'Graham'])
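
For comparison, here is a small sketch that runs the same first-in, first-out step with a plain list and with a deque. Both give the same result; the difference is only in cost, since list.pop(0) has to shift every remaining element. The variable names are illustrative.

```python
from collections import deque

arrivals = ["Eric", "John", "Michael"]

# A plain list works as a queue, but pop(0) shifts every remaining item.
list_queue = list(arrivals)
first_from_list = list_queue.pop(0)

# deque.popleft() removes from the left end without that shifting.
deque_queue = deque(arrivals)
first_from_deque = deque_queue.popleft()

print(first_from_list, first_from_deque)  # Eric Eric
print(list_queue)         # ['John', 'Michael']
print(list(deque_queue))  # ['John', 'Michael']
```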

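Functional Programming Tools (Python 2)

Three functions are very useful with lists: filter(), map(), and reduce(). The sessions in this section show Python 2 behavior; in Python 3, filter() and map() return iterators and reduce() lives in the functools module.

filter

filter(function, sequence) returns a sequence consisting of those items from sequence for which function(item) is true. Where possible, the result has the same type as sequence.

The following program computes the numbers in a range that are divisible by neither 2 nor 3 (in this range they all happen to be prime):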

>>> def f(x): return x % 2 != 0 and x % 3 != 0
...
>>> filter(f, range(2, 25))
[5, 7, 11, 13, 17, 19, 23]

map

map(function, sequence) calls function(item) for each of the sequence's items and returns a list of the return values.

The following program computes some cubes:

>>> def cube(x): return x*x*x
...
>>> map(cube, range(1, 11))
[1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]

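More than one sequence may be passed; the function must then take as many arguments as there are sequences, and it is called with the corresponding item from each sequence (in Python 2, None is substituted if some sequence is shorter than another).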

>>> seq = range(8)
>>> def add(x, y): return x+y
...
>>> map(add, seq, seq)
[0, 2, 4, 6, 8, 10, 12, 14]

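reduce

reduce(func, sequence) returns a single value, built by calling the binary function on the first two items of the sequence, then on the result and the next item, and so on.

The following program computes the sum of the integers from 1 through 10: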

>>> def add(x,y): return x+y
...
>>> reduce(add, range(1, 11))
55

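If the sequence contains only one item, that item is returned; if the sequence is empty, an exception is raised.

A third argument can be passed to supply a starting value. In that case the starting value is returned for an empty sequence; otherwise the function is first applied to the starting value and the first item of the sequence, then to the result and the next item, and so on. For example: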

>>> def sum(seq):
...     def add(x,y): return x+y
...     return reduce(add, seq, 0)
...
>>> sum(range(1, 11))
55
>>> sum([])
0

Don't use this example's definition of sum(): because summing numbers is such a common need, a built-in function sum(sequence) has been provided since version 2.3.
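
The sessions in this section are Python 2. As a rough Python 3 counterpart, the sketch below reruns the same computations. The function name not_multiple_of_2_or_3 is a renaming of f introduced here for readability, and the shorter second range passed to map() is added only to show the changed padding behavior.

```python
from functools import reduce  # reduce() is not a built-in in Python 3

def not_multiple_of_2_or_3(x):
    return x % 2 != 0 and x % 3 != 0

def cube(x):
    return x * x * x

def add(x, y):
    return x + y

# filter() and map() return lazy iterators in Python 3; list() materializes them.
filtered = list(filter(not_multiple_of_2_or_3, range(2, 25)))
cubes = list(map(cube, range(1, 11)))
print(filtered)  # [5, 7, 11, 13, 17, 19, 23]
print(cubes)     # [1, 8, 27, 64, 125, 216, 343, 512, 729, 1000]

# With several iterables, Python 3's map() stops at the shortest one
# instead of padding with None.
sums = list(map(add, range(8), range(4)))
print(sums)  # [0, 2, 4, 6]

total = reduce(add, range(1, 11))
empty_total = reduce(add, [], 0)  # the starting value is returned for an empty sequence
print(total, empty_total)  # 55 0
```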

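List Comprehensions

List comprehensions provide a concise way to create lists without resorting to map(), filter(), or lambda. Common applications are:

Making a new list where each element is the result of some operation applied to each member of another sequence or iterable.
Creating a subsequence of those elements that satisfy a certain condition.

For example, to create a list of squares: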

>>> squares = []
>>> for x in range(10):
...     squares.append(x**2)
...
>>> squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

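Note that this creates (or overwrites) a variable named x that still exists after the loop completes. The list of squares can be calculated without that side effect using: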

squares = list(map(lambda x: x**2, range(10)))

or, equivalently:

squares = [x**2 for x in range(10)]  # more concise and readable

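A list comprehension consists of brackets containing an expression followed by a for clause, then zero or more for or if clauses. The result is a new list obtained by evaluating the expression in the context of the for and if clauses that follow it.

For example, this list comprehension combines the elements of two lists if they are not equal: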

>>> [(x, y) for x in [1,2,3] for y in [3,1,4] if x != y]
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

and it is equivalent to:

>>> combs = []
>>> for x in [1,2,3]:
...     for y in [3,1,4]:
...         if x != y:
...             combs.append((x, y))
...
>>> combs
[(1, 3), (1, 4), (2, 3), (2, 1), (2, 4), (3, 1), (3, 4)]

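Note that the order of the for and if statements is the same in both snippets.

If the expression is a tuple, such as the (x, y) in the previous example, it must be parenthesized.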

>>> vec = [-4, -2, 0, 2, 4]
>>> # create a new list with the values doubled
>>> [x*2 for x in vec]
[-8, -4, 0, 4, 8]
>>> # filter the list to exclude negative numbers
>>> [x for x in vec if x >= 0]
[0, 2, 4]
>>> # apply a function to all the elements
>>> [abs(x) for x in vec]
[4, 2, 0, 2, 4]
>>> # call a method on each element
>>> freshfruit = ['  banana', '  loganberry ', 'passion fruit  ']
>>> [weapon.strip() for weapon in freshfruit]
['banana', 'loganberry', 'passion fruit']
>>> # create a list of 2-tuples like (number, square)
>>> [(x, x**2) for x in range(6)]
[(0, 0), (1, 1), (2, 4), (3, 9), (4, 16), (5, 25)]
>>> # the tuple must be parenthesized, otherwise an error is raised
>>> [x, x**2 for x in range(6)]
  File "<stdin>", line 1, in <module>
    [x, x**2 for x in range(6)]
               ^
SyntaxError: invalid syntax
>>> # flatten a list using a listcomp with two 'for'
>>> vec = [[1,2,3], [4,5,6], [7,8,9]]
>>> [num for elem in vec for num in elem]
[1, 2, 3, 4, 5, 6, 7, 8, 9]

List comprehensions can contain complex expressions and nested functions:

>>> from math import pi
>>> [str(round(pi, i)) for i in range(1, 6)]
['3.1', '3.14', '3.142', '3.1416', '3.14159']

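Nested List Comprehensions

The initial expression in a list comprehension can be any arbitrary expression, including another list comprehension.

Consider the following 3x4 matrix, implemented as a list of 3 lists of length 4: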

>>> matrix = [
...     [1, 2, 3, 4],
...     [5, 6, 7, 8],
...     [9, 10, 11, 12],
... ]

The following list comprehension transposes rows and columns:

>>> [[row[i] for row in matrix] for i in range(4)]
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

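As seen in the previous section, the nested list comprehension is evaluated in the context of the for that follows it, so this example is equivalent to: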

>>> transposed = []
>>> for i in range(4):
...     transposed.append([row[i] for row in matrix])
...
>>> transposed
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

which, in turn, is the same as:

>>> transposed = []
>>> for i in range(4):
...     # the following 3 lines implement the nested listcomp
...     transposed_row = []
...     for row in matrix:
...         transposed_row.append(row[i])
...     transposed.append(transposed_row)
...
>>> transposed
[[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]

In real code, prefer built-in functions to complex flow statements. The zip() function does a good job for this use case:

>>> list(zip(*matrix))
[(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)]

See Unpacking Argument Lists for details on the asterisk in front of the argument.
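
One difference from the comprehension version is that zip() yields tuples, not lists. If rows of the transposed matrix need to be mutable lists, a sketch like the following converts each column; the names as_tuples and as_lists are illustrative.

```python
matrix = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

# zip(*matrix) is the same call as zip(matrix[0], matrix[1], matrix[2]).
as_tuples = list(zip(*matrix))
print(as_tuples)  # [(1, 5, 9), (2, 6, 10), (3, 7, 11), (4, 8, 12)]

# Convert each column tuple to a list to match the nested-comprehension result.
as_lists = [list(column) for column in zip(*matrix)]
print(as_lists)   # [[1, 5, 9], [2, 6, 10], [3, 7, 11], [4, 8, 12]]
```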

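The del statement

The del statement removes an item from a list given its index. It can also be used to remove slices from a list or to clear the entire list (which was done earlier by assigning an empty list to the slice).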

>>> a = [-1, 1, 66.25, 333, 333, 1234.5]
>>> del a[0]
>>> a
[1, 66.25, 333, 333, 1234.5]
>>> del a[2:4]
>>> a
[1, 66.25, 1234.5]
>>> del a[:]
>>> a
[]

del can also be used to delete entire variables:

>>> del a

Referencing the name a after this is an error, at least until another value is assigned to it.
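
A minimal sketch of that error. It also shows that del a removes the name, not necessarily the object: the second name, alias, is an assumption added here to make that visible.

```python
a = [1, 2, 3]
alias = a   # a second name for the same list object

del a       # removes the name a only

try:
    a
except NameError:
    name_exists = False
else:
    name_exists = True

print(name_exists)  # False
print(alias)        # [1, 2, 3]  (the list itself is still reachable)

a = 'rebound'       # assigning again makes the name usable
print(a)            # rebound
```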

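Tuples and Sequences

Lists and strings have many common properties, such as indexing and slicing operations. They are two examples of sequence data types (see Sequence Types: list, tuple, range).

Since Python is an evolving language, other sequence data types may be added.

A tuple consists of a number of values separated by commas: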

>>> t = 12345, 54321, 'hello!'
>>> t[0]
12345
>>> t
(12345, 54321, 'hello!')
>>> # Tuples may be nested:
... u = t, (1, 2, 3, 4, 5)
>>> u
((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))
>>> # Tuples are immutable:
... t[0] = 88888
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'tuple' object does not support item assignment
>>> # but they can contain mutable objects:
... v = ([1, 2, 3], [3, 2, 1])
>>> v
([1, 2, 3], [3, 2, 1])

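On output, tuples are always enclosed in parentheses so that nested tuples are interpreted correctly. On input, the surrounding parentheses are optional, although they are often necessary anyway (when the tuple is part of a larger expression).

It is not possible to assign to the individual items of a tuple. It is possible, however, to create tuples that contain mutable objects, such as lists.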
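
The distinction can be seen in a short sketch: replacing an item of the tuple fails, while changing a list stored inside the tuple succeeds. The flag name rebind_allowed is illustrative.

```python
v = ([1, 2, 3], [3, 2, 1])

# Replacing an item of the tuple is rejected...
try:
    v[0] = [9, 9, 9]
except TypeError:
    rebind_allowed = False
else:
    rebind_allowed = True
print(rebind_allowed)  # False

# ...but the list stored in that position can still be changed in place.
v[0].append(4)
print(v)  # ([1, 2, 3, 4], [3, 2, 1])
```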

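Though tuples may seem similar to lists, they are often used in different situations and for different purposes.

Tuples are immutable and usually contain a heterogeneous sequence of elements that are accessed via unpacking or indexing (or even by attribute in the case of namedtuples).
Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list.

A special problem is the construction of tuples containing 0 or 1 items: the syntax has some extra quirks to accommodate these. An empty tuple is constructed by an empty pair of parentheses; a tuple with one item is constructed by following a value with a comma (it is not sufficient to enclose a single value in parentheses). Ugly, but effective. For example: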

>>> empty = ()
>>> singleton = 'hello',    # <-- note trailing comma
>>> len(empty)
0
>>> len(singleton)
1
>>> singleton
('hello',)

The statement t = 12345, 54321, 'hello!' is an example of tuple packing: the values 12345, 54321 and 'hello!' are packed together in a tuple. The reverse operation is also possible:

>>> x, y, z = t

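This is called, appropriately enough, sequence unpacking, and it works for any sequence on the right-hand side. Sequence unpacking requires that there are as many variables on the left side of the equals sign as there are elements in the sequence. Note that multiple assignment is really just a combination of tuple packing and sequence unpacking.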
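
A brief sketch of packing and unpacking together, including what happens when the counts do not match. The names first, second, p, q, and counts_match are illustrative.

```python
t = 12345, 54321, 'hello!'
x, y, z = t

# Packing on the right and unpacking on the left swaps two names
# without a temporary variable.
x, y = y, x
print(x, y)  # 54321 12345

# Unpacking works for any sequence, for example a string.
first, second = 'ab'
print(first, second)  # a b

# The number of targets must match the number of elements.
try:
    p, q = t
except ValueError:
    counts_match = False
else:
    counts_match = True
print(counts_match)  # False
```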

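Sets

A set is an unordered collection with no duplicate elements. Basic uses include membership testing and eliminating duplicate entries. Set objects also support mathematical operations like union, intersection, difference, and symmetric difference.

Curly braces or the set() function can be used to create sets. Note: to create an empty set you have to use set(), not {}; the latter creates an empty dictionary.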

>>> basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
>>> print(basket)                      # show that duplicates have been removed
{'orange', 'banana', 'pear', 'apple'}
>>> 'orange' in basket                 # fast membership testing
True
>>> 'crabgrass' in basket
False

>>> # Demonstrate set operations on unique letters from two words
...
>>> a = set('abracadabra')
>>> b = set('alacazam')
>>> a                                  # unique letters in a
{'a', 'r', 'b', 'c', 'd'}
>>> a - b                              # letters in a but not in b
{'r', 'd', 'b'}
>>> a | b                              # letters in a or b or both
{'a', 'c', 'r', 'd', 'b', 'm', 'z', 'l'}
>>> a & b                              # letters in both a and b
{'a', 'c'}
>>> a ^ b                              # letters in a or b but not both
{'r', 'd', 'b', 'm', 'z', 'l'}
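
The caveat about empty braces is easy to verify. A minimal sketch, with illustrative variable names:

```python
empty_dict = {}
empty_set = set()
print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set

# Non-empty braces holding bare values (no key: value pairs) do build a set.
letters = {'a', 'b', 'a'}
print(type(letters).__name__)  # set
print(len(letters))            # 2
```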

Similarly to list comprehensions, set comprehensions are also supported:

>>> a = {x for x in 'abracadabra' if x not in 'abc'}
>>> a
{'r', 'd'}

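Dictionaries

For the dictionary data type, see Mapping Types: dict. Dictionaries are sometimes found in other languages as "associative memories" or "associative arrays". Unlike sequences, which are indexed by a range of numbers, dictionaries are indexed by keys, which can be any immutable type; strings and numbers can always be keys. Tuples can be used as keys if they contain only strings, numbers, or tuples; if a tuple contains any mutable object either directly or indirectly, it cannot be used as a key. Lists cannot be used as keys, since they can be modified in place using index assignments, slice assignments, or methods like append() and extend().

It is best to think of a dictionary as a set of key:value pairs, with the requirement that the keys are unique within one dictionary. A pair of braces creates an empty dictionary: {}. Placing a comma-separated list of key:value pairs within the braces adds initial key:value pairs to the dictionary; this is also the way dictionaries are written on output.

The main operations on a dictionary are storing a value with some key and extracting the value given the key. It is also possible to delete a key:value pair with del. If you store using a key that is already in use, the old value associated with that key is forgotten. It is an error to extract a value using a non-existent key.

Performing list(d) on a dictionary returns a list of all the keys used in the dictionary, in insertion order; if you want it sorted, use sorted(d) instead. To check whether a single key is in the dictionary, use the in keyword.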

>>> tel = {'jack': 4098, 'sape': 4139}
>>> tel['guido'] = 4127
>>> tel
{'jack': 4098, 'sape': 4139, 'guido': 4127}
>>> tel['jack']
4098
>>> del tel['sape']
>>> tel['irv'] = 4127
>>> tel
{'jack': 4098, 'guido': 4127, 'irv': 4127}
>>> list(tel)
['jack', 'guido', 'irv']
>>> sorted(tel)
['guido', 'irv', 'jack']
>>> 'guido' in tel
True
>>> 'jack' not in tel
False
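
The rules about keys and missing entries can be sketched as follows. The dictionary grid and the flag names are illustrative; dict.get() is shown as one standard way to read a key that may be absent.

```python
grid = {}
grid[(0, 1)] = 'tuple of numbers'   # immutable all the way down: allowed

# A list cannot be a key.
try:
    grid[[0, 1]] = 'list key'
except TypeError:
    list_key_ok = False
else:
    list_key_ok = True

# Neither can a tuple that contains a list.
try:
    grid[(0, [1])] = 'tuple holding a list'
except TypeError:
    nested_key_ok = False
else:
    nested_key_ok = True

print(list_key_ok, nested_key_ok)  # False False

# Reading a key that does not exist raises KeyError...
tel = {'jack': 4098}
try:
    tel['guido']
except KeyError:
    missing = True
else:
    missing = False
print(missing)  # True

# ...while get() returns a default instead of raising.
fallback = tel.get('guido', 0)
print(fallback)  # 0
```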

The dict() constructor builds dictionaries directly from sequences of key-value pairs:

>>> dict([('sape', 4139), ('guido', 4127), ('jack', 4098)])
{'sape': 4139, 'guido': 4127, 'jack': 4098}

In addition, dict comprehensions can be used to create dictionaries from arbitrary key and value expressions:

>>> {x: x**2 for x in (2, 4, 6)}
{2: 4, 4: 16, 6: 36}

When the keys are simple strings, it is sometimes easier to specify pairs using keyword arguments:

>>> dict(sape=4139, guido=4127, jack=4098)
{'sape': 4139, 'guido': 4127, 'jack': 4098}

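Looping Techniques

items()

When looping through dictionaries, the key and corresponding value can be retrieved at the same time using the items() method.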

>>> knights = {'gallahad': 'the pure', 'robin': 'the brave'}
>>> for k, v in knights.items():
...     print(k, v)
...
gallahad the pure
robin the brave

enumerate()

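When looping through a sequence, the position index and corresponding value can be retrieved at the same time using the enumerate() function.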

>>> for i, v in enumerate(['tic', 'tac', 'toe']):
...     print(i, v)
...
0 tic
1 tac
2 toe

zip()

To loop over two or more sequences at the same time, the entries can be paired with the zip() function.

>>> questions = ['name', 'quest', 'favorite color']
>>> answers = ['lancelot', 'the holy grail', 'blue']
>>> for q, a in zip(questions, answers):
...     print('What is your {0}?  It is {1}.'.format(q, a))
...
What is your name?  It is lancelot.
What is your quest?  It is the holy grail.
What is your favorite color?  It is blue.

reversed()

To loop over a sequence in reverse, call the reversed() function on it.

>>> for i in reversed(range(1, 10, 2)):
...     print(i)
...
9
7
5
3
1

sorted()

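To loop over a sequence in sorted order, use the sorted() function, which returns a new sorted list while leaving the source unaltered.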

>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> for i in sorted(basket):
...     print(i)
...
apple
apple
banana
orange
orange
pear

set()

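Calling set() on a sequence eliminates duplicate elements. Using sorted() in combination with set() is an idiomatic way to loop over the unique elements of a sequence in sorted order.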

>>> basket = ['apple', 'orange', 'apple', 'pear', 'orange', 'banana']
>>> for f in sorted(set(basket)):
...     print(f)
...
apple
banana
orange
pear

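Recommendation

It is sometimes tempting to change a list while you are looping over it; however, it is often simpler and safer to create a new list instead.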

>>> import math
>>> raw_data = [56.2, float('NaN'), 51.7, 55.3, 52.5, float('NaN'), 47.8]
>>> filtered_data = []
>>> for value in raw_data:
...     if not math.isnan(value):
...         filtered_data.append(value)
...
>>> filtered_data
[56.2, 51.7, 55.3, 52.5, 47.8]
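
The same "build a new list" approach can also be written as a list comprehension. This is an equivalent sketch of the loop above, not a different technique; the original list is left untouched.

```python
import math

raw_data = [56.2, float('NaN'), 51.7, 55.3, 52.5, float('NaN'), 47.8]

# Keep every value that is not NaN; raw_data itself is not modified.
filtered_data = [value for value in raw_data if not math.isnan(value)]

print(filtered_data)  # [56.2, 51.7, 55.3, 52.5, 47.8]
print(len(raw_data))  # 7
```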

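More on Conditions

The conditions used in while and if statements can contain any operators, not just comparisons.

The comparison operators in and not in check whether a value occurs (or does not occur) in a sequence. The operators is and is not compare whether two objects are really the same object. All comparison operators have the same priority, which is lower than that of all numerical operators.

Comparisons can be chained. For example, a < b == c tests whether a is less than b and moreover b equals c.

Comparisons may be combined using the Boolean operators and and or, and the outcome of a comparison (or of any other Boolean expression) may be negated with not. These have lower priorities than comparison operators; between them, not has the highest priority and or the lowest, so that A and not B or C is equivalent to (A and (not B)) or C. As always, parentheses can be used to express the desired composition.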
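
A small sketch of chaining and precedence. The values assigned to a, b, c and A, B, C are arbitrary, picked so that a different grouping of the Boolean operators would give a different answer.

```python
a, b, c = 1, 2, 2

# A chained comparison is the conjunction of its pairwise comparisons.
chained = a < b == c
spelled_out = (a < b) and (b == c)
print(chained, spelled_out)  # True True

A, B, C = False, False, True

implicit = A and not B or C           # parsed as (A and (not B)) or C
explicit = (A and (not B)) or C
other_grouping = A and (not B or C)   # a different grouping, for contrast

print(implicit)        # True
print(explicit)        # True
print(other_grouping)  # False
```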

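The Boolean operators and and or are so-called short-circuit operators: their arguments are evaluated from left to right, and evaluation stops as soon as the outcome is determined. For example, if A and C are true but B is false, A and B and C does not evaluate the expression C. When used as a general value and not as a Boolean, the return value of a short-circuit operator is the last evaluated argument.

It is possible to assign the result of a comparison or other Boolean expression to a variable. For example: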

>>> string1, string2, string3 = '', 'Trondheim', 'Hammer Dance'
>>> non_null = string1 or string2 or string3
>>> non_null
'Trondheim'
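
To make the short-circuit behavior visible, the sketch below records which operands are actually evaluated. The helper check() and the list calls are hypothetical names introduced only for this illustration.

```python
calls = []

def check(name, value):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return value

# B is false, so C is never evaluated.
result = check('A', True) and check('B', False) and check('C', True)
print(result)  # False
print(calls)   # ['A', 'B']

# The value of the expression is the last operand that was evaluated.
default = '' or 0 or 'fallback'   # first truthy operand
last = 1 and 'x' and [2]          # all truthy, so the final operand
print(default)  # fallback
print(last)     # [2]
```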

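Note that in Python, unlike C, assignment inside expressions must be done explicitly with the walrus operator :=. This avoids a common class of problems encountered in C programs: typing = in an expression when == was intended.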
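
A minimal sketch of an assignment expression, assuming Python 3.8 or later (where := was introduced). The list data and the message text are made up for illustration.

```python
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n is assigned and compared in the same expression.
if (n := len(data)) > 10:
    message = f"List is too long ({n} elements, expected <= 10)"
else:
    message = "ok"

print(message)  # List is too long (11 elements, expected <= 10)
```

Writing if n = len(data): instead is a SyntaxError in Python, so the accidental-assignment bug cannot pass silently.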

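Comparing Sequences and Other Types

Sequence objects typically may be compared to other objects with the same sequence type. The comparison uses lexicographical ordering: first the first two items are compared, and if they differ this determines the outcome of the comparison; if they are equal, the next two items are compared, and so on, until either sequence is exhausted.

If two items to be compared are themselves sequences of the same type, the lexicographical comparison is carried out recursively.
If all items of two sequences compare equal, the sequences are considered equal.
If one sequence is an initial sub-sequence of the other, the shorter sequence is the smaller (lesser) one.
Lexicographical ordering for strings uses the Unicode code point number to order individual characters.

Some examples of comparisons between sequences of the same type: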

(1, 2, 3)              < (1, 2, 4)
[1, 2, 3]              < [1, 2, 4]
'ABC' < 'C' < 'Pascal' < 'Python'
(1, 2, 3, 4)           < (1, 2, 4)
(1, 2)                 < (1, 2, -1)
(1, 2, 3)             == (1.0, 2.0, 3.0)
(1, 2, ('aa', 'ab'))   < (1, 2, ('abc', 'a'), 4)
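
Each of the comparisons listed above evaluates to True, which can be confirmed in one pass. The final part of this sketch goes one step beyond the listed examples and compares a tuple with a list, to show what "the same sequence type" means in practice; the variable names are illustrative.

```python
same_type_checks = [
    (1, 2, 3) < (1, 2, 4),
    [1, 2, 3] < [1, 2, 4],
    'ABC' < 'C' < 'Pascal' < 'Python',
    (1, 2, 3, 4) < (1, 2, 4),
    (1, 2) < (1, 2, -1),
    (1, 2, 3) == (1.0, 2.0, 3.0),
    (1, 2, ('aa', 'ab')) < (1, 2, ('abc', 'a'), 4),
]
print(all(same_type_checks))  # True

# Ordering a tuple against a list raises TypeError,
# while testing them for equality simply gives False.
try:
    (1, 2) < [1, 2]
except TypeError:
    cross_type_ordering = False
else:
    cross_type_ordering = True

cross_type_equal = (1, 2) == [1, 2]
print(cross_type_ordering, cross_type_equal)  # False False
```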