Practical Python Data Structure Techniques Explained: From Slicing to List Comprehensions

Article link: https://blog.youkuaiyun.com/gitblog_00335/article/details/148360809

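Drawing on the Python data structure operations most often used in data science projects, this article walks through slicing, range generation, binary search, sorting, enumeration, zipping, and list comprehensions, so that readers have practical methods for handling data efficiently.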

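Slicing is one of Python's most powerful tools for working with sequence types (lists, tuples, strings, and so on). The basic syntax is [start:end:step]: the element at start is included in the result, while the element at end is not.

seq = 'Monty Python'
print(seq[6:10])  # Output: 'Pyth' (indices 6 through 9)
print(seq[:5])    # Output: 'Monty' (from the start through index 4)
print(seq[6:])    # Output: 'Python' (from index 6 to the end)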

Step: taking every n-th element

print(seq[::2])  # Output: 'MnyPto' (every other character)

Reversing a sequence

print(seq[::-1])  # Output: 'nohtyP ytnoM'
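
The slicing rules above also extend to negative indices and to bounds that run past the end of the sequence. The following is a minimal sketch of those cases; the name FIRST_WORD is a hypothetical label chosen for illustration, not something from the original examples.

```python
seq = 'Monty Python'

# Negative indices count from the end of the sequence.
last_six = seq[-6:]        # 'Python'
without_last = seq[:-1]    # 'Monty Pytho'

# Slice bounds past the end are clipped instead of raising IndexError.
clipped = seq[6:100]       # 'Python'

# A slice object gives a reusable, named range (FIRST_WORD is hypothetical).
FIRST_WORD = slice(0, 5)
first = seq[FIRST_WORD]    # 'Monty'

print(last_six, without_last, clipped, first)
```

Naming a slice this way can help when the same index range is applied to several sequences.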

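Slices can be assigned to as well as read, and the assigned sequence does not have to match the length of the slice it replaces:

seq = [1, 1, 2, 3, 5, 8, 13]
seq[5:] = ['H', 'a', 'l', 'l']  # replace the last two elements with four new ones
print(seq)  # Output: [1, 1, 2, 3, 5, 'H', 'a', 'l', 'l']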
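
Because the lengths need not match, slice assignment can also shrink a list or insert into it, and del works on slices too. A short sketch of those three cases, with snapshot variable names chosen only for illustration:

```python
seq = [1, 1, 2, 3, 5, 8, 13]

# Assigning a shorter list shrinks the original: three items become one.
seq[1:4] = ['x']
after_shrink = list(seq)   # [1, 'x', 5, 8, 13]

# Assigning to an empty slice inserts without replacing anything.
seq[1:1] = ['a', 'b']
after_insert = list(seq)   # [1, 'a', 'b', 'x', 5, 8, 13]

# del on a slice removes those items.
del seq[-2:]
after_delete = list(seq)   # [1, 'a', 'b', 'x', 5]

print(after_shrink, after_insert, after_delete)
```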

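Generating sequences of integers is a common need, and the built-in range covers it. In Python 3, range returns a lazy sequence object, so wrap it in list() to see the values:

print(list(range(10)))         # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(list(range(0, 20, 3)))   # [0, 3, 6, 9, 12, 15, 18]

For large ranges this laziness is what makes range memory-friendly: it computes values on demand instead of building a full list. Python 2 offered the same behavior under the name xrange, which no longer exists in Python 3:

count = 0  # avoid the name sum, which would shadow the built-in
for i in range(100000):  # no 100000-element list is created
    if i % 2 == 0:
        count += 1  # counts the even numbers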
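
To make the memory point concrete, the sketch below compares a small and a large range in Python 3. The size comparison relies on sys.getsizeof as implemented in CPython, which is an assumption about the interpreter; the remaining lines use only documented range behavior.

```python
import sys

small = range(10)
large = range(10_000_000)

# A range stores only start, stop and step, so (in CPython) its size
# does not grow with the number of values it represents.
same_size = sys.getsizeof(small) == sys.getsizeof(large)

# Ranges support len(), indexing, slicing and membership tests.
length = len(large)                       # 10000000
tenth_from_end = large[-10]               # 9999990
in_range = 9_999_998 in large             # True
first_three = list(range(0, 20, 3)[:3])   # [0, 3, 6]

# Counting even numbers without a loop: a stepped range knows its length.
even_count = len(range(0, 100000, 2))     # 50000

print(same_size, length, tenth_from_end, in_range, first_three, even_count)
```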

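The bisect module provides efficient binary search, but it requires the input sequence to be sorted already:

import bisect
seq = [1, 2, 2, 3, 5, 13]
pos = bisect.bisect(seq, 8)  # index where 8 would be inserted (5)
bisect.insort(seq, 8)        # inserts 8 and keeps the list sorted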
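
When the list contains duplicates, the choice between bisect_left and bisect_right matters. The sketch below shows the difference and builds a membership test on top of it; the helper contains is a hypothetical function written for this example and is not part of the bisect module.

```python
import bisect

seq = [1, 2, 2, 3, 5, 13]

# bisect_left returns the position before any equal items,
# bisect_right (an alias of bisect.bisect) the position after them.
left = bisect.bisect_left(seq, 2)     # 1
right = bisect.bisect_right(seq, 2)   # 3


def contains(sorted_seq, value):
    """Hypothetical helper: binary-search membership test on a sorted list."""
    i = bisect.bisect_left(sorted_seq, value)
    return i < len(sorted_seq) and sorted_seq[i] == value


found_five = contains(seq, 5)    # True
found_eight = contains(seq, 8)   # False

bisect.insort(seq, 8)            # seq is now [1, 2, 2, 3, 5, 8, 13]
print(left, right, found_five, found_eight, seq)
```

The search itself is logarithmic, although insort still has to shift list elements to make room for the new item.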

Python provides two ways to sort:

seq = [1, 5, 3, 9, 7, 6]
seq.sort()  # sorts the list in place and returns None

new_seq = sorted([2, 5, 1, 8, 7, 9])  # returns a new sorted list

Sorting by string length:

seq = ['the', 'quick', 'brown', 'fox', 'jumps', 'over']
seq.sort(key=len)
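
Python's sort is stable, so items with equal keys keep their original relative order. The sketch below shows what that means for the length sort above, along with a tuple key for tie-breaking and the reverse flag; the result variable names are illustrative only.

```python
words = ['the', 'quick', 'brown', 'fox', 'jumps', 'over']

# Stable sort: words of equal length stay in their original order.
by_length = sorted(words, key=len)
# ['the', 'fox', 'over', 'quick', 'brown', 'jumps']

# A tuple key breaks ties: first by length, then alphabetically.
by_length_then_alpha = sorted(words, key=lambda w: (len(w), w))
# ['fox', 'the', 'over', 'brown', 'jumps', 'quick']

# reverse=True sorts in descending order and still preserves stability.
longest_first = sorted(words, key=len, reverse=True)
# ['quick', 'brown', 'jumps', 'over', 'the', 'fox']

print(by_length, by_length_then_alpha, longest_first)
```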

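reversed returns an iterator that walks a sequence backwards, so wrap it in list() to get the values:

print(list(reversed([1, 2, 3])))  # [3, 2, 1]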

Getting the index and the value at the same time:

for i, string in enumerate(['foo', 'bar', 'baz']):
    print(f"{i}: {string}")
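
enumerate also accepts a start argument, which is handy for 1-based numbering, and it pairs naturally with a dict comprehension to build a value-to-index lookup. A small sketch, with variable names chosen for illustration:

```python
names = ['foo', 'bar', 'baz']

# start=1 makes the counter 1-based without changing the data.
numbered = [f"{i}. {name}" for i, name in enumerate(names, start=1)]
# ['1. foo', '2. bar', '3. baz']

# Map each value to its position in the list.
index_of = {name: i for i, name in enumerate(names)}
# {'foo': 0, 'bar': 1, 'baz': 2}

print(numbered, index_of)
```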

The zip function "zips" several sequences together into a sequence of tuples:

numbers = [1, 2, 3]
words = ['one', 'two', 'three']
print(list(zip(numbers, words)))  # [(1, 'one'), (2, 'two'), (3, 'three')]
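
One detail worth knowing is that zip stops at the shortest input. The sketch below shows that truncation, the padding alternative itertools.zip_longest from the standard library, and the common idiom of building a dict from two sequences; the sample data is made up for illustration.

```python
from itertools import zip_longest

numbers = [1, 2, 3]
words = ['one', 'two']   # deliberately one item shorter

# zip silently stops at the shortest input.
truncated = list(zip(numbers, words))
# [(1, 'one'), (2, 'two')]

# zip_longest pads the shorter input with fillvalue instead.
padded = list(zip_longest(numbers, words, fillvalue=None))
# [(1, 'one'), (2, 'two'), (3, None)]

# Pairing keys with values is a common way to build a dict.
lookup = dict(zip(words, numbers))
# {'one': 1, 'two': 2}

print(truncated, padded, lookup)
```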

Unzipping: turning rows into columns

pairs = [(1, 'one'), (2, 'two'), (3, 'three')]
nums, words = zip(*pairs)
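
The star in zip(*pairs) unpacks the list so that each pair becomes a separate argument, and the results come back as tuples rather than lists. The same trick transposes a matrix, as this sketch shows; the small matrix is example data.

```python
pairs = [(1, 'one'), (2, 'two'), (3, 'three')]

# zip(*pairs) is equivalent to zip(pairs[0], pairs[1], pairs[2]).
nums, words = zip(*pairs)
# nums == (1, 2, 3), words == ('one', 'two', 'three')

# The same idiom transposes a matrix stored as rows.
matrix = [(1, 2, 3), (4, 5, 6)]
transposed = list(zip(*matrix))
# [(1, 4), (2, 5), (3, 6)]

print(nums, words, transposed)
```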

List comprehensions are one of Python's most elegant features, combining filtering and transformation in a single expression:

strings = ['foo', 'bar', 'baz', 'b']
print([s.upper() for s in strings if s.startswith('b')])  # ['BAR', 'BAZ', 'B']

Handling nested structures:

matrix = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]
print([x for row in matrix for x in row])  # flattened: [1, 2, 3, 4, 5, 6, 7, 8, 9]
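
The order of the for clauses in a nested comprehension matches the order of the equivalent nested loops, which the sketch below spells out. It also shows a filter on the flattened values and a comprehension that keeps the nested shape; the variable names are illustrative.

```python
matrix = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# The comprehension reads in the same order as these nested loops.
flat_loop = []
for row in matrix:
    for x in row:
        flat_loop.append(x)

flat = [x for row in matrix for x in row]

# A trailing if clause filters the flattened values.
even_only = [x for row in matrix for x in row if x % 2 == 0]
# [2, 4, 6, 8]

# Nesting one comprehension inside another preserves the row structure.
doubled_rows = [[x * 2 for x in row] for row in matrix]
# [[2, 4, 6], [8, 10, 12], [14, 16, 18]]

print(flat == flat_loop, even_only, doubled_rows)
```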

Dictionaries have comprehensions too:

words = ['one', 'two', 'three']
print({i: word for i, word in enumerate(words)})
# {0: 'one', 1: 'two', 2: 'three'}
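
The same syntax covers a few related cases: inverting a mapping, filtering while building a dict, and building a set. A brief sketch using the same word list, with names chosen for illustration:

```python
words = ['one', 'two', 'three']

index_to_word = {i: word for i, word in enumerate(words)}

# Swap keys and values to invert the mapping.
word_to_index = {word: i for i, word in index_to_word.items()}
# {'one': 0, 'two': 1, 'three': 2}

# A condition filters entries, just as in a list comprehension.
long_word_lengths = {word: len(word) for word in words if len(word) > 3}
# {'three': 5}

# Curly braces without a key: value pair produce a set, dropping duplicates.
distinct_lengths = {len(word) for word in words}
# {3, 5}

print(word_to_index, long_word_lengths, distinct_lengths)
```

Inverting a dict this way assumes the values are unique and hashable; duplicate values would overwrite one another.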

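This article covered Python's core tools for working with data structures, from basic slicing to list comprehensions. These techniques form the foundation of data handling in Python. Mastering them makes code both more efficient and more concise, and applying them flexibly in real data science projects can noticeably improve development speed and code readability.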

Authoring statement: parts of this article were generated with AI assistance (AIGC) and are provided for reference only.