Reading Notes: Learning Python, 5th Edition (Chapter 9: Tuples, Files, and Everything Else)

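These notes cover Python data structures such as tuples and named tuples, the basics of file handling (opening, reading, and writing files), and the use of the pickle, json, and struct modules for serializing data.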

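Tuple operations
The collections module provides other versatile types
File operations
Files are buffered, and files opened in binary mode can be repositioned with seek
The best way to read a text file is to iterate over it directly
Text-mode files encode and decode text automatically, using the platform's default encoding (often UTF-8) unless encoding is given
Four ways to copy objects
Comparison rules for Python's built-in objects
Comparisons of built-in objects are recursive
Every Python object is either true or false
None is a special object: a real piece of memory with a special built-in name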

Chap9 Tuples, Files, and Everything Else

9.1 Tuple

Operation	Interpretation
()	An empty tuple
T = (0,)	A one-item tuple (not an expression)
T = (0, 'Ni', 1.2, 3)	A four-item tuple
T = 0, 'Ni', 1.2, 3	Another four-item tuple (the parentheses are optional)
T = ('Bob', ('dev', 'mgr'))	Nested tuples
T = tuple('spam')	Tuple of items in an iterable
T[i]	Index
T[i][j]	Index of index
T[i:j]	Slice
len(T)	Length
T1 + T2	Concatenate
T * 3	Repeat
for x in T: print(x)	Iteration
'spam' in T	Membership
[x ** 2 for x in T]	Iteration in a list comprehension
T.index('Ni')	Search (method in 2.6, 2.7, and 3.X)
T.count('Ni')	Count (method in 2.6, 2.7, and 3.X)
namedtuple('Emp', ['name', 'jobs'])	Named tuple extension type

9.2 Named Tuples

>>> from collections import namedtuple # Import extension type
>>> Rec = namedtuple('Rec', ['name', 'age', 'jobs']) # Make a generated class
>>> bob = Rec('Bob', age=40.5, jobs=['dev', 'mgr']) # A named-tuple record
>>> bob
Rec(name='Bob', age=40.5, jobs=['dev', 'mgr'])

>>> bob[0], bob[2] # Access by position
('Bob', ['dev', 'mgr'])
>>> bob.name, bob.jobs # Access by attribute
('Bob', ['dev', 'mgr'])

# Convert to a dictionary (an OrderedDict through Python 3.7, a plain dict since 3.8)
>>> O = bob._asdict() # Dictionary-like form
>>> O['name'], O['jobs'] # Access by key too
('Bob', ['dev', 'mgr'])
>>> O
OrderedDict([('name', 'Bob'), ('age', 40.5), ('jobs', ['dev', 'mgr'])])

# Tuple Unpacking
>>> name, age, jobs = bob # Unpack like any other tuple

9.3 File

Operation	Interpretation
output = open(r'C:\spam', 'w')	Create output file ('w' means write)
input = open('data', 'r')	Create input file ('r' means read)
input = open('data')	Same as prior line ('r' is the default)
aString = input.read()	Read entire file into a single string
aString = input.read(N)	Read up to next N characters (or bytes) into a string
aString = input.readline()	Read next line (including \n newline) into a string
aList = input.readlines()	Read entire file into list of line strings (with \n)
output.write(aString)	Write a string of characters (or bytes) into file
output.writelines(aList)	Write all line strings in a list into file
output.close()	Manual close (done for you when file is collected)
output.flush()	Flush output buffer to disk without closing
anyFile.seek(N)	Change file position to offset N for next operation
for line in open('data'): use line	File iterators read line by line
open('f.txt', encoding='latin-1')	Python 3.X Unicode text files (str strings)
open('f.bin', 'rb')	Python 3.X bytes files (bytes strings)
codecs.open('f.txt', encoding='utf8')	Python 2.X Unicode text files (unicode strings)
open('f.bin', 'rb')	Python 2.X bytes files (str strings)

A file object is its own iterator.

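9.3.1 The open Function

Arguments

open(filename, mode, [buffering, encoding])

1. filename: the name (path) of the file
2. mode:
2.1. w: write (creates the file, or truncates an existing one)
2.2. r: read (the default)
2.3. a: append
2.4. b: binary
2.5. +: read and write, often combined with seek
3. buffering: 0 means unbuffered, so data is transferred immediately instead of waiting in a buffer (in Python 3.X this is allowed only in binary mode)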
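
As a rough illustration of the mode flags listed above, the sketch below writes, appends, and then updates a file in place. The file name modes.txt and the temporary directory are assumptions made for the example, and the with statement is used only so each file is closed promptly.

```python
import os
import tempfile

# Work in a throwaway directory so no real files are touched.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'modes.txt')   # hypothetical file name

with open(path, 'w') as f:        # 'w' creates the file or truncates it
    f.write('hello\n')

with open(path, 'a') as f:        # 'a' appends to the end
    f.write('world\n')

with open(path, 'r+') as f:       # '+' allows both reading and writing
    f.seek(0)                     # position at the start of the file
    f.write('HELLO')              # overwrite the first five characters in place

with open(path) as f:             # 'r' is the default mode
    text = f.read()

with open(path, 'rb') as f:       # adding 'b' yields bytes instead of str
    raw = f.read()

print(repr(text))                 # 'HELLO\nworld\n'
print(type(raw).__name__)         # bytes
```

Note that 'r+' does not truncate the file, which is why the second line survives the in-place update.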

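9.3.2 Using Files

The best way to read a file is to iterate over the file object.
Files are buffered (call flush() to force output) and seekable (call seek(); offsets are most predictable in binary mode).
readline returns an empty string at end of file. This differs from an empty line in the file, which comes back as \n.
write() returns the number of characters written.
close is optional in CPython, because garbage collection reclaims the file object and closes the file along the way. Calling it explicitly is still recommended.
For text files, Python automatically encodes and decodes Unicode text and translates line endings.
write does not format data for you: it will not call str on your objects, so convert them to strings yourself.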
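
The points above can be checked with a small round trip. This is a minimal sketch; the file name notes.txt and the temporary directory are assumptions made for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')   # hypothetical file name

out = open(path, 'w')
count = out.write('first line\n')   # write() returns the number of characters written
out.write('\n')                     # an empty line is just a newline
out.write(str(42) + '\n')           # non-strings must be converted explicitly
out.flush()                         # push buffered data out without closing
out.close()                         # explicit close, as recommended

# Iterating over the file object reads one line at a time.
with open(path) as src:
    lines = [line.rstrip('\n') for line in src]

# readline() distinguishes an empty line from end of file.
with open(path) as src:
    first = src.readline()          # 'first line\n'
    blank = src.readline()          # '\n' for the empty line
    number = src.readline()         # '42\n'
    at_eof = src.readline()         # '' once the file is exhausted

print(count, lines, repr(blank), repr(at_eof))
```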

The eval function evaluates a string as a Python expression and returns the result: eval("EXPRESSION")
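
Because file data comes back as strings, eval is one way to turn stored text back into objects. The sketch below uses a made-up line of text with $ as a separator. The use of ast.literal_eval as a safer alternative for untrusted input is an addition here, not something these notes cover.

```python
import ast

# One line of text, as it might be read back from a file (hypothetical data).
line = "43,44,45$[1, 2, 3]${'a': 1, 'b': 2}\n"

numbers_text, list_text, dict_text = line.rstrip().split('$')

numbers = [int(n) for n in numbers_text.split(',')]   # int() converts digit strings
as_list = eval(list_text)                             # eval() evaluates the string as an expression
as_dict = eval(dict_text)

# Assumption: for untrusted input, ast.literal_eval is a safer stand-in,
# since it accepts only Python literals and never calls functions.
safe_dict = ast.literal_eval(dict_text)

print(numbers, as_list, as_dict)
```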

9.3.3 Pickle

A module for serializing objects

# Write to a file
>>> D = {'a': 1, 'b': 2}
>>> F = open('datafile.pkl', 'wb')
>>> import pickle
>>> pickle.dump(D, F) # Pickle any object to file
>>> F.close()

# Read back from the file
>>> F = open('datafile.pkl', 'rb')
>>> E = pickle.load(F) # Load any object from file
>>> E
{'a': 1, 'b': 2}

The shelve module is built on pickle and stores serialized objects that are fetched by key
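
A shelf behaves much like a dictionary whose values are pickled to a file. The following is a minimal sketch; the shelf name people and the stored records are assumptions made for the example.

```python
import os
import shelve
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), 'people')   # hypothetical shelf name

# Store objects by key; each value is pickled behind the scenes.
with shelve.open(db_path) as db:
    db['bob'] = {'name': 'Bob', 'jobs': ['dev', 'mgr']}
    db['sue'] = {'name': 'Sue', 'jobs': ['cto']}

# Reopen the shelf later and fetch objects by key.
with shelve.open(db_path) as db:
    bob = db['bob']
    keys = sorted(db.keys())

print(keys, bob['jobs'])
```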

9.3.4 json

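json supports fewer object types than pickle

# Strings
>>> import json
>>> rec = {'name': 'Bob', 'age': 40.5, 'jobs': ['dev', 'mgr']} # Sample record (assumed; the notes do not define rec)
>>> S = json.dumps(rec)
>>> O = json.loads(S)

# Reading and writing files
>>> json.dump(rec, fp=open('testjson.txt', 'w'), indent=4)
>>> P = json.load(open('testjson.txt'))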
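
To make the difference in type support concrete, the sketch below round-trips the same made-up record through both modules. The record contents are assumptions chosen to include a tuple and a set.

```python
import json
import pickle

rec = {'name': 'Bob', 'point': (1, 2)}   # hypothetical record containing a tuple

via_json = json.loads(json.dumps(rec))         # the tuple comes back as a list
via_pickle = pickle.loads(pickle.dumps(rec))   # pickle preserves the tuple

# Sets have no JSON equivalent, so json.dumps rejects them.
try:
    json.dumps({'tags': {'a', 'b'}})
    set_ok = True
except TypeError:
    set_ok = False

print(via_json['point'], via_pickle['point'], set_ok)   # [1, 2] (1, 2) False
```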

9.3.5 struct

Builds and parses packed binary data

>>> F = open('data.bin', 'wb') # Open binary output file
>>> import struct
>>> data = struct.pack('>i4sh', 7, b'spam', 8) # Make packed binary data
>>> data
b'\x00\x00\x00\x07spam\x00\x08'
>>> F.write(data) # Write byte string
>>> F.close()
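
The transcript above covers only the packing half. Parsing goes through struct.unpack with the same format string, as in this small sketch, which works on the bytes in memory rather than rereading data.bin.

```python
import struct

fmt = '>i4sh'   # big-endian: 4-byte int, 4-byte string, 2-byte short

data = struct.pack(fmt, 7, b'spam', 8)   # same packed bytes as in the notes
values = struct.unpack(fmt, data)        # parse the bytes back into Python objects
size = struct.calcsize(fmt)              # number of bytes the format occupies

print(values)   # (7, b'spam', 8)
print(size)     # 10
```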

9.3.6 Other File Tools

Standard streams, such as sys.stdout
Descriptor files in the os module
Sockets, pipes, and FIFOs
Shelves, which access serialized objects by key
Shell command streams: subprocess.Popen, os.popen

9.4 Core Python Types Review

Object type	Category	Mutable?
Numbers (all)	Numeric	No
Strings (all)	Sequence	No
Lists	Sequence	Yes
Dictionaries	Mapping	Yes
Tuples	Sequence	No
Files	Extension	N/A
Sets	Set	Yes
Frozenset	Set	No
bytearray	Sequence	Yes

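9.4.1 References Versus Copies

Assignments store references, so if an object must not be changed through another name, copy it first. Slices and copy methods make only top-level copies, so use copy.deepcopy for nested structures.

Slicing a list (L[:]) returns a copy. Note: only a top-level copy, not a deep copy.
Dictionaries, lists, and sets have their own copy method. Note: again, not a deep copy.
Built-in type calls create new objects: list(L), dict(D), set(S)
The copy module's functions: copy, deepcopy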
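
The difference between the copy techniques shows up once the original is changed. A minimal sketch, using a small nested list as assumed sample data:

```python
import copy

L = [1, [2, 3]]

alias = L                    # no copy at all: a second name for the same list
by_slice = L[:]              # top-level copy via slicing
by_method = L.copy()         # top-level copy via the copy method
by_call = list(L)            # top-level copy via the built-in type
deep = copy.deepcopy(L)      # copies the nested list too

L[0] = 'changed'             # top-level change: seen only through the alias
L[1].append(4)               # nested change: seen by every shallow copy

print(alias)      # ['changed', [2, 3, 4]]
print(by_slice)   # [1, [2, 3, 4]]
print(deep)       # [1, [2, 3]]
```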

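9.4.2 Comparisons and Equality

Python's core objects are compared recursively, from the top level down through nested parts, until a result is determined: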

>>> L1 = [1, ('a', 3)] # Same value, unique objects
>>> L2 = [1, ('a', 3)]
>>> L1 == L2, L1 is L2 # Equivalent? Same object?
(True, False)


# Short strings are cached and reused
>>> S1 = 'spam'
>>> S2 = 'spam'
>>> S1 == S2, S1 is S2
(True, True)

# Longer strings are not
>>> S1 = 'a longer string'
>>> S2 = 'a longer string'
>>> S1 == S2, S1 is S2
(True, False)

Nested comparisons:

>>> L1 = [1, ('a', 3)]
>>> L2 = [1, ('a', 2)]
>>> L1 < L2,      L1 == L2,      L1 > L2 # Less, equal, greater: tuple of results
(False, False, True)

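Comparison rules for Python's core types

Numbers are compared by magnitude, after conversion to a common (widest) type if needed.
Strings are compared character by character, by code point value (as returned by ord).
Lists and tuples are compared item by item from left to right, recursing into nested structures.
Sets are equal if they contain the same items.
Dictionaries are equal if their sorted (key, value) lists are equal.
Mixed-type magnitude comparisons of non-numeric objects are errors in Python 3.X: 11 > "11" raises an error, while 11 == "11" is False.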
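
Each rule above can be tried directly. The sketch below assumes Python 3.X. The sorted(D.items()) idiom is the usual workaround for comparing dictionaries by magnitude, which 3.X does not support directly.

```python
results = {
    'numbers': 1 == 1.0,                              # int converted up to float
    'strings': 'abc' < 'ac',                          # ord('b') < ord('c')
    'lists': [1, [2, 3]] < [1, [2, 4]],               # recurses into the nested lists
    'sets': {1, 2} == {2, 1},                         # same items, order irrelevant
    'dicts': {'a': 1, 'b': 2} == {'b': 2, 'a': 1},    # same (key, value) pairs
    'mixed_eq': 11 == '11',                           # different types are never equal
}

# Magnitude comparison across unrelated types raises TypeError in 3.X.
try:
    11 > '11'
    mixed_error = None
except TypeError as exc:
    mixed_error = type(exc).__name__

# Dictionaries do not support < or > in 3.X; compare sorted items instead.
D1 = {'a': 1, 'b': 2}
D2 = {'a': 1, 'b': 3}
dict_less = sorted(D1.items()) < sorted(D2.items())

print(results, mixed_error, dict_less)
```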

9.5 Python's bool Type

Every object in Python is inherently either true or false:

Numbers are false if zero, and true otherwise.
Other objects are false if empty, and true otherwise.

Object	Value
"spam"	True
""	False
" "	True
[1, 2]	True
[]	False
{'a': 1}	True
{}	False
1	True
0.0	False
None	False

None does not mean "nothing" or "undefined". None is a real object occupying a real piece of memory, with a special name built into Python.
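
The table can be reproduced with the bool built-in, and None can be used like any other object. A minimal sketch; the names slots and no_return are made up for the example.

```python
samples = ['spam', '', ' ', [1, 2], [], {'a': 1}, {}, 1, 0.0, None]
truth = [bool(obj) for obj in samples]   # ask each object for its truth value

# None is a real object, so it can fill a list as a placeholder.
slots = [None] * 3

def no_return():
    pass                                  # no return statement

result = no_return()                      # functions return None by default

print(truth)
print(slots, result is None)
```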

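9.6 Type Objects

Types are themselves objects of type type. The names dict, list, str, tuple, int, float, complex, bytes, type, and set (plus file in Python 2.X only) are really constructor calls rather than type conversions, although they can be thought of as conversions.

The types module provides more tools for working with types.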

type([1]) == type([])       # Compare to type of another list
type([1]) == list       # Compare to list type name
isinstance([1], list)      # Test if list or customization thereof
import types            # types has names for other types
def f(): pass

type(f) == types.FunctionType
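
The practical difference between comparing with type() and calling isinstance appears with subclasses. In this sketch, MyList is a hypothetical subclass introduced only for illustration.

```python
import types

class MyList(list):          # hypothetical customization of list
    pass

def f():
    pass

x = MyList([1])

checks = (
    type(x) == list,                       # False: the exact type is MyList
    isinstance(x, list),                   # True: subclasses count as lists
    type(f) == types.FunctionType,         # True: def creates a function object
    isinstance(f, types.FunctionType),     # True
)

print(checks)   # (False, True, True, True)
```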

9.7 Other Gotchas

Repetition Adds One Level Deep

>>> L = [4, 5, 6]
>>> X = L * 4 # Like [4, 5, 6] + [4, 5, 6] + ...
>>> Y = [L] * 4 # [L] + [L] + ... = [L, L,...]
>>> X
[4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6]
>>> Y
[[4, 5, 6], [4, 5, 6], [4, 5, 6], [4, 5, 6]]


# Every item in Y is a reference to L, so they are all the same object
>>> L[1] = 0 # Impacts Y but not X
>>> X
[4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6]
>>> Y
[[4, 0, 6], [4, 0, 6], [4, 0, 6], [4, 0, 6]]
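
One way around the shared references is to copy L for each slot. Copying once and then repeating is not enough, because the single copy is itself repeated by reference. A small sketch with made-up variable names:

```python
L = [4, 5, 6]

shared = [L] * 4                              # four references to L itself
one_copy = [list(L)] * 4                      # four references to one copy of L
independent = [list(L) for _ in range(4)]     # a fresh copy per slot

L[1] = 0                # affects only the lists that reference L
one_copy[0][1] = 99     # affects every slot of one_copy

print(shared)        # [[4, 0, 6], [4, 0, 6], [4, 0, 6], [4, 0, 6]]
print(one_copy)      # [[4, 99, 6], [4, 99, 6], [4, 99, 6], [4, 99, 6]]
print(independent)   # [[4, 5, 6], [4, 5, 6], [4, 5, 6], [4, 5, 6]]
```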

Beware of Cyclic Data Structures

Python prints a cyclic reference as [...]:

>>> L = ['grail'] # Append reference to same object
>>> L.append(L) # Generates cycle in object: [...]
>>> L
['grail', [...]]