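Contents
Key points:
- Tuple operations; the collections module provides other multi-purpose types
- File operations
  - Files are buffered, and binary files are seekable (seek)
  - The best way to read a text file is to iterate over it directly
  - Text-mode file processing encodes and decodes automatically (often UTF-8 by default, depending on the platform)
- Object copying: 4 ways
- Comparison rules for Python built-in objects
  - Comparisons of built-in objects are recursive
- Every Python object is either true or false
- None is a special object that occupies real memory and has a special built-in name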
Chap9 Tuples, Files, and Everything Else
9.1 Tuple
Operation | Interpretation |
---|---|
() | An empty tuple |
T = (0,) | A one-item tuple (not an expression) |
T = (0, 'Ni', 1.2, 3) | A four-item tuple |
T = 0, 'Ni', 1.2, 3 | Another four-item tuple (Python allows the parentheses to be omitted) |
T = ('Bob', ('dev', 'mgr')) | Nested tuples |
T = tuple('spam') | Tuple of items in an iterable |
T[i] | Index |
T[i][j] | Index of index |
T[i:j] | Slice |
len(T) | Length |
T1 + T2 | Concatenate |
T * 3 | Repeat |
for x in T: print(x) | Iteration |
'spam' in T | Membership |
[x ** 2 for x in T] | Iteration in a list comprehension |
T.index('Ni') | Method in 2.6, 2.7, and 3.X: search |
T.count('Ni') | Method in 2.6, 2.7, and 3.X: count |
namedtuple('Emp', ['name', 'jobs']) | Named tuple extension type |
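
The operations in the table can be tried together in one short script. The following is only a minimal sketch; the variable names are made up for illustration.

```python
# Tuples support the sequence operations from the table, but not in-place change.
T = (0, 'Ni', 1.2, 3)

first = T[0]                 # indexing
middle = T[1:3]              # slicing returns a new tuple
combined = T + ('x', 'y')    # concatenation returns a new tuple
position = T.index('Ni')     # search method
occurrences = T.count('Ni')  # count method

# The trailing comma, not the parentheses, makes a one-item tuple.
single = (0,)
not_a_tuple = (0)            # just the integer 0 in parentheses

# Item assignment fails because tuples are immutable.
try:
    T[0] = 99
except TypeError as exc:
    error = type(exc).__name__

print(first, middle, combined, position, occurrences, error)
```

Every operation that appears to change a tuple actually builds a new one, which is why T itself is unchanged at the end.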
9.2 Named Tuples
>>> from collections import namedtuple # Import extension type
>>> Rec = namedtuple('Rec', ['name', 'age', 'jobs']) # Make a generated class
>>> bob = Rec('Bob', age=40.5, jobs=['dev', 'mgr']) # A named-tuple record
>>> bob
Rec(name='Bob', age=40.5, jobs=['dev', 'mgr'])
>>> bob[0], bob[2] # Access by position
('Bob', ['dev', 'mgr'])
>>> bob.name, bob.jobs # Access by attribute
('Bob', ['dev', 'mgr'])
# Convert to an OrderedDict (a plain dict in Python 3.8 and later)
>>> O = bob._asdict() # Dictionary-like form
>>> O['name'], O['jobs'] # Access by key too
('Bob', ['dev', 'mgr'])
>>> O
OrderedDict([('name', 'Bob'), ('age', 40.5), ('jobs', ['dev', 'mgr'])])
# Tuple Unpacking
name, age, jobs = bob
9.3 File
Operation | Interpretation |
---|---|
output = open(r'C:\spam', 'w') | Create output file ('w' means write) |
input = open('data', 'r') | Create input file ('r' means read) |
input = open('data') | Same as prior line ('r' is the default) |
aString = input.read() | Read entire file into a single string |
aString = input.read(N) | Read up to next N characters (or bytes) into a string |
aString = input.readline() | Read next line (including \n newline) into a string |
aList = input.readlines() | Read entire file into list of line strings (with \n) |
output.write(aString) | Write a string of characters (or bytes) into file |
output.writelines(aList) | Write all line strings in a list into file |
output.close() | Manual close (done for you when file is collected) |
output.flush() | Flush output buffer to disk without closing |
anyFile.seek(N) | Change file position to offset N for next operation |
for line in open('data'): use line | File iterators read line by line |
open('f.txt', encoding='latin-1') | Python 3.X Unicode text files (str strings) |
open('f.bin', 'rb') | Python 3.X bytes files (bytes strings) |
codecs.open('f.txt', encoding='utf8') | Python 2.X Unicode text files (unicode strings) |
open('f.bin', 'rb') | Python 2.X bytes files (str strings) |
A file object is its own iterator.
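
To see what being its own iterator means in practice, here is a small sketch. It writes a scratch file first so that it runs anywhere; the file name demo.txt is hypothetical.

```python
import os
import tempfile

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w', encoding='utf-8') as out:
    out.write('first\nsecond\nthird\n')

with open(path, encoding='utf-8') as f:
    same_object = iter(f) is f                 # a file is its own iterator
    head = next(f)                             # fetch one line manually
    rest = [line.rstrip('\n') for line in f]   # the loop resumes after that line

print(same_object, repr(head), rest)
# True 'first\n' ['second', 'third']
```

Because there is only one iterator per file object, the loop continues from wherever the previous read left off.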
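9.3.1 The open function
Parameters
open(filename, mode, [buffer, encoding])
- filename: the name of the file
- mode:
  - w: write
  - r: read
  - a: append
  - b: binary
  - +: both read and write, often combined with seek
- buffer: 0 means unbuffered, so writes are passed on immediately instead of being held in a buffer (in Python 3.X, 0 is allowed only in binary mode)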
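
The mode characters can be combined. The sketch below walks through them on a scratch binary file; the file name modes.bin is hypothetical, and the keyword used for the buffer argument is buffering, as in Python 3.X.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.bin')  # hypothetical scratch file

with open(path, 'wb') as f:        # 'w' creates or truncates; 'b' selects binary mode
    f.write(b'0123456789')

with open(path, 'ab') as f:        # 'a' appends to the end
    f.write(b'AB')

with open(path, 'r+b') as f:       # '+' allows reading and writing on the same file
    f.seek(2)                      # move to byte offset 2
    f.write(b'xx')                 # overwrite two bytes in place
    f.seek(0)                      # go back to the start before reading
    content = f.read()

with open(path, 'rb', buffering=0) as f:   # unbuffered, binary mode only
    first_three = f.read(3)

print(content, first_three)
# b'01xx456789AB' b'01x'
```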
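9.3.2 Using files
- The best way to read a file is to iterate over the file object
- Files are buffered (flush()), and binary files are seekable (seek())
- readline returns an empty string at the end of the file; an empty line is different, because it comes back as \n
- write() returns the number of characters written
- close is optional in CPython, because garbage collection closes the file when it reclaims the object; calling it explicitly is still recommended
- Python automatically encodes and decodes Unicode text for text files, and automatically translates line endings
- write does no formatting for you: it does not call str on its argument, so convert objects to strings yourself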
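
Several of these points can be checked directly. This is a minimal sketch using a hypothetical scratch file named notes.txt.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')  # hypothetical scratch file

with open(path, 'w', encoding='utf-8') as f:
    count = f.write('spam\n')      # returns the number of characters written
    f.write('\n')                  # an empty line
    f.write(str(42) + '\n')        # non-strings must be converted explicitly
    try:
        f.write(42)                # write() does not call str() for you
    except TypeError:
        rejected = True

with open(path, encoding='utf-8') as f:
    # Four reads for three lines: the last one hits the end of the file.
    lines = [f.readline() for _ in range(4)]

print(count, rejected, lines)
# 5 True ['spam\n', '\n', '42\n', '']
```

The empty line comes back as '\n', while the end of the file comes back as '', which is how the two are told apart.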
The eval function evaluates a string as a Python expression and returns the result:
eval("COMMAND")
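
A common use is turning text read from a file back into objects. The sketch below assumes a line of text that holds two literals separated by '$', a made-up format for illustration. Since eval runs whatever expression it is given, ast.literal_eval is shown alongside it as a stricter option that accepts only literals.

```python
import ast

line = "[1, 2, 3]${'a': 1, 'b': 2}"   # text as it might be read back from a file
parts = line.split('$')

objects = [eval(p) for p in parts]                   # evaluates each expression
safe_objects = [ast.literal_eval(p) for p in parts]  # accepts literals only

print(objects)
# [[1, 2, 3], {'a': 1, 'b': 2}]
```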
9.3.3 Pickle
Object serialization module
# Write to a file
>>> D = {'a': 1, 'b': 2}
>>> F = open('datafile.pkl', 'wb')
>>> import pickle
>>> pickle.dump(D, F) # Pickle any object to file
>>> F.close()
# Read from a file
>>> F = open('datafile.pkl', 'rb')
>>> E = pickle.load(F) # Load any object from file
>>> E
{'a': 1, 'b': 2}
The shelve module builds on pickle and stores serialized objects that are accessed by key.
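
As a rough sketch of access by key, the following stores two records in a shelf and reads them back. The shelf name people and the records are made up for illustration.

```python
import os
import shelve
import tempfile

# Hypothetical shelf location, created only for this demonstration.
base = os.path.join(tempfile.mkdtemp(), 'people')

with shelve.open(base) as db:      # keys must be strings; values are pickled
    db['bob'] = {'name': 'Bob', 'jobs': ['dev', 'mgr']}
    db['sue'] = {'name': 'Sue', 'jobs': ['mgr']}

with shelve.open(base) as db:      # reopen and fetch objects by key
    bob = db['bob']
    keys = sorted(db.keys())

print(bob, keys)
```

Unlike a single pickle file, a shelf lets you load one stored object without reading the others.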
9.3.4 json
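json supports fewer object types than pickle
# Strings
>>> import json
>>> rec = {'name': 'Bob', 'age': 40.5, 'jobs': ['dev', 'mgr']} # Sample record for illustration
>>> S = json.dumps(rec)
>>> O = json.loads(S)
# File reading and writing
>>> json.dump(rec, fp=open('testjson.txt', 'w'), indent=4)
>>> P = json.load(open('testjson.txt'))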
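
The difference in supported types shows up quickly with tuples and sets. This is a small sketch; the sample record is made up for illustration.

```python
import json
import pickle

rec = {'name': 'Bob', 'point': (1, 2)}   # sample record, made up for illustration

via_json = json.loads(json.dumps(rec))        # the tuple comes back as a list
via_pickle = pickle.loads(pickle.dumps(rec))  # the tuple stays a tuple

try:
    json.dumps({'tags': {1, 2}})         # sets are not JSON serializable
except TypeError:
    set_rejected = True

set_copy = pickle.loads(pickle.dumps({1, 2}))  # pickle handles sets

print(via_json, via_pickle, set_rejected, set_copy)
```

json trades type coverage for a portable text format, while pickle is specific to Python but preserves the original types.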
9.3.5 struct
Builds and parses packed binary data
>>> F = open('data.bin', 'wb') # Open binary output file
>>> import struct
>>> data = struct.pack('>i4sh', 7, b'spam', 8) # Make packed binary data
>>> data
b'\x00\x00\x00\x07spam\x00\x08'
>>> F.write(data) # Write byte string
>>> F.close()
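
Parsing is the reverse step. The sketch below packs the same values in memory and unpacks them again with the same format string, without touching a file.

```python
import struct

fmt = '>i4sh'    # big-endian: 4-byte int, 4-byte string, 2-byte short
data = struct.pack(fmt, 7, b'spam', 8)

size = struct.calcsize(fmt)          # number of bytes the format describes
values = struct.unpack(fmt, data)    # always returns a tuple

print(size, values)
# 10 (7, b'spam', 8)
```

When reading from a file opened in 'rb' mode, the bytes returned by read() can be passed to struct.unpack in the same way.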
9.3.6 Other file tools
- Standard streams, such as sys.stdout
- File descriptors in the os module
- Sockets, pipes, and FIFOs
- Shelves, for accessing serialized objects by key
- Shell command streams: subprocess.Popen, os.popen
9.4 Core Python types review
Object type | Category | Mutable? |
---|---|---|
Numbers (all) | Numeric | No |
Strings (all) | Sequence | No |
Lists | Sequence | Yes |
Dictionaries | Mapping | Yes |
Tuples | Sequence | No |
Files | Extension | N/A |
Sets | Set | Yes |
Frozenset | Set | No |
bytearray | Sequence | Yes |
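9.4.1 References versus copies
Watch out for shared references: to protect an object from being changed through another reference, copy it first. Slicing and the copy methods do not copy nested objects, so use copy.deepcopy when you need a fully independent copy.
- Slicing returns a copy. Note: only the top level is copied; it is not a deep copy
- Dictionaries, lists, and sets have their own copy method. Note: this is not a deep copy either
- Built-in functions can create new objects: list(L), dict(D), set(S)
- The copy module provides copy and deepcopy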
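
The four approaches can be compared side by side. This is a minimal sketch with made-up variable names.

```python
import copy

L = [1, [2, 3]]

alias = L                  # no copy: another name for the same list
by_slice = L[:]            # top-level copy
by_method = L.copy()       # top-level copy
by_constructor = list(L)   # top-level copy
deep = copy.deepcopy(L)    # copies the nested list too

L[0] = 'changed'           # visible only through L and alias
L[1].append(4)             # visible through every shallow copy

print(alias)           # ['changed', [2, 3, 4]]
print(by_slice)        # [1, [2, 3, 4]]
print(deep)            # [1, [2, 3]]
```

The shallow copies have their own top-level list but still share the inner list with L, so only the deep copy is unaffected by the append.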
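9.4.2 Comparisons and equality
Python's core objects are compared recursively, starting at the top level and continuing until a result is determined: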
>>> L1 = [1, ('a', 3)] # Same value, unique objects
>>> L2 = [1, ('a', 3)]
>>> L1 == L2, L1 is L2 # Equivalent? Same object?
(True, False)
# Short strings are cached
>>> S1 = 'spam'
>>> S2 = 'spam'
>>> S1 == S2, S1 is S2
(True, True)
# Longer strings may not be (an implementation detail)
>>> S1 = 'a longer string'
>>> S2 = 'a longer string'
>>> S1 == S2, S1 is S2
(True, False)
Nested comparisons:
>>> L1 = [1, ('a', 3)]
>>> L2 = [1, ('a', 2)]
>>> L1 < L2, L1 == L2, L1 > L2 # Less, equal, greater: tuple of results
(False, False, True)
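Comparison rules for Python core types
- Numbers are compared by magnitude after conversion to the highest common type
- Strings are compared character by character, by code point value (the ord function returns that number)
- Lists and tuples are compared item by item from left to right, recursing into nested structures
- Sets are equal if they contain the same elements
- Dictionaries are equal if their sorted (key, value) lists are the same
- Nonnumeric objects of different types do not support magnitude comparison in 3.X: 11 > "11" raises an error; 11 == "11" is False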
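
Each rule above can be checked with a one-line expression. The sketch below collects the results in a dictionary whose keys are made-up labels; it assumes Python 3.X.

```python
results = {
    'mixed numbers': 1 == 1.0,                  # converted to a common type first
    'strings': 'abc' < 'abd',                   # decided by ord('c') < ord('d')
    'case': 'Z' < 'a',                          # ord('Z') is 90, ord('a') is 97
    'nested': [1, ('a', 3)] > [1, ('a', 2)],    # decided by 3 > 2 inside the tuple
    'sets': {1, 2} == {2, 1},                   # same elements, order irrelevant
    'dicts': {'a': 1, 'b': 2} == {'b': 2, 'a': 1},
    'mixed equality': 11 == '11',               # allowed, and False
}

try:
    11 > '11'                                   # magnitude test across types
except TypeError:
    results['mixed magnitude'] = 'TypeError'

for label, value in results.items():
    print(label, value)
```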
9.5 Python's bool type
Every object in Python is inherently either true or false:
- Numbers are false if zero, and true otherwise.
- Other objects are false if empty, and true otherwise.
Object | Value |
---|---|
"spam" | True |
"" | False |
" " | True |
[1, 2] | True |
[] | False |
{'a': 1} | True |
{} | False |
1 | True |
0.0 | False |
None | False |
None does not mean nothing or undefined: None is a real object that occupies real memory, and it has a special built-in name in Python.
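
The table and the note about None can be verified with bool() and an identity test. This is a minimal sketch; the helper function no_return is hypothetical.

```python
values = ['spam', '', ' ', [1, 2], [], {'a': 1}, {}, 1, 0.0, None]
truth = [bool(v) for v in values]   # same order as the table rows

# None is a real object, so a list can hold it as a placeholder.
placeholders = [None] * 3

def no_return():
    """Hypothetical helper: a function without a return statement."""
    pass

returns_none = no_return() is None   # identity test against the single None object

print(truth)
print(len(placeholders), returns_none)
# 3 True
```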
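9.6 Type objects
Types are themselves objects of type type. dict, list, str, tuple, int, float, complex, bytes, type, set, and file (2.X only) are really constructors rather than type conversions, although they can be thought of that way.
The types module has more tools related to types.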
type([1]) == type([]) # Compare to type of another list
type([1]) == list # Compare to list type name
isinstance([1], list) # Test if list or customization thereof
import types # types has names for other types
def f(): pass
type(f) == types.FunctionType
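
The difference between the type() comparison and isinstance only appears with subclasses. The sketch below uses a hypothetical subclass named MyList to show it.

```python
class MyList(list):
    """Hypothetical list subclass, for illustration only."""
    pass

x = MyList([1])

exact_match = type(x) == list         # False: the type is MyList, not list
compatible = isinstance(x, list)      # True: MyList is a customization of list
type_of_type = type(list) is type     # True: a type is itself an object of type type
built = list('ab')                    # calling the type name constructs an object

print(exact_match, compatible, type_of_type, built)
# False True True ['a', 'b']
```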
9.7 Other gotchas
- Repetition Adds One Level Deep
>>> L = [4, 5, 6]
>>> X = L * 4 # Like [4, 5, 6] + [4, 5, 6] + ...
>>> Y = [L] * 4 # [L] + [L] + ... = [L, L,...]
>>> X
[4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6]
>>> Y
[[4, 5, 6], [4, 5, 6], [4, 5, 6], [4, 5, 6]]
# Every item in Y is a reference to L, so they are all the same object
>>> L[1] = 0 # Impacts Y but not X
>>> X
[4, 5, 6, 4, 5, 6, 4, 5, 6, 4, 5, 6]
>>> Y
[[4, 0, 6], [4, 0, 6], [4, 0, 6], [4, 0, 6]]
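
One way to avoid the shared references is to copy L for each slot. This is a sketch of that idea under the same setup, not the only option.

```python
L = [4, 5, 6]

shared = [L] * 4                             # four references to the same list
independent = [list(L) for _ in range(4)]    # four separate top-level copies

L[1] = 0                                     # change the original in place

print(shared[0], independent[0])
# [4, 0, 6] [4, 5, 6]
```

list(L) makes a top-level copy only, so if L held nested mutable objects those would still be shared.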
- Beware of Cyclic Data Structures: avoid objects that reference themselves
Python prints a cyclic reference as [...]
>>> L = ['grail'] # Append reference to same object
>>> L.append(L) # Generates cycle in object: [...]
>>> L
['grail', [...]]