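File modes for reading and writing files in Python
I keep running into the various modes Python uses when reading and writing files and can never remember them, so here is a summary.
Reference: [1] https://docs.python.org/2/tutorial/inputoutput.html#reading-and-writing-files
In Python, files are opened with the open() function.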
open() returns a file object, and is most commonly used with 2 arguments: open(filename, mode)
>>> f = open('workfile', 'w')
>>> print f
<open file 'workfile', mode 'w' at 80a0960>
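The first argument is a string containing the filename.
The second argument is also a string; it describes the way in which the file will be used.
The commonly used modes are: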
- 'r': the file will only be read.
- 'w': the file is opened only for writing (an existing file with the same name will be erased).
- 'a': opens the file for appending; any data written to the file is automatically added to the end.
- 'r+': opens the file for both reading and writing.
The mode argument is optional; 'r' will be assumed if the mode argument is omitted.
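The four modes listed above can be tried out with a short script. This is only a minimal sketch: the scratch file location and the variable names are made up for illustration, and the code is written in Python 3 syntax rather than the Python 2 syntax of the interpreter session shown earlier.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary directory (not from the original text).
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "workfile")

# 'w': write-only; an existing file with the same name is erased.
with open(path, "w") as f:
    f.write("first line\n")

# 'a': append; whatever is written goes to the end of the file.
with open(path, "a") as f:
    f.write("second line\n")

# Mode omitted: 'r' is assumed, so the file is opened read-only.
with open(path) as f:
    after_append = f.read()

# 'r+': read and write through the same file object, without truncating.
with open(path, "r+") as f:
    first = f.readline()   # read the first line
    f.seek(0)              # go back to the start before writing
    f.write("FIRST")       # overwrite the first five characters in place
    f.seek(0)
    after_update = f.read()

# 'w' again: the previous content is discarded.
with open(path, "w") as f:
    f.write("replaced\n")
with open(path) as f:
    after_rewrite = f.read()

print(repr(after_append))
print(repr(after_update))
print(repr(after_rewrite))
```

Note that 'r+' overwrites in place and keeps the rest of the file, while 'w' starts from an empty file every time.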
Python on Windows makes a distinction between text and binary files: 'b' appended to the mode opens the file in binary mode, so there are also modes like 'rb', 'wb', and 'r+b'. The end-of-line characters in text files are automatically altered slightly when data is read and written. This behind-the-scenes modification to the file data is fine for ASCII text files, but it'll corrupt binary data like that in JPEG or EXE files.
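So when opening a binary file with Python on Windows, always use a mode that includes 'b'.
Python on Linux does not distinguish between binary files and text files (this is the Python 2 behavior described in [1]), so a 'b' mode is not required there to open a binary file. It is still recommended when writing Python code on Linux: the same code will then open binary files correctly on Windows, which keeps it platform-independent.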
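A minimal sketch of the recommendation, using 'wb' and 'rb' so that the bytes survive a round trip unchanged on any platform. The payload and the file name are hypothetical, chosen only because the bytes include "\r\n" and "\n", which text mode on Windows would alter. The sketch assumes Python 3, where, unlike the Python 2 behavior described here, binary mode also returns bytes objects instead of str on every platform, so the 'b' matters on Linux as well.

```python
import os
import tempfile

# Hypothetical payload: arbitrary bytes that include "\r\n" and a bare "\n".
payload = b"\x89PNG\r\n\x1a\n\x00\x01\x02"

# Hypothetical scratch file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "blob.bin")

# 'wb': the bytes are written exactly as given, with no newline translation.
with open(path, "wb") as f:
    f.write(payload)

# 'rb': the bytes are read back exactly as stored.
with open(path, "rb") as f:
    restored = f.read()

# The size on disk matches the payload length because nothing was translated.
size_on_disk = os.path.getsize(path)

print(restored == payload)
print(size_on_disk == len(payload))
```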
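Summary
Files are divided into binary files and text files.
Binary files: for example, exe and jpeg files.
Text files: for example, txt files.
Python on Windows treats binary files and text files differently. When opening a binary file, always use a 'b' mode; otherwise the data may be corrupted.
On Windows:
For text files, use the plain modes.
For binary files, use the 'b' modes.
Python on Linux does not distinguish between binary files and text files, so a 'b' mode is not required to open a binary file there. Using one is still recommended, because it keeps the code platform-independent.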
Appendix
DOS vs. Unix Line Endings
Text files created on DOS/Windows and text files created on Unix/Linux have different line endings.
DOS uses a carriage return followed by a line feed, that is "\r\n", as a line ending.
Unix uses just a line feed, "\n".
The Mac platform is not discussed here.
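The difference between the two conventions is easiest to see at the byte level. The following is a small sketch, not a complete conversion tool; the helper names to_unix and to_dos are hypothetical and exist only for this illustration.

```python
# The same two lines of text with DOS and with Unix line endings.
dos_text = b"line one\r\nline two\r\n"
unix_text = b"line one\nline two\n"


def to_unix(data):
    """Replace every DOS line ending (CR LF) with a Unix one (LF)."""
    return data.replace(b"\r\n", b"\n")


def to_dos(data):
    """Replace every Unix line ending (LF) with a DOS one (CR LF).

    The data is normalised to LF first so that existing CR LF pairs
    are not turned into CR CR LF.
    """
    return to_unix(data).replace(b"\n", b"\r\n")


# Each DOS line ending is one byte longer than its Unix counterpart.
extra_bytes = len(dos_text) - len(unix_text)

print(to_unix(dos_text) == unix_text)
print(to_dos(unix_text) == dos_text)
print(extra_bytes)
```

This kind of substitution is what text mode performs behind the scenes on Windows, which is why it is harmless for text but damaging for binary data.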
The difference between binary files and text files
Reference: http://www.cburch.com/csbsju/cs/160/notes/34/0.html
text file
A text file is a file that is properly understood as a sequence of character data (represented using ASCII, Unicode, or some other standard), separated into lines.
Typically, when a text file is displayed as a sequence of characters, it is easily human-readable.
binary file
A binary file is anything else. A binary file will include some data that is not written using a character-encoding standard - typically, some number would be represented using binary within the file, instead of using the character representation of its various digits (in some base).
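To make the last point concrete, here is a small sketch that stores the same number both ways. The number itself is arbitrary, and the choice of a little-endian signed 32-bit layout ("<i") is an assumption made for the example; real binary formats define their own layouts.

```python
import struct

number = 1234567  # arbitrary example value

# Text representation: one ASCII character per decimal digit.
as_text = str(number).encode("ascii")

# Binary representation: a little-endian signed 32-bit integer (assumed layout).
as_binary = struct.pack("<i", number)

# Both representations decode back to the same number.
decoded_from_text = int(as_text.decode("ascii"))
decoded_from_binary = struct.unpack("<i", as_binary)[0]

print(len(as_text), len(as_binary))
print(decoded_from_text == decoded_from_binary == number)
```

The text form is readable in any editor, while the binary form always occupies four bytes, and those bytes do not in general correspond to printable characters.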