Translated from: Reading entire file in Python
If you read an entire file with content = open('Path/to/file', 'r').read(), is the file handle left open until the script exits? Is there a more concise method to read a whole file?
#1
Reference: https://stackoom.com/question/V5ca/用Python读取整个文件
#2
You can use pathlib.
For Python 3.5 and above:
from pathlib import Path
contents = Path(file_path).read_text()
For lower versions of Python, use pathlib2:
$ pip install pathlib2
Then:
from pathlib2 import Path
contents = Path(file_path).read_text()
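As a small illustration of the pathlib approach, the sketch below writes a throwaway file and reads it back in a single call. The temporary directory, the file name example.txt, and the explicit encoding="utf-8" argument are assumptions made for this example and are not part of the original answer.

```python
import tempfile
from pathlib import Path

# Hypothetical throwaway file, created only so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    file_path = Path(tmp_dir) / "example.txt"
    file_path.write_text("hello from pathlib", encoding="utf-8")

    # read_text opens the file, reads everything, and closes it again.
    contents = file_path.read_text(encoding="utf-8")

    # read_bytes does the same without decoding the data.
    raw = file_path.read_bytes()

print(contents)
```

Passing the encoding explicitly is a design choice here: without it, the platform default is used, which can differ between machines.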
This is the actual read_text implementation:
def read_text(self, encoding=None, errors=None):
    """
    Open the file in text mode, read it, and close the file.
    """
    with self.open(mode='r', encoding=encoding, errors=errors) as f:
        return f.read()
#3
The answer to that question depends somewhat on the particular Python implementation.
To understand what this is all about, pay particular attention to the actual file object. In your code, that object is mentioned only once, in an expression, and becomes inaccessible immediately after the read() call returns.
This means that the file object is garbage. The only remaining question is "When will the garbage collector collect the file object?".
In CPython, which uses a reference counter, this kind of garbage is noticed immediately, and so it will be collected immediately. This is not generally true of other Python implementations.
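One way to observe this behaviour is to track the file object with a weak reference, which does not keep it alive. The following is only a sketch: the scratch file and the variable names are hypothetical, and the "collected immediately" result is what one would expect on CPython, not something other implementations promise.

```python
import gc
import os
import tempfile
import weakref

# Hypothetical scratch file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as scratch:
    scratch.write("some text")

file_obj = open(path, "r")
content = file_obj.read()
tracker = weakref.ref(file_obj)  # a weak reference does not keep the object alive

del file_obj  # drop the only strong reference; the file object is now garbage

# Expected to be True on CPython (reference counting); may be False elsewhere.
collected_immediately = tracker() is None

gc.collect()  # explicitly request a collection for implementations that defer it
collected_after_gc = tracker() is None

print(collected_immediately, collected_after_gc)
os.remove(path)
```

The file is deliberately left unclosed here to mimic open(...).read(); in real code the explicit patterns shown elsewhere in this thread are preferable.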
A better solution, to make sure that the file is closed, is this pattern:
with open('Path/to/file', 'r') as content_file:
    content = content_file.read()
which will always close the file immediately after the block ends, even if an exception occurs.
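The claim about exceptions can be checked with a short experiment. This is a minimal sketch, assuming a hypothetical scratch file and a deliberately raised ValueError; it only inspects the file object's closed attribute after the block has been left.

```python
import os
import tempfile

# Hypothetical scratch file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as scratch:
    scratch.write("some text")

content_file = None
try:
    with open(path, "r") as content_file:
        content = content_file.read()
        raise ValueError("simulated failure inside the block")
except ValueError:
    pass

# The with statement closed the file even though the block raised.
closed_after_error = content_file.closed
print(closed_after_error)

os.remove(path)
```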
Edit: To put a finer point on it:
Other than file.__exit__(), which is "automatically" called in a with context manager setting, the only other way that file.close() is automatically called (that is, other than explicitly calling it yourself) is via file.__del__(). This leads us to the question of when __del__() gets called.
A correctly-written program cannot assume that finalizers will ever run at any point prior to program termination.
-- https://devblogs.microsoft.com/oldnewthing/20100809-00/?p=13203
In particular:
Objects are never explicitly destroyed; however, when they become unreachable they may be garbage-collected. An implementation is allowed to postpone garbage collection or omit it altogether — it is a matter of implementation quality how garbage collection is implemented, as long as no objects are collected that are still reachable.
[...]
CPython currently uses a reference-counting scheme with (optional) delayed detection of cyclically linked garbage, which collects most objects as soon as they become unreachable, but is not guaranteed to collect garbage containing circular references.
-- https://docs.python.org/3.5/reference/datamodel.html#objects-values-and-types
(Emphasis mine)
But as it suggests, other implementations may behave differently. As an example, PyPy has 6 different garbage collection implementations!
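To make the difference between the two automatic paths concrete, here is a minimal sketch using a hypothetical TrackedResource class as a stand-in for a file object. It is not how the built-in file type is implemented; it only records which path performed the close.

```python
class TrackedResource:
    """Hypothetical stand-in for a file object; not a real library class."""

    def __init__(self, log):
        self.log = log
        self.closed = False

    def close(self, reason):
        # Record only the first close so the log shows which path got there first.
        if not self.closed:
            self.closed = True
            self.log.append(reason)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close("closed by __exit__")
        return False  # do not suppress exceptions raised in the block

    def __del__(self):
        # Fallback only: when (or whether) this runs is up to the implementation.
        self.close("closed by __del__")


log = []
with TrackedResource(log) as resource:
    pass  # real work with the resource would go here

print(log)
```

With the with statement, the close happens at a known point (the end of the block), so the finalizer has nothing left to do whenever it eventually runs.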
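This article discusses how to read an entire file cleanly in Python, including using pathlib and context managers to make sure the file is closed. In CPython, a file object is reclaimed as soon as it is no longer referenced, but other Python implementations may behave differently. The `with` statement is recommended to ensure the file is always closed after use.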