First, look at the definition of FileProxyMixin:
class FileProxyMixin(object):
    encoding = property(lambda self: self.file.encoding)
    fileno = property(lambda self: self.file.fileno)
    flush = property(lambda self: self.file.flush)
    isatty = property(lambda self: self.file.isatty)
    newlines = property(lambda self: self.file.newlines)
    read = property(lambda self: self.file.read)
    readinto = property(lambda self: self.file.readinto)
    readline = property(lambda self: self.file.readline)
    readlines = property(lambda self: self.file.readlines)
    seek = property(lambda self: self.file.seek)
    softspace = property(lambda self: self.file.softspace)
    tell = property(lambda self: self.file.tell)
    truncate = property(lambda self: self.file.truncate)
    write = property(lambda self: self.file.write)
    writelines = property(lambda self: self.file.writelines)
    xreadlines = property(lambda self: self.file.xreadlines)

    def __iter__(self):
        return iter(self.file)
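As you can see, it combines property (called directly rather than used as a decorator) with lambda functions to proxy most operations to self.file. Each attribute lookup returns the corresponding attribute or bound method of the underlying file. It also implements __iter__, so the object supports iteration.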
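The proxy pattern is easier to see in a reduced form. The following is a minimal sketch, not Django code: the class name ReadProxy is hypothetical, and it forwards only read and tell to an in-memory io.StringIO.

```python
import io


class ReadProxy(object):  # hypothetical name, for illustration only
    # Each property returns the bound method of the wrapped file,
    # so wrapped.read(2) ends up calling wrapped.file.read(2).
    read = property(lambda self: self.file.read)
    tell = property(lambda self: self.file.tell)

    def __init__(self, file):
        self.file = file

    def __iter__(self):
        # Delegate iteration to the wrapped file (line by line).
        return iter(self.file)


wrapped = ReadProxy(io.StringIO("a\nb\n"))
first = wrapped.read(2)      # forwarded to StringIO.read
position = wrapped.tell()    # forwarded to StringIO.tell
rest = list(wrapped)         # remaining lines via __iter__
```

Because the lambda is evaluated on every access, the proxy always follows whatever object self.file currently points to.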
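Next, look at the definition of File. Its main job is to provide one abstraction over different kinds of file objects. It adds a chunks method for reading large files piece by piece, a closed property that reports whether the file is closed, and a size property.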
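The chunks method is not shown in this walkthrough, so here is a rough sketch of the idea as a standalone generator. It is an assumption-laden simplification rather than Django's implementation: the function takes the file as an argument, and the 64 KiB default is chosen for the example.

```python
import io


def chunks(file, chunk_size=64 * 1024):
    """Yield the file's content in pieces of at most chunk_size."""
    try:
        file.seek(0)  # start from the beginning when the file is seekable
    except (AttributeError, io.UnsupportedOperation):
        pass  # non-seekable stream: read from the current position
    while True:
        data = file.read(chunk_size)
        if not data:
            break  # an empty read means end of file
        yield data


pieces = list(chunks(io.BytesIO(b"abcdefgh"), chunk_size=3))
```

Reading in bounded pieces keeps memory use flat regardless of the file's size, which is the point of the method for large uploads.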
Start with the function that implements the size property:
def _get_size_from_underlying_file(self):
    if hasattr(self.file, 'size'):
        return self.file.size
    if hasattr(self.file, 'name'):
        try:
            return os.path.getsize(self.file.name)
        except (OSError, TypeError):
            pass
    if hasattr(self.file, 'tell') and hasattr(self.file, 'seek'):
        pos = self.file.tell()
        self.file.seek(0, os.SEEK_END)
        size = self.file.tell()
        self.file.seek(pos)
        return size
    raise AttributeError("Unable to determine the file's size.")
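It first tries the size attribute of self.file, then tries to get the size of the local file on disk through self.file.name, and finally falls back to measuring the size with seek and tell. If none of these works, it raises AttributeError. The core of the fallback is to seek to the end and read the offset: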
self.file.seek(0, os.SEEK_END)
size = self.file.tell()
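To try the seek/tell fallback in isolation, here is a small sketch on an in-memory stream. The helper name size_by_seeking is hypothetical. It mirrors the steps above, including restoring the original position.

```python
import io
import os


def size_by_seeking(f):
    """Measure a seekable file's size without disturbing its position."""
    pos = f.tell()             # remember where the caller was
    f.seek(0, os.SEEK_END)     # jump to the end of the file
    size = f.tell()            # the end offset equals the size
    f.seek(pos)                # restore the caller's position
    return size


stream = io.BytesIO(b"hello world")
stream.read(5)                       # move the position away from 0
size = size_by_seeking(stream)
position_after = stream.tell()
```

Saving and restoring pos matters because the caller may be in the middle of reading the file when it asks for the size.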
Next, see how File supports iteration:
def __iter__(self):
    # Iterate over this file-like object by newlines
    buffer_ = None
    for chunk in self.chunks():
        for line in chunk.splitlines(True):
            if buffer_:
                if endswith_cr(buffer_) and not equals_lf(line):
                    # Line split after a \r newline; yield buffer_.
                    yield buffer_
                    # Continue with line.
                else:
                    # Line either split without a newline (line
                    # continues after buffer_) or with \r\n
                    # newline (line == b'\n').
                    line = buffer_ + line
                # buffer_ handled, clear it.
                buffer_ = None
            # If this is the end of a \n or \r\n line, yield.
            if endswith_lf(line):
                yield line
            else:
                buffer_ = line
    if buffer_ is not None:
        yield buffer_
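The iteration logic centers on how buffer_ is used.
It first calls chunks to read the file block by block, then splits each chunk into lines with splitlines(True), which keeps the line endings. Any piece that does not end in \n is held in buffer_ and handled together with the first piece of the next chunk: if the buffer ends with \r and the next piece is not a lone \n, the buffer is a complete line and is yielded on its own. Otherwise the two pieces are joined. This is how a line that straddles a chunk boundary, including a \r\n pair split across two chunks, is reassembled.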
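The boundary cases are easiest to check by running the same logic on hand-made chunks. The following is a sketch under stated assumptions: iter_lines is a hypothetical standalone function, it handles bytes only, and the endswith_cr, endswith_lf and equals_lf helpers are replaced by inline bytes comparisons.

```python
def iter_lines(chunks):
    """Yield complete lines from an iterable of bytes chunks."""
    buffer_ = None
    for chunk in chunks:
        for line in chunk.splitlines(True):
            if buffer_:
                if buffer_.endswith(b"\r") and line != b"\n":
                    # The buffer was a complete \r-terminated line.
                    yield buffer_
                else:
                    # Either the line continues, or this is the \n
                    # half of a \r\n pair split across chunks.
                    line = buffer_ + line
                buffer_ = None
            if line.endswith(b"\n"):
                yield line
            else:
                buffer_ = line  # wait for the next piece
    if buffer_ is not None:
        yield buffer_  # trailing data without a final newline


# "ab\r\n" is split inside the \r\n pair, and "efgh\r" is split mid-line.
lines = list(iter_lines([b"ab\r", b"\ncd\nef", b"gh\r", b"ij"]))
# lines == [b"ab\r\n", b"cd\n", b"efgh\r", b"ij"]
```

Note that a piece ending in \r cannot be yielded immediately, because only the next piece reveals whether it was a bare \r line ending or the first half of \r\n.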