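I have always used f.readlines() to process files. Today I read a post claiming that f.read() is more efficient than f.readlines(), and that f.read() is also "smarter". I still have no idea what "smarter" is supposed to mean, but the efficiency claim seemed questionable, so I wrote a small program to test it. The post turned out to be misleading. So when a claim is in doubt, it is better to verify it yourself.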
The test file is 5 MB, with nearly 190,000 lines.
import time
import codecs
import os
def read_95K():
    # Load every line into a list with f.readlines(), then parse each one.
    path = r"E:\SVN\chocolate_ime\doc"
    filename = os.path.join(path, "Cizu_komoxo95K.txt")
    with codecs.open(filename, encoding="gbk") as f:
        for line in f.readlines():
            if line.startswith(";"):
                # Skip comment lines.
                pass
            else:
                # Fields are tab-separated: word, pinyin, frequency.
                splited_line = line.split("\t")
                word = splited_line[0]
                pinyin = splited_line[1]
                freq = splited_line[2]
def read_95K_seldom():
    # Iterate over the file object directly, one line at a time.
    path = r"E:\SVN\chocolate_ime\doc"
    filename = os.path.join(path, "Cizu_komoxo95K.txt")
    with codecs.open(filename, encoding="gbk") as f:
        for line in f:
            if line.startswith(";"):
                # Skip comment lines.
                pass
            else:
                # Fields are tab-separated: word, pinyin, frequency.
                splited_line = line.split("\t")
                word = splited_line[0]
                pinyin = splited_line[1]
                freq = splited_line[2]
def time_check():
    start_time = time.time()
    # read_95K()  # 0.203999996185
    read_95K_seldom()  # 0.425999879837
    end_time = time.time()
    # Elapsed wall-clock time in seconds.
    print(end_time - start_time)

time_check()
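time_check() times a single call, and one run can be noisy. One way to get an average is to repeat the call several times and take the mean. This is only a minimal sketch, assuming Python 3. The helper name average_time and the placeholder sample_workload are made up for illustration and are not part of the script above.

```python
import time


def average_time(func, repeats=5):
    """Call func() `repeats` times and return the mean wall-clock time in seconds."""
    total = 0.0
    for _ in range(repeats):
        start = time.perf_counter()  # high-resolution timer (Python 3.3+)
        func()
        total += time.perf_counter() - start
    return total / repeats


def sample_workload():
    # Hypothetical stand-in; pass read_95K or read_95K_seldom here instead.
    return sum(range(10000))


if __name__ == "__main__":
    print(average_time(sample_workload))
```

Averaging several runs also smooths out the effect of the operating system's file cache, which tends to make the first read of a file slower than the later ones.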
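f.read() average time (measured with read_95K_seldom, which iterates over the file object directly): 0.425999879837 seconds
f.readlines() average time (measured with read_95K): 0.203999996185 seconds
Comparing the two, f.readlines() came out about twice as fast as f.read() on this file.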
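To put a literal f.read() call next to the other two approaches, something like the following could be used. It is only a sketch, assuming Python 3 and the built-in open() rather than codecs.open(). The function names and the small generated sample file are made up for illustration, since the original Cizu_komoxo95K.txt is not included here. Timings from it would not be directly comparable with the numbers above.

```python
import os
import tempfile


def parse(lines):
    """Return (word, pinyin, freq) tuples, skipping comment lines that start with ';'."""
    records = []
    for line in lines:
        if line.startswith(";"):
            continue
        fields = line.rstrip("\r\n").split("\t")
        records.append((fields[0], fields[1], fields[2]))
    return records


def via_read(filename):
    # Read the whole file as one string, then split it into lines.
    with open(filename, encoding="gbk") as f:
        return parse(f.read().splitlines())


def via_readlines(filename):
    # Read the whole file as a list of lines.
    with open(filename, encoding="gbk") as f:
        return parse(f.readlines())


def via_iteration(filename):
    # Iterate over the file object one line at a time.
    with open(filename, encoding="gbk") as f:
        return parse(f)


def make_sample_file(line_count=1000):
    """Write a small tab-separated file (hypothetical data) and return its path."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding="gbk") as f:
        f.write("; comment line\n")
        for i in range(line_count):
            f.write("word%d\tpinyin%d\t%d\n" % (i, i, i))
    return path


if __name__ == "__main__":
    sample = make_sample_file()
    try:
        # All three approaches should produce the same records.
        assert via_read(sample) == via_readlines(sample) == via_iteration(sample)
        print(len(via_read(sample)))
    finally:
        os.remove(sample)
```

f.read() and f.readlines() both load the whole file into memory, while iterating over the file object keeps only one line at a time, so the fastest option on a 5 MB file is not necessarily the best choice for a much larger one.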