When a data file is too large to fit into the computer's limited memory, it needs to be read in chunks:
import pandas as pd

# Passing iterator=True makes read_csv return a reader object
# instead of loading the whole file at once
reader = pd.read_csv('E:/学习相关/Python/数据样例/用户侧数据/test数据.csv',
                     sep=',', iterator=True)

loop = True
chunkSize = 100000  # number of rows to read per chunk
chunks = []
while loop:
    try:
        chunk = reader.get_chunk(chunkSize)
        chunks.append(chunk)
    except StopIteration:
        # get_chunk() raises StopIteration once the file is exhausted
        loop = False
        print("Iteration is stopped.")

# Reassemble the chunks into a single DataFrame
df = pd.concat(chunks, ignore_index=True)
print(df)
When the iterator parameter of read_csv() is set to True, it returns a TextFileReader object so that the file can be read chunk by chunk.
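As a sketch of an alternative to the manual get_chunk()/StopIteration loop above, read_csv() also accepts a chunksize parameter, which returns a TextFileReader that can be iterated directly with a for loop or comprehension. The example below uses a small in-memory CSV (an io.StringIO buffer, assumed here for self-containment in place of the file path above):

```python
import io
import pandas as pd

# Build a small in-memory CSV so the example is self-contained:
# a header row plus 10 data rows.
csv_text = "a,b\n" + "\n".join(f"{i},{i * 2}" for i in range(10))

# chunksize=4 makes read_csv return a TextFileReader that yields
# DataFrames of at most 4 rows each; iterating it replaces the
# manual get_chunk()/StopIteration loop.
chunks = [chunk for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4)]
print([len(c) for c in chunks])  # chunk row counts: [4, 4, 2]

# Reassemble into one DataFrame, same as before
df = pd.concat(chunks, ignore_index=True)
print(len(df))  # 10 rows total
```

Iterating the reader lets each chunk be processed (filtered, aggregated) before concatenation, which keeps peak memory usage bounded by the chunk size rather than the full file.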