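I recently needed to read data from a batch of very large XML files. Opening them in Excel is convenient but extremely slow, so I used Python with the pandas, xml, and multiprocessing packages to read the data quickly.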
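Before the full script, here is a minimal sketch of the parallel parsing idea on its own. It is not the script itself: the `<record>` layout, the `record_tag` parameter, and the helper names `parse_xml_rows` and `parse_many` are assumptions made for illustration, and real files will likely need a different traversal.

```python
import xml.dom.minidom
from multiprocessing import Pool


def parse_xml_rows(path, record_tag="record"):
    """Return one dict per <record> element, mapping child tag names to text.

    The <record> layout is a hypothetical example, not the real file format.
    """
    dom = xml.dom.minidom.parse(path)
    rows = []
    for record in dom.getElementsByTagName(record_tag):
        row = {}
        for child in record.childNodes:
            # Skip whitespace text nodes that sit between field elements.
            if child.nodeType != child.ELEMENT_NODE:
                continue
            # Join the text nodes directly under this field element.
            row[child.tagName] = "".join(
                node.data
                for node in child.childNodes
                if node.nodeType == node.TEXT_NODE
            ).strip()
        rows.append(row)
    return rows


def parse_many(paths, workers=4):
    """Parse several XML files in parallel; results keep the input order."""
    with Pool(processes=workers) as pool:
        return pool.map(parse_xml_rows, paths)
```

Each list of rows can then be turned into a table with `pd.DataFrame(rows)` and written out with `DataFrame.to_excel`, which needs an Excel engine such as openpyxl for .xlsx output. On platforms that start workers by spawning a fresh interpreter, `parse_many` should be called from under an `if __name__ == "__main__":` guard.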
# xml2xlsx
import xml.dom.minidom
import pandas as pd
import os
from multiprocessing import Pool
# Read an XML file and convert it to XLSX
# Files are processed in parallel with multiprocessing
def xml2excel(filename):
    print(filename + '...')
    save_path = '/home/pc/xadf'
    xml_report = os.path.join(output_path, filename)
    outputname