Error when reading an .xls file: XLRDError: Unsupported format, or corrupt file: Expected BOF record; found b'MIME-Ver

Latest recommended article published 2023-10-20 09:04:59

啾啾啾666

Categories: python, machine learning. Tags: python

Article link: https://blog.youkuaiyun.com/weixin_42052249/article/details/124524628

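Solutions:

1. Change the .xls file's extension to .csv, then read it as a CSV file.

With this approach the program no longer raises the error, but the file format changes, which can get in the way of later processing.

2. Re-save the .xls file as an .xlsx file

If there are only a few files, open each one in turn in Excel, use Save As with the type Microsoft Excel 97-2003 (*.xls), and read the re-saved .xls file instead. Saving as .xlsx works as well, which is what the batch script does.

If there are many files, convert them to .xlsx in bulk. A batch conversion script: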

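# encoding: utf-8
import os
import win32com.client as win32  # needs pywin32 and a local Microsoft Excel installation


def transform(parent_path, out_path):
    """Re-save every .xls file in parent_path as .xlsx in out_path."""
    # Excel's COM interface needs absolute paths; abspath also normalizes mixed slashes
    parent_path = os.path.abspath(parent_path)
    out_path = os.path.abspath(out_path)
    excel = win32.gencache.EnsureDispatch('Excel.Application')  # start Excel once
    try:
        for file_name in os.listdir(parent_path):  # every entry in the folder
            stem, ext = os.path.splitext(file_name)  # split name and extension
            if ext.lower() != '.xls':
                continue
            src = os.path.join(parent_path, file_name)  # workbook to convert
            dst = os.path.join(out_path, stem + '.xlsx')  # converted workbook
            print(src, '->', dst)
            workbook = excel.Workbooks.Open(src)  # open the workbook to convert
            workbook.SaveAs(dst, FileFormat=51)  # 51 = xlOpenXMLWorkbook (.xlsx)
            workbook.Close()
    finally:
        excel.Application.Quit()  # quit Excel even if a conversion fails


if __name__ == '__main__':
    # Note: use absolute paths here. Relative paths fail with:
    # pywintypes.com_error: (-2147352567, '发生意外。', (0, 'Microsoft Excel', '抱歉，无法找到 *.xlsx。是否可能被移动、重命名或删除?', 'xlmain11.chm', 0, -2146827284), None)
    # (Excel reports that it cannot find the file and asks whether it was moved, renamed or deleted.)
    path1 = r"E:\practice\jupyternotebook\DaDong\dataset/2021年1月大东厂进水口"  # folder with the files to convert
    path2 = r"E:\practice\jupyternotebook\DaDong\dataset/111"  # folder for the converted files
    transform(path1, path2)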

After the conversion, read the .xlsx files as usual.
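The error in the title says the reader found b'MIME-Ver where it expected a BOF record, which suggests the file is a text export (such as a web archive) carrying an .xls extension rather than a binary Excel workbook. The sketch below is one way to check this before and after converting, by looking at a file's first bytes. The function name `sniff_format` and its return labels are hypothetical and not part of the original post; the signatures are the standard OLE2 and ZIP magic numbers.

```python
# Leading bytes of the formats most often found behind an .xls extension.
OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # binary Excel 97-2003 container
ZIP_MAGIC = b"PK\x03\x04"                          # .xlsx files are ZIP packages


def sniff_format(path):
    """Guess a file's real format from its first bytes (hypothetical helper)."""
    with open(path, "rb") as f:
        head = f.read(512)
    if head.startswith(OLE2_MAGIC):
        return "xls"
    if head.startswith(ZIP_MAGIC):
        return "xlsx"
    # Skip a UTF-8 byte-order mark and leading whitespace before text checks.
    text = head.lstrip(b"\xef\xbb\xbf \t\r\n").lower()
    if text.startswith(b"mime-version"):
        return "mhtml"   # the case reported in the error message
    if text.startswith((b"<html", b"<!doctype", b"<?xml", b"<table")):
        return "markup"  # HTML or XML saved with an .xls extension
    return "unknown"
```

Under these assumptions, a file reported as "mhtml" or "markup" is one that needs re-saving through Excel, and a successfully converted file should be reported as "xlsx".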