I want to fetch an xls file with requests and then convert it to a DataFrame with pandas.read_excel:
>>> result = requests.get(url)
>>> df = pd.read_excel(result.content)
The code above runs fine on my local machine, but once deployed to a server via Docker it fails with this error:
  File "/usr/lib/python3/dist-packages/pandas/util/_decorators.py", line 208, in wrapper
    return func(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/pandas/io/excel/_base.py", line 310, in read_excel
    io = ExcelFile(io, engine=engine)
  File "/usr/lib/python3/dist-packages/pandas/io/excel/_base.py", line 819, in __init__
    self._reader = self._engines[engine](self._io)
  File "/usr/lib/python3/dist-packages/pandas/io/excel/_xlrd.py", line 21, in __init__
    super().__init__(filepath_or_buffer)
  File "/usr/lib/python3/dist-packages/pandas/io/excel/_base.py", line 361, in __init__
    raise ValueError(
ValueError: Must explicitly set engine if not passing in buffer or path for io.
Setting the engine explicitly still produces the same error:
>>> df = pd.read_excel(result.content, engine='xlrd')
After some searching, I found a fix: wrap the bytes in a BytesIO object before passing them to read_excel.
>>> from io import BytesIO
>>> df = pd.read_excel(BytesIO(result.content))
I'm still not sure exactly why this works.
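One plausible explanation, offered as a guess rather than a confirmed diagnosis: the error message says read_excel expects a "buffer or path", and result.content is a plain bytes object, which is neither. The traceback path (/usr/lib/python3/dist-packages) suggests the Docker image uses a distro-packaged pandas, which may be older than the local one and may not accept raw bytes. The sketch below uses only the standard library to show the difference between bytes and a file-like buffer. The helper looks_like_buffer and the payload value are hypothetical, made up for illustration, and are not pandas code.

```python
from io import BytesIO


def looks_like_buffer(obj):
    """Hypothetical helper: a duck-typing check for a file-like object.

    This only illustrates the idea of "has a read() method"; it is an
    assumption about the kind of check a reader might apply, not the
    actual pandas implementation.
    """
    return hasattr(obj, "read")


# Stand-in for result.content (made-up data, not a real xls file).
payload = b"fake xls bytes"

# Raw bytes have no read() method, so they are not a buffer.
raw_is_buffer = looks_like_buffer(payload)

# Wrapping the same bytes in BytesIO gives a file-like object.
wrapped_is_buffer = looks_like_buffer(BytesIO(payload))

# The wrapper exposes exactly the same bytes through read().
wrapped_roundtrip = BytesIO(payload).read() == payload

print(raw_is_buffer, wrapped_is_buffer, wrapped_roundtrip)
```

If this guess is right, wrapping the content in BytesIO should work regardless of the pandas version, since a file-like buffer is one of the two input types the error message names.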
Reference: https://stackoverflow.com/a/50307531/7151777