pandas: writing an Excel file stream to StringIO or BytesIO

This article shows how to use pandas to write a DataFrame to an in-memory Excel file: pick the Excel format via the engine parameter, hold the binary data in a BytesIO buffer, and return the file stream correctly with HttpResponse. It applies to projects built on the Django REST framework.
# Writing Excel files to memory

# Pandas supports writing Excel files to buffer-like objects such as StringIO or BytesIO using ExcelWriter.

import pandas as pd

# Safe BytesIO import for either Python 2.x or 3.x
try:
    from io import BytesIO
except ImportError:
    from cStringIO import StringIO as BytesIO

bio = BytesIO()

# By setting the 'engine' in the ExcelWriter constructor.
writer = pd.ExcelWriter(bio, engine='xlsxwriter')
# df is the pandas.DataFrame holding the data you want to export
df.to_excel(writer, sheet_name='Sheet1')

# Save the workbook (pandas 2.0+ removed writer.save(); use writer.close() there)
writer.save()

# Seek to the beginning and read to copy the workbook to a variable in memory
bio.seek(0)
workbook = bio.read()

"""engine is optional but recommended. Setting the engine determines the version of
 workbook produced. Setting engine='xlrd' will produce an Excel 2003-format workbook
  (xls). Using either 'openpyxl' or 'xlsxwriter' will produce an Excel 2007-format
   workbook (xlsx). If omitted, an Excel 2007-formatted workbook is produced. """

The above is taken from the pandas official documentation, "Writing Excel Files to Memory".

Following the official documentation, you end up with the file stream sitting in memory; all that is left is to return it in a response.
I am using the Django REST framework here.

One really nasty pitfall: I had been using rest_framework.response.Response, and passing the file stream straight to it kept raising errors complaining about UTF-8 encoding. The problem is not actually the encoding. Response takes a data argument, which is pushed through a renderer, whereas what we are returning is binary data. Use django.http.HttpResponse instead: it takes content, a binary bytestring.
So HttpResponse(workbook, content_type=content_type) does the job.
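
To put the pieces together, here is a minimal sketch of such a view (the class name ExportExcelView and the sample DataFrame are placeholders of mine, not from the original post), assuming pandas with the xlsxwriter engine and the Django REST framework are installed:

from io import BytesIO

import pandas as pd
from django.http import HttpResponse
from rest_framework.views import APIView


class ExportExcelView(APIView):  # hypothetical view name, for illustration only
    def get(self, request):
        df = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})  # stand-in for your real data

        bio = BytesIO()
        # Context manager is equivalent to creating the writer and calling save()/close()
        with pd.ExcelWriter(bio, engine='xlsxwriter') as writer:
            df.to_excel(writer, sheet_name='Sheet1')
        bio.seek(0)
        workbook = bio.read()

        # DRF's Response(workbook) would hand the bytes to a renderer and fail
        # with a UTF-8 error; HttpResponse sends the bytestring untouched.
        content_type = 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
        return HttpResponse(workbook, content_type=content_type)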

Next, set the content-type, so the browser knows what kind of content it is being given.
The content-type for an Office Excel 2007 (.xlsx) file is application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.
You can look it up in an HTTP content-type reference table, or simply search for it if it is not listed there.
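
For reference, a small lookup table one might keep for the two workbook formats discussed above (the .xls entry, application/vnd.ms-excel, is my addition, not from the original post; double-check it against a MIME table):

# MIME types for the two Excel formats mentioned above
EXCEL_CONTENT_TYPES = {
    '.xlsx': 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',  # Excel 2007+
    '.xls': 'application/vnd.ms-excel',  # Excel 2003 (assumed value)
}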

There is one more header to set, Content-Disposition:
attachment — download as an attachment
inline — open in the browser
For example:
response['Content-Disposition'] = 'attachment;filename=filename.txt'
response['Content-Disposition'] = 'inline;filename=filename.txt'
Another pitfall here: if you pass the filename in as a variable and build the header with % or format, and the filename happens to contain spaces, the response header comes out as Content-Disposition: attachment;filename=i love you.txt, and the client discards everything after the first space. The downloaded file then keeps only the part of the name before the first space, with no extension, and cannot be opened directly.
The fix is to wrap the filename in double quotes, Content-Disposition: attachment;filename="i love you.txt"; with the quotes in place you can safely insert the filename via string formatting.
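
A short sketch of building that header with a filename containing spaces (the filename below is just an example; response is the HttpResponse built earlier):

filename = 'i love you.xlsx'  # example filename containing spaces
# Unquoted, the header would be  attachment;filename=i love you.xlsx  and the
# client would keep only the part before the first space, losing the extension.
# Quoting the value keeps the full filename intact.
response['Content-Disposition'] = 'attachment;filename="{}"'.format(filename)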
