python 中文unicode编码

最新推荐文章于 2024-08-07 22:32:38 发布

原创最新推荐文章于 2024-08-07 22:32:38 发布 · 1.2w 阅读

8 ·

CC 4.0 BY-SA版权

python 专栏收录该内容

40 篇文章

订阅专栏

博客主要围绕Excel写入中文报错展开，介绍向Excel追加内容和新建Excel写入中文时出现UnicodeDecodeError的解决办法，还阐述了unicode与utf - 8编码，以及unicode与普通string字符串相互转换的方法，最后讲解了str.rsplit()函数的使用。

一、 excel中写入中文报错UnicodeDecodeError : ‘ascii’ codec can’t decode byte 0xe5 in position 0: ordinal not in range(128)

1.向excel中追加内容

解决方法：

第一行加入 # -*- coding: utf-8 -*-，
将中文按“Unicode”字符编码方式进行写入。

代码如下：

# -*- coding: utf-8 -*-
import xlwt
name = u'小明'
excel = xlwt.Workbook()
sheet1 = excel.add_sheet(u'sheet1', cell_overwrite_ok=True)
sheet1.write(0, 0, u'你好')
sheet1.write(0, 1, u'hello, ' + name)
excel.save('save_filepath')

输出结果
插入工作表‘sheet1’：u'sheet1'
写入中文：u'你好'，u'hello, ' + name

2.新建excel写入中文

# -*- coding: utf-8 -*-
from xlwt import Workbook

book =Workbook(encoding='utf-8')
sheet1 =book.add_sheet('list')

二、unicode 与utf-8编码

unicode->str :a.encode("utf-8")
str->unicode: unicode(b, "utf-8") 或 b.decode("utf-8")

代码如下：

a = u'你好'
b = a.encode("utf-8")
c = unicode(b, "utf-8")
print a, type(a)
print b, type(b)
print c, type(c)
print c + u"hahaha", type(c + u"hahaha")

输出：
你好 <type ‘unicode’>
你好 <type ‘str’>
你好 <type ‘unicode’>
你好hahaha <type ‘unicode’>

s = u'\u4eba\u751f\u82e6\u77ed\uff0cpy\u662f\u5cb8'
print s

输出：人生苦短，py是岸

三、unicode与普通string字符串相互转换

unicodestring = u"Hello world"

1.将Unicode转化为普通Python字符串：“encode” 编码

utf8string = unicodestring.encode("utf-8")  
asciistring = unicodestring.encode("ascii")

2.将普通Python字符串转化为Unicode：“decode” 解码

plainstring1 = unicode(utf8string, "utf-8")  
plainstring2 = unicode(asciistring, "ascii")

四、str.rsplit（）

str.rsplit(’.’, num1)[num2]，其中num1为要将str从后向前分为num1+1部分（列表形式），num2为取第num2+1部分。

a = '20181115 15:33:58:193804_保修卡CYWK-6016S（sharp)(英文)2360.pdf.pdf'
print a.rsplit('.', 3)[0]

输出：20181115 15:33:58:193804_保修卡CYWK-6016S（sharp)(英文)2360