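Problem description: When parsing a JSON string with the json module in Python 2, a field value that contains Chinese characters prints to the screen without trouble, but redirecting the output to a file, or writing the value to a file, raises UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128).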
Contents of the JSON file:
<pre name="code" class="python">jsonstr.txt:
{"QN48":"tc_39282f82912aa6cc_14ef7fe562a_6f5d","city":"沧州","firstCategory":null,"secondCategor":null,"url":"http://touch.piao.qunar.com/touch/detail.htm?id=8928&in_track=t_qudaoyy_tuijian_menpiao&bd_source=xiaomiliulanqi"}
{"QN48":"tc_038c9119e228c705_14dbc12f7e4_dbe4","city":"北京","firstCategory":"亲子","secondCategor":"动植物园","url":null}
{"QN48":"tc_48a929ab20cade19_147484fd593_f7ba","city":"广州","firstCategory":"亲子","secondCategor":"动植物园","url":null}
{"QN48":"tc_884a929ab20cade19_147484fd593_f7ba","city":"","firstCategory":"亲子","secondCategor":"动植物园","url":""}</pre>
jsontest.py:
<pre name="code" class="python">#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Python 2 script: parse one JSON object per line and print the city field.
import json

with open('jsonstr.txt', 'rt') as f:
    for line in f:
        obj = json.loads(line)
        city = obj['city']
        print city, type(city)</pre>
Run in a terminal, the script prints each city and its type to the screen without error.
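Redirecting the output to a file, or writing it to a file from within Python, raises the UnicodeEncodeError quoted in the problem description.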
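The failure can be reproduced without depending on the interpreter's defaults. The sketch below is an illustration, not the original script: it opens a temporary file with `io.open` and an explicit `encoding="ascii"`, which makes visible the codec that Python 2 applies implicitly. The helper name `try_write` is hypothetical.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile


def try_write(text, encoding):
    """Write text to a temporary file using the given codec.

    Returns None on success, or the exception class name on failure.
    (Hypothetical helper, for illustration only.)
    """
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        with io.open(path, "w", encoding=encoding) as out:
            out.write(text)
        return None
    except UnicodeEncodeError as exc:
        return type(exc).__name__
    finally:
        os.remove(path)


# The ascii codec cannot represent Chinese characters; utf-8 can.
ascii_result = try_write(u"沧州", "ascii")
utf8_result = try_write(u"沧州", "utf-8")
print(ascii_result)
print(utf8_result)
```

Only the codec differs between the two calls, which points at the encoding step rather than at the JSON parsing.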
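Cause:
json.loads rebuilds the original object, but some type conversions still take place along the way (see the figure below). In Python 2, a JSON string becomes a Python unicode object rather than a str.
[Figure: type conversions performed by json.loads]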
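The conversion can be checked directly. This is a minimal sketch that uses a shortened version of one record from jsonstr.txt (the shortening is an assumption made for brevity); `text_type` is a hypothetical name that resolves to `unicode` on Python 2 and to `str` on Python 3, so the check runs on either version.

```python
# -*- coding: utf-8 -*-
import json

# Shortened sample record; the real file has more fields.
line = u'{"QN48": "tc_038c9119e228c705_14dbc12f7e4_dbe4", "city": "北京", "url": null}'
obj = json.loads(line)

# unicode on Python 2, str on Python 3.
text_type = type(u"")

# A JSON object becomes a dict, a JSON string becomes text, null becomes None.
city_is_text = isinstance(obj["city"], text_type)
url_is_none = obj["url"] is None
print(type(obj).__name__)
print(city_is_text)
print(url_is_none)
```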

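When writing to a file, Python 2 tries to encode the unicode string with the ascii codec. That fails for Chinese characters, while strings containing only English (ASCII) characters encode without error. Printing to a terminal works because Python 2 takes the output encoding from the terminal; once the output is redirected, no encoding is detected and ascii is used instead.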
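The difference between ASCII-only text and Chinese text shows up as soon as the ascii codec is applied by hand. The following sketch assumes nothing beyond the standard `encode` method; `can_encode` is a hypothetical helper.

```python
# -*- coding: utf-8 -*-


def can_encode(text, codec):
    """Return True if text can be encoded with the given codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


english_ascii = can_encode(u"Beijing", "ascii")  # ASCII-only text
chinese_ascii = can_encode(u"北京", "ascii")      # outside the ASCII range
chinese_utf8 = can_encode(u"北京", "utf-8")       # UTF-8 covers all of Unicode

# Capture the error text for a two-character Chinese string.
try:
    u"沧州".encode("ascii")
    message = ""
except UnicodeEncodeError as exc:
    message = str(exc)

print(english_ascii, chinese_ascii, chinese_utf8)
print(message)
```

The captured message is the same text that appears in the problem description, with position 0-1 covering the two characters of the city name.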
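Solution:
Convert the unicode object to a str (a byte string). Calling str() on it does not help: str() also falls back to the ascii codec and raises the same error. Instead, encode the string explicitly by calling encode("utf-8") on the unicode object: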
<pre name="code" class="python">#!/usr/bin/env python
# -*- coding: utf-8 -*-
import json

with open('jsonstr.txt', 'rt') as f:
    for line in f:
        obj = json.loads(line)
        city = obj['city'].encode("utf-8")
        print city, type(city)</pre>
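After this change, type(city) is str (a UTF-8 encoded byte string) rather than unicode, so the output can be redirected or written to a file without the error.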
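As a rough sketch of the file-writing step, the example below applies the same encode("utf-8") call and writes the resulting bytes to a temporary file. The two sample records are shortened stand-ins for jsonstr.txt, `out_path` is a hypothetical name, and the file is opened in binary mode ("wb") as an assumption so that the same code runs on Python 2 and Python 3.

```python
# -*- coding: utf-8 -*-
import json
import os
import tempfile

# Shortened sample records, one JSON object per line as in jsonstr.txt.
lines = [
    u'{"city": "沧州", "url": null}',
    u'{"city": "北京", "url": null}',
]

fd, out_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

# Encode each city to UTF-8 bytes before writing, as in the fix above.
with open(out_path, "wb") as out:
    for line in lines:
        city = json.loads(line)["city"].encode("utf-8")
        out.write(city + b"\n")

# Read the bytes back to confirm that the file holds valid UTF-8.
with open(out_path, "rb") as check:
    written = check.read()

os.remove(out_path)
print(len(written))
```

Because the data is already a byte string when it reaches write(), no implicit ascii encoding takes place.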
Reference:
http://www.cnblogs.com/coser/archive/2011/12/14/2287739.html