Python error when processing a Windows text file: UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4

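When Python on Linux processes a TXT file generated on Windows, it can raise `UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4`. The root cause is an encoding mismatch. Specifying the correct encoding in the code, such as UTF-8, resolves it. In this case the code uses the xml.etree.ElementTree library to convert the text into the XML annotation format used for a YOLO training set.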


A txt file was generated on Windows, and formatting it with Python on Linux raised this error:

Traceback (most recent call last):
  File "txt2xml.py", line 55, in <module>
    tree.write("1.xml")
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 820, in write
    serialize(write, self._root, encoding, qnames, namespaces)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 939, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 937, in _serialize_xml
    write(_escape_cdata(text, encoding))
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1073, in _escape_cdata
    return text.encode(encoding, "xmlcharrefreplace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 42: ordinal not in range(128)
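
The last frame of the traceback shows `text.encode(encoding, "xmlcharrefreplace")` being called on the element text. In Python 2, calling `.encode()` on a byte string (`str`) first decodes it implicitly with the default `ascii` codec, which is why an encode call ends in a decode error. Below is a minimal sketch of that implicit step, written for Python 3, where the decode has to be spelled out. The sample character is an assumption, chosen only because its UTF-8 encoding starts with byte 0xe4.

```python
# UTF-8 bytes for a sample non-ASCII character (assumed example data).
raw = u"中".encode("utf-8")
assert bytearray(raw)[0] == 0xE4

# Python 2 performs this decode implicitly when .encode() is called
# on a byte string that contains non-ASCII bytes.
try:
    raw.decode("ascii")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)

# Decoding with the codec that matches the bytes yields Unicode text,
# which ElementTree can serialize without the implicit ASCII step.
text = raw.decode("utf-8")
```

The practical consequence is that element text should be Unicode, not encoded bytes, by the time `tree.write()` runs.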

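Searching suggested an encoding mismatch, so I added an encoding setting. The script converts the text format to produce annotations in the style of a YOLO training set.
It uses xml.etree.ElementTree; the full code is below: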

[root@localhost tools]# vi txt2xml.py

# encoding=utf-8
from xml.etree import ElementTree as ET
import sys
# print sys.getdefaultencoding()
# Reset the default encoding here (Python 2 only)
reload(sys)
sys.setdefaultencoding('utf-8')


f = open("/data/1xiu/darknet/carData/car.list1")
for line in f:
    line = line.decode("gbk").encode("utf-8").strip()
    lineArray = line.split(" ")

    root=ET.Element('annotation')  

    folder = ET.SubElement(root, "folder")
    folder.text = "general_text"

    filename = ET.SubElement(root, "filename")
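    # line is already UTF-8 bytes at this point; decode it once to unicode
    # instead of decoding it as GBK a second time
    filename.text = lineArray[0].decode("utf-8")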

    size = ET.SubElement(root, "size")

    width = ET.SubElement(size, "width")
    width.text = lineArray[2]

    height = ET.SubElement(size,"height")
    height.text = lineArray[1]

    depth = ET.SubElement(size, "depth")
    depth.text = str(3)

    obj = ET.SubElement(root, "object")

    name = ET.SubElement(obj, "name")
    name.text = "plate"


    bnd = ET.SubElement(obj, "bndbox")

    xmin = ET.SubElement(bnd, "xmin")
    xmin.text = lineArray[3]

    ymin = ET.SubElement(bnd, "ymin")
    ymin.text = lineArray[4]

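    xmax = ET.SubElement(bnd, "xmax")
    # Add as integers; "+" on the raw strings would concatenate them
    # (assumes the fields hold integer pixel values)
    xmax.text = str(int(lineArray[3]) + int(lineArray[5]))

    ymax = ET.SubElement(bnd, "ymax")
    ymax.text = str(int(lineArray[4]) + int(lineArray[6]))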

    tree=ET.ElementTree(root)
    #tree.write(lineArray[0].decode("gbk").encode("utf8")+".xml")
    tar = lineArray[0] + ".xml"
    tree.write("1.xml")
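
For comparison, here is a sketch of the same conversion that keeps every string as Unicode text until the XML is serialized, so no implicit ASCII step is involved. It is written for Python 3 and is not the script above. The field order (name, height, width, xmin, ymin, box width, box height) is inferred from the indices used in the script, integer pixel values are assumed, and the function name `line_to_annotation` and the sample record are hypothetical.

```python
import io
from xml.etree import ElementTree as ET


def line_to_annotation(line):
    """Build an <annotation> element from one space-separated record."""
    fields = line.strip().split(" ")
    name, height, width = fields[0], fields[1], fields[2]
    xmin, ymin = int(fields[3]), int(fields[4])
    box_w, box_h = int(fields[5]), int(fields[6])

    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = "general_text"
    ET.SubElement(root, "filename").text = name
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = width
    ET.SubElement(size, "height").text = height
    ET.SubElement(size, "depth").text = "3"
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "name").text = "plate"
    bnd = ET.SubElement(obj, "bndbox")
    ET.SubElement(bnd, "xmin").text = str(xmin)
    ET.SubElement(bnd, "ymin").text = str(ymin)
    # Numeric addition, then back to text for the XML node.
    ET.SubElement(bnd, "xmax").text = str(xmin + box_w)
    ET.SubElement(bnd, "ymax").text = str(ymin + box_h)
    return root


# In-memory stand-in for a GBK-encoded file produced on Windows (CRLF ending).
gbk_bytes = u"车牌1.jpg 480 640 10 20 100 30\r\n".encode("gbk")
with io.TextIOWrapper(io.BytesIO(gbk_bytes), encoding="gbk") as f:
    for line in f:
        root = line_to_annotation(line)
        # Serialize as UTF-8 bytes; non-ASCII text is encoded explicitly here.
        xml_bytes = ET.tostring(root, encoding="utf-8")
```

With a real file, `io.open(path, encoding="gbk")` would take the place of the in-memory wrapper, and `ET.ElementTree(root).write(name + ".xml", encoding="utf-8")` would write one file per record instead of overwriting a single `1.xml`.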