I generated a txt file on Windows and ran a Python formatting script over it on Linux, which failed with this error:
Traceback (most recent call last):
  File "txt2xml.py", line 55, in <module>
    tree.write("1.xml")
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 820, in write
    serialize(write, self._root, encoding, qnames, namespaces)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 939, in _serialize_xml
    _serialize_xml(write, e, encoding, qnames, None)
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 937, in _serialize_xml
    write(_escape_cdata(text, encoding))
  File "/usr/lib64/python2.7/xml/etree/ElementTree.py", line 1073, in _escape_cdata
    return text.encode(encoding, "xmlcharrefreplace")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 42: ordinal not in range(128)
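The last frame is where it goes wrong: in Python 2, calling `.encode()` on a byte string (`str`) makes the interpreter decode it first with the default codec, which is `ascii`, and that implicit decode is what fails on byte 0xe4. Below is a minimal sketch that performs the same decode explicitly, so it behaves the same way on Python 2 and Python 3. The sample character (U+4E2D) is only an assumed stand-in for whatever the real file name contained.

```python
# UTF-8 bytes of an assumed sample character; the first byte is 0xe4,
# the same byte value the traceback complains about.
raw = u"\u4e2d".encode("utf-8")

try:
    # Python 2 performs this decode implicitly before str.encode();
    # here it is written out explicitly.
    raw.decode("ascii")
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Decoding with the codec the bytes were actually written in succeeds.
text = raw.decode("utf-8")
```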
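After some searching, the cause turned out to be an encoding mismatch, so I added an encoding setting. The script converts the text list into the annotation format used for a YOLO training set.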
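The mismatch is presumably that the list was saved on Windows as GBK while the Linux side works in UTF-8. Instead of changing the interpreter's default encoding, one option is to decode once, at the point where the file is read. This is only a sketch: the helper name `read_gbk_lines` is hypothetical, and a throwaway temp file with made-up data stands in for the real list.

```python
import io
import os
import tempfile


def read_gbk_lines(path):
    """Return the stripped, non-empty lines of a GBK-encoded file as unicode text."""
    # io.open decodes while reading, on both Python 2 and Python 3.
    with io.open(path, encoding="gbk") as handle:
        return [line.strip() for line in handle if line.strip()]


# Build a small GBK file standing in for the Windows-generated list (made-up data).
sample = u"\u8f66\u724c1.jpg 480 640 10 20 100 40\r\n"
fd, tmp_path = tempfile.mkstemp(suffix=".list")
with os.fdopen(fd, "wb") as out:
    out.write(sample.encode("gbk"))

lines = read_gbk_lines(tmp_path)
os.remove(tmp_path)
print(lines[0].split(" ")[0] == u"\u8f66\u724c1.jpg")
```

Every string that comes out of the helper is already unicode text, so nothing downstream needs another `decode` call.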
The code uses xml.etree.ElementTree; the full script is as follows:
[root@localhost tools]# vi txt2xml.py
#encoding=utf-8
from xml.etree import ElementTree as ET
import sys
#print sys.getdefaultencoding()
# Reset the default encoding here (Python 2 only workaround)
reload(sys)
sys.setdefaultencoding('utf-8')
f = open("/data/1xiu/darknet/carData/car.list1")
for line in f:
    # The list was written on Windows as GBK; convert each line to UTF-8 once
    line = line.decode("gbk").encode("utf-8").strip()
    lineArray = line.split(" ")
    root = ET.Element('annotation')
    folder = ET.SubElement(root, "folder")
    folder.text = "general_text"
    filename = ET.SubElement(root, "filename")
    # line is already UTF-8 at this point, so it must not be decoded as GBK again
    filename.text = lineArray[0]
    size = ET.SubElement(root, "size")
    width = ET.SubElement(size, "width")
    width.text = lineArray[2]
    height = ET.SubElement(size, "height")
    height.text = lineArray[1]
    depth = ET.SubElement(size, "depth")
    depth.text = str(3)
    obj = ET.SubElement(root, "object")
    name = ET.SubElement(obj, "name")
    name.text = "plate"
    bnd = ET.SubElement(obj, "bndbox")
    xmin = ET.SubElement(bnd, "xmin")
    xmin.text = lineArray[3]
    ymin = ET.SubElement(bnd, "ymin")
    ymin.text = lineArray[4]
    # Add as numbers, not strings ("10" + "100" would give "10100");
    # assumes the fields are integer pixel values
    xmax = ET.SubElement(bnd, "xmax")
    xmax.text = str(int(lineArray[3]) + int(lineArray[5]))
    ymax = ET.SubElement(bnd, "ymax")
    ymax.text = str(int(lineArray[4]) + int(lineArray[6]))
    tree = ET.ElementTree(root)
    #tree.write(lineArray[0].decode("gbk").encode("utf8")+".xml")
    tar = lineArray[0] + ".xml"
    # Note: every line overwrites 1.xml; tree.write(tar) would give one file per image
    tree.write("1.xml")
f.close()
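For comparison, here is a sketch of the same conversion that keeps everything as unicode text and asks ElementTree for UTF-8 output, so no default-encoding change is needed. The helper name `build_annotation`, the sample line, and the field order (name, height, width, xmin, ymin, box width, box height) are assumptions read off the indices in the script above, not something the data format confirms.

```python
from xml.etree import ElementTree as ET


def build_annotation(line):
    """Turn one already-decoded list line into an <annotation> element."""
    fields = line.strip().split(" ")
    name, height, width = fields[0], fields[1], fields[2]
    # Assumes integer pixel values; parsing them makes xmax/ymax sums,
    # not string concatenations.
    xmin, ymin, box_w, box_h = (int(value) for value in fields[3:7])

    root = ET.Element("annotation")
    ET.SubElement(root, "folder").text = "general_text"
    ET.SubElement(root, "filename").text = name
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = width
    ET.SubElement(size, "height").text = height
    ET.SubElement(size, "depth").text = "3"
    obj = ET.SubElement(root, "object")
    ET.SubElement(obj, "name").text = "plate"
    bnd = ET.SubElement(obj, "bndbox")
    ET.SubElement(bnd, "xmin").text = str(xmin)
    ET.SubElement(bnd, "ymin").text = str(ymin)
    ET.SubElement(bnd, "xmax").text = str(xmin + box_w)
    ET.SubElement(bnd, "ymax").text = str(ymin + box_h)
    return root


# Made-up sample line with a non-ASCII file name.
root = build_annotation(u"\u8f66\u724c1.jpg 480 640 10 20 100 40")
# With an explicit encoding, tostring returns bytes in that encoding.
xml_bytes = ET.tostring(root, encoding="utf-8")
print(xml_bytes.decode("utf-8"))
```

To write a file instead of building a byte string, `ET.ElementTree(root).write(path, encoding="utf-8")` takes the same `encoding` argument.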