Printing the second-level tags of an image annotation XML file
folder
filename
path
source
size
segmented
object
import tensorflow as tf
from lxml import etree


def recursive_parse_xml_to_dict(xml):
    """Recursively convert an lxml element into nested Python dicts."""
    # Leaf element (no children): map its tag to its text.
    if len(xml) == 0:
        return {xml.tag: xml.text}
    result = {}
    for child in xml:
        child_result = recursive_parse_xml_to_dict(child)
        if child.tag != 'object':
            result[child.tag] = child_result[child.tag]
        else:
            # An image may contain several <object> elements; collect them in a list.
            if child.tag not in result:
                result[child.tag] = []
            result[child.tag].append(child_result[child.tag])
    return {xml.tag: result}


# Read the annotation file and parse it into an element tree.
with tf.io.gfile.GFile("000001.xml", 'r') as fid:
    xml_str = fid.read()
xml = etree.fromstring(xml_str)

# The keys of the 'annotation' dict are the second-level tags.
dict0 = recursive_parse_xml_to_dict(xml)['annotation']
for i in dict0:
    print(i)
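
To try the function without an annotation file on disk, the sketch below feeds it a small inline annotation. The sample XML is made up for illustration, and the standard library's xml.etree.ElementTree stands in for lxml so that the snippet has no third-party dependencies; both are assumptions, not part of the script above.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation with two objects, invented for this example.
SAMPLE_XML = """
<annotation>
  <folder>images</folder>
  <filename>000001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object><name>cat</name></object>
  <object><name>dog</name></object>
</annotation>
""".strip()


def parse_xml_to_dict(node):
    """Same recursion as above, written against ElementTree."""
    if len(node) == 0:
        # Leaf element: map its tag to its text.
        return {node.tag: node.text}
    result = {}
    for child in node:
        child_result = parse_xml_to_dict(child)
        if child.tag != "object":
            result[child.tag] = child_result[child.tag]
        else:
            # Repeated <object> elements are gathered into one list.
            result.setdefault(child.tag, []).append(child_result[child.tag])
    return {node.tag: result}


annotation = parse_xml_to_dict(ET.fromstring(SAMPLE_XML))["annotation"]
second_level_tags = list(annotation)
object_names = [obj["name"] for obj in annotation["object"]]
print(second_level_tags)
print(object_names)
```

Because every `object` element is appended to a list, the dictionary has a single `object` key no matter how many objects the image contains.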
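This article presents a Python function for parsing image annotation files in XML format. It recursively converts the XML structure into a dictionary that is easy to work with, which simplifies further data processing and analysis. The focus is on handling the second-level tags of the XML, such as folder, filename, and path.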
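
If only the tag names are needed, the dictionary conversion can be skipped, since iterating over the root element yields its direct children. This is a minimal sketch, assuming a made-up inline annotation and the standard library's xml.etree.ElementTree instead of lxml. Unlike the dictionary keys, the raw listing repeats `object` once per annotated object, so the sketch also shows one way to deduplicate while keeping order.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, invented for this example.
SAMPLE_XML = (
    "<annotation>"
    "<folder>images</folder>"
    "<filename>000001.jpg</filename>"
    "<segmented>0</segmented>"
    "<object><name>cat</name></object>"
    "<object><name>dog</name></object>"
    "</annotation>"
)

root = ET.fromstring(SAMPLE_XML)

# Direct children of the root are the second-level elements.
child_tags = [child.tag for child in root]

# dict.fromkeys removes duplicates while preserving first-seen order.
unique_tags = list(dict.fromkeys(child_tags))

print(child_tags)
print(unique_tags)
```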