While parsing an XML file today, I ran into the following error:
  File "make_train_data.py", line 40, in <module>
    pair_list = parse_train_data("/media/data/biendata/baifendian_data/train_set.xml")
  File "make_train_data.py", line 11, in parse_train_data
    doc = parse(xml_data)
  File "/home/eric/anaconda3/lib/python3.6/xml/dom/minidom.py", line 1958, in parse
    return expatbuilder.parse(file)
  File "/home/eric/anaconda3/lib/python3.6/xml/dom/expatbuilder.py", line 911, in parse
    result = builder.parseFile(fp)
  File "/home/eric/anaconda3/lib/python3.6/xml/dom/expatbuilder.py", line 207, in parseFile
    parser.Parse(buffer, 0)
xml.parsers.expat.ExpatError: not well-formed (invalid token): line 5, column 14
Solution
<?xml version="1.0" encoding="utf-8"?>
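That is, insert this declaration as the first line of the XML file; if the file already has a declaration, replace it with this one. Note that this only helps when the file's contents are actually stored as UTF-8, since the declared encoding has to match the real encoding of the bytes.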
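If you would rather not edit the file by hand, the same fix can be applied in code before parsing. The following is only a minimal sketch: the helper names ensure_utf8_declaration and parse_with_utf8_declaration are made up for illustration, and it assumes the file's bytes really are UTF-8.

```python
import re
from xml.dom.minidom import parseString

XML_DECLARATION = b'<?xml version="1.0" encoding="utf-8"?>'

# Matches an existing XML declaration at the very start of the data
# (but not other processing instructions such as <?xml-stylesheet ...?>).
_DECL_RE = re.compile(rb'^\s*<\?xml\s[^>]*\?>')


def ensure_utf8_declaration(raw: bytes) -> bytes:
    """Hypothetical helper: put a UTF-8 XML declaration on the first line.

    Assumption: `raw` is already UTF-8 encoded. The declaration is only
    a label; it does not convert the data.
    """
    # Drop a UTF-8 byte order mark if one is present.
    if raw.startswith(b'\xef\xbb\xbf'):
        raw = raw[3:]
    # Remove any existing declaration, then prepend the UTF-8 one.
    body = _DECL_RE.sub(b'', raw, count=1).lstrip()
    return XML_DECLARATION + b'\n' + body


def parse_with_utf8_declaration(path):
    """Hypothetical helper: parse a file without modifying it on disk."""
    with open(path, 'rb') as fp:
        return parseString(ensure_utf8_declaration(fp.read()))
```

In the script from the traceback, this would mean calling parse_with_utf8_declaration(xml_data) in place of parse(xml_data). If the error still appears afterwards, the cause is probably something else at the reported line and column, for example an unescaped & or a control character.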