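Copyright notice
Please respect original work. When reposting, keep the article complete and credit the original author "tingsking18" and the main site address with a hyperlink, so that other readers can ask questions and offer corrections.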
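When we use regular expressions to parse HTML or XML, the target string we want to match often contains carriage returns and line feeds. In that case .*? will not work, because . does not match a newline by default (unless you first strip the line breaks from the string). We should use [\s\S]*? instead.
The code is as follows: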
#!/usr/bin/env python
#coding=utf-8
import re

test = """
<target name="reviewcode" description="Review code using PMD">
    <taskdef name="pmd" classname="net.sourceforge.pmd.ant.PMDTask"
             classpath="${lib.dir}/pmd-3.8.jar" />
    <pmd shortFilenames="true">
        <!-- Determine the ruleset to be used -->
        <ruleset>rulesets/favorites.xml</ruleset>
        <ruleset>basic</ruleset>
        <ruleset>

            asdaas

            abcde
        </ruleset>
        <!-- Generate and HTML report into the designated directory -->
        <formatter type="html"
                   toFile="${report.dir}/pmd_automated_code_review_report.html" />
        <!-- Files to be configured for review -->
        <fileset dir="${workspace.dir}/">
            <!-- Include all .java files except those under directories that are automatically generated -->
            <include name="**/*.java" />
            <!-- A sample exlusion directory that has generated java source code -->
            <exclude name="**/generated/**/*.java" />
        </fileset>
    </pmd>
</target>
"""

if __name__ == '__main__':
    # [\s\S]*? lazily matches any characters, including newlines,
    # so a <ruleset> element that spans several lines is captured too.
    theDates = re.findall(r'<ruleset>([\s\S]*?)</ruleset>', test)
    print(theDates)
Output:
> "D:\Python25\python.exe" -u "C:\test.py"
['rulesets/favorites.xml', 'basic', '\n \n asdaas\n \n abcde\n ']
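For comparison, here is a minimal sketch of the three variants side by side. The `sample` string is a hypothetical, shortened input and not the original test data. It also shows `re.DOTALL`, a standard flag of the `re` module that the example above does not use; it makes `.` match newlines as well, so it should give the same result as `[\s\S]`.

```python
import re

# Hypothetical sample input, shortened from the example above.
sample = "<ruleset>basic</ruleset>\n<ruleset>\n  asdaas\n  abcde\n</ruleset>"

# By default "." does not match a newline, so the multi-line element is missed.
dot_only = re.findall(r"<ruleset>(.*?)</ruleset>", sample)

# [\s\S] matches any character, including newlines.
any_char = re.findall(r"<ruleset>([\s\S]*?)</ruleset>", sample)

# re.DOTALL makes "." match newlines too, giving the same result as [\s\S].
dotall = re.findall(r"<ruleset>(.*?)</ruleset>", sample, re.DOTALL)

print(dot_only)   # only the single-line element
print(any_char)   # both elements, whitespace preserved
print(dotall)     # same as any_char
```

Which form to use is mostly a matter of taste: `[\s\S]` keeps the behaviour local to one spot in the pattern, while `re.DOTALL` changes the meaning of every `.` in it.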