sample_text = '''
    The textwrap module can be used to format text for output in
    situations where pretty-printing is desired. It offers
    programmatic functionality similar to the paragraph wrapping
    or filling features found in many text editors.
'''
import textwrap
print textwrap.fill(sample_text, width=50)
fill() takes text as input and produces formatted text as output.
Removing existing indentation: the result looks cleaner once dedent() strips the whitespace that is common to the front of every line. If one line is indented more than the others, the extra whitespace on that line is not removed.
>>> dedented_text = textwrap.dedent(sample_text)
>>> print dedented_text

The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.

Combining dedent and fill
>>> dedented_text = textwrap.dedent(sample_text).strip()
>>> for width in [45, 70]:
...     print "%d Columns: \n" % width
...     print textwrap.fill(dedented_text, width=width)
...
45 Columns:

The textwrap module can be used to format
text for output in situations where pretty-
printing is desired. It offers programmatic
functionality similar to the paragraph
wrapping or filling features found in many
text editors.
70 Columns:

The textwrap module can be used to format text for output in
situations where pretty-printing is desired. It offers programmatic
functionality similar to the paragraph wrapping or filling features
found in many text editors.
>>> print textwrap.fill(dedented_text, initial_indent=' ', subsequent_indent=' ' * 4, width=50)
 The textwrap module can be used to format text
    for output in situations where pretty-printing
    is desired. It offers programmatic
    functionality similar to the paragraph
    wrapping or filling features found in many
    text editors.
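The dedent, strip, and fill steps can be bundled into a single helper. What follows is a minimal sketch in Python 3 syntax and is not part of the original notes: the function name `reflow` and its parameter names are hypothetical, and the sample text is assumed to carry a four-space indent inside the string literal.

```python
import textwrap

# Assumption: the sample text is indented by four spaces inside the literal.
sample_text = '''
    The textwrap module can be used to format text for output in
    situations where pretty-printing is desired. It offers
    programmatic functionality similar to the paragraph wrapping
    or filling features found in many text editors.
'''


def reflow(text, width, first_indent='', rest_indent=''):
    """Dedent, strip, then fill a block of text (hypothetical helper)."""
    cleaned = textwrap.dedent(text).strip()
    return textwrap.fill(
        cleaned,
        width=width,
        initial_indent=first_indent,
        subsequent_indent=rest_indent,
    )


# Hanging indent: first line flush left, later lines indented four spaces.
wrapped = reflow(sample_text, width=50, rest_indent=' ' * 4)
print(wrapped)
```

Note that the width passed to fill() includes the indent, so no output line exceeds 50 characters here.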
re -- Regular Expressions
Finding patterns in text
>>> import re
>>> pattern = 'this'
>>> text = "Does this text match the pattern?"
>>> match = re.search(pattern,text)
>>> match
<_sre.SRE_Match object at 0x021488A8>
>>> s = match.start()
>>> e = match.end()
>>> print 'Found "%s" \n in "%s" \n from %d to %d ("%s")' % \
...     (match.re.pattern, match.string, s, e, text[s:e])
Found "this"
 in "Does this text match the pattern?"
 from 5 to 9 ("this")
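re.search() returns None when the pattern is not found, so code that calls start() or end() on the result usually checks for None first. A minimal sketch in Python 3 syntax; the helper name `describe_match` is hypothetical and only serves the illustration.

```python
import re


def describe_match(pattern, text):
    """Return (start, end, matched_text), or None if the pattern is absent."""
    match = re.search(pattern, text)
    if match is None:
        return None
    # span() returns the same pair as (match.start(), match.end()).
    start, end = match.span()
    return start, end, text[start:end]


text = "Does this text match the pattern?"
print(describe_match('this', text))  # (5, 9, 'this')
print(describe_match('that', text))  # None
```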
Compiling expressions: the compile() function converts an expression string into a regex object.
>>> regexes = [re.compile(p) for p in ['this','that']]
>>> text = "Does this text match the pattern?"
>>> print "Text: %r\n" % text
Text: 'Does this text match the pattern?'
>>> for regex in regexes:
...     print 'Seeking "%s" -> ' % regex.pattern
...     if regex.search(text):
...         print 'Match!'
...     else:
...         print 'No match!'
...
Seeking "this" ->
Match!
Seeking "that" ->
No match!
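The module-level functions maintain a cache of compiled expressions, but the cache is limited in size, and using compiled expressions directly avoids the cost of the cache lookup. Another advantage of compiled expressions is that precompiling all of them when the module is loaded shifts the compilation work to application start time, instead of doing it when the program responds to a user action.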
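One way to move compilation to load time is to define the compiled expressions as module-level constants. This is a minimal sketch in Python 3 syntax; the names `WORD_THIS`, `WORD_THAT`, `PATTERNS`, and `matching_patterns` are hypothetical.

```python
import re

# Compiled once, when the module is imported (names are hypothetical).
WORD_THIS = re.compile(r'this')
WORD_THAT = re.compile(r'that')
PATTERNS = (WORD_THIS, WORD_THAT)


def matching_patterns(text):
    """Return the pattern strings of every precompiled regex found in text."""
    return [regex.pattern for regex in PATTERNS if regex.search(text)]


print(matching_patterns("Does this text match the pattern?"))  # ['this']
```

Each call to `matching_patterns` reuses the same compiled objects, so no compilation or cache lookup happens on the request path.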
Multiple matches: findall() returns all non-overlapping substrings of the input that match the pattern.
>>> text = 'abbaaabbbbaaaaa'
>>> pattern = 'ab'
>>> for match in re.findall(pattern, text):
...     print 'Found "%s"' % match
...
Found "ab"
Found "ab"
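findall() returns only the matched strings. When the positions are needed as well, re.finditer(), which these notes do not cover, yields match objects instead. A minimal sketch in Python 3 syntax, reusing the same text and pattern:

```python
import re

text = 'abbaaabbbbaaaaa'
pattern = 'ab'

# findall() returns the matched substrings themselves.
found = re.findall(pattern, text)
print(found)  # ['ab', 'ab']

# finditer() yields match objects, so start and end offsets are available.
spans = [(m.start(), m.end()) for m in re.finditer(pattern, text)]
print(spans)  # [(0, 2), (5, 7)]
```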
Reference: String Services, standard library documentation