Python Standard Library: Text

sample_text = '''

    The textwrap module can beused to format text for output in

    situations wherepretty-printing is desired.  It offers

    programmatic functionalitysimilar to the paragraph wrapping

    or filling features found inmany text editors.

'''
import textwrap
print textwrap.fill(sample_text,width = 50)

fill() takes text as input and produces formatted text as output.
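
As a minimal sketch of what fill() returns (the input string is made up for illustration, and print() is called as a function so the snippet also runs on Python 3): the result is a single string whose lines are no longer than width, equivalent to joining the list returned by textwrap.wrap() with newlines.

```python
import textwrap

# Illustrative input, not taken from the example above.
text = "The quick brown fox jumps over the lazy dog near the river bank."

wrapped = textwrap.fill(text, width=30)
print(wrapped)

# fill() returns one string; wrap() returns the same lines as a list.
lines = textwrap.wrap(text, width=30)
print(wrapped == "\n".join(lines))
```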

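Removing existing indentation: dedent() tidies the result by stripping the whitespace prefix that all lines share. If one line is indented more than the others, the extra whitespace on that line is not removed.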

>>> dedented_text = textwrap.dedent(sample_text)
>>> print dedented_text


The textwrap module can beused to format text for output in

situations wherepretty-printing is desired.  It offers

programmatic functionalitysimilar to the paragraph wrapping

or filling features found inmany text editors.
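
The case of a line indented more than its neighbours can be sketched as follows. The three-line string is an illustrative assumption, not part of the sample text: only the four-space prefix shared by every line is stripped, so the nested line keeps its extra four spaces.

```python
import textwrap

# Illustrative input: the middle line has four extra spaces of indentation.
text = "    first line\n        nested line\n    last line\n"

dedented = textwrap.dedent(text)
print(dedented)
# The common 4-space prefix is gone; the nested line is still indented by 4.
```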

Combining dedent and fill

>>> dedented_text = textwrap.dedent(sample_text).strip()
>>> for width in [45,70]:
	print "%d Columns: \n" % width
	print textwrap.fill(dedented_text,width = width)

	
45 Columns: 

The textwrap module can beused to format text
for output in  situations wherepretty-
printing is desired.  It offers  programmatic
functionalitysimilar to the paragraph
wrapping  or filling features found inmany
text editors.
70 Columns: 

The textwrap module can beused to format text for output in
situations wherepretty-printing is desired.  It offers  programmatic
functionalitysimilar to the paragraph wrapping  or filling features
found inmany text editors.
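
The doubled spaces in the output above (for example "in  situations") come from the blank lines between the lines of sample_text: fill() turns each newline into a space but does not collapse runs of spaces inside a line. One possible way to avoid this, sketched here with a hypothetical helper named reflow, is to normalize whitespace with split() and join() before filling. Note that this also collapses the two spaces after a sentence-ending period.

```python
import textwrap

def reflow(text, width=70):
    """Hypothetical helper: dedent, collapse whitespace runs, then fill."""
    dedented = textwrap.dedent(text).strip()
    normalized = " ".join(dedented.split())
    return textwrap.fill(normalized, width=width)

# Illustrative input with a blank line between two indented lines.
sample = "    one two\n\n    three four\n"

plain = textwrap.fill(textwrap.dedent(sample).strip(), width=40)
clean = reflow(sample, width=40)
print(repr(plain))
print(repr(clean))
```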


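Hanging indents: in addition to the output width, the indent of the first line can be controlled separately from that of the following lines.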

>>> print textwrap.fill(dedented_text,initial_indent = ' ',subsequent_indent = ' ' * 4, width = 50)
 The textwrap module can beused to format text for
    output in  situations wherepretty-printing is
    desired.  It offers  programmatic
    functionalitysimilar to the paragraph wrapping
    or filling features found inmany text editors.
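
A common use of these two parameters is a bulleted item whose continuation lines align under the text rather than under the bullet. The following is a small sketch with made-up input text; the "* " marker and the width of 40 are arbitrary choices for illustration.

```python
import textwrap

# Illustrative input text.
text = ("Hanging indents keep continuation lines aligned "
        "under the first word of a list item.")

item = textwrap.fill(text,
                     initial_indent="* ",     # prefix for the first line only
                     subsequent_indent="  ",  # prefix for every later line
                     width=40)
print(item)
```

Both prefixes count toward width, so no output line exceeds 40 characters.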


re -- Regular Expressions

Finding patterns in text

>>> import re
>>> pattern = 'this'
>>> text = "Does this text match the pattern?"
>>> match = re.search(pattern,text)
>>> match
<_sre.SRE_Match object at 0x021488A8>
>>> s = match.start()
>>> e = match.end()

>>> print 'Found "%s" \n in "%s" \n from %d to %d ("%s")' % \
      (match.re.pattern, match.string, s, e, text[s:e])
Found "this" 
 in "Does this text match the pattern?" 
 from 5 to 9 ("this")
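
The transcript above assumes the pattern is found. When it is not, re.search() returns None, and calling start() on None raises AttributeError, so real code usually checks first. A minimal sketch, using a hypothetical helper named find_span:

```python
import re

def find_span(pattern, text):
    """Hypothetical helper: return (start, end) of the first match, or None."""
    match = re.search(pattern, text)
    if match is None:
        return None
    return match.start(), match.end()

text = "Does this text match the pattern?"
print(find_span('this', text))  # a (start, end) tuple
print(find_span('that', text))  # None, because 'that' does not occur
```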

Compiling expressions: the compile() function converts an expression string into a regex object.

>>> regexes = [re.compile(p) for p in ['this','that']]
>>> text = "Does this text match the pattern?"

>>> print "Text: %r\n" % text
Text: 'Does this text match the pattern?'

>>> for regex in regexes:
	print 'Seeking "%s" -> ' % regex.pattern
	if regex.search(text):
		print 'Match!'
	else:
		print 'No match!'

		
Seeking "this" -> 
Match!
Seeking "that" -> 
No match!
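
The module-level functions maintain a cache of compiled expressions, but the cache is limited in size, and using compiled expressions directly avoids the cost of the cache lookup. Another advantage of compiled expressions is that by precompiling all of them when the module is loaded, the compilation work moves to application start-up rather than happening when the program responds to a user action.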
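
Precompiling at load time can be sketched as below. The names NUMBER_RE and count_numbers are hypothetical and chosen only for illustration: the pattern is compiled once when the module is imported, and the function reuses the compiled object on every call.

```python
import re

# Compiled once, when this module is first imported.
NUMBER_RE = re.compile(r'\d+')

def count_numbers(text):
    """Hypothetical handler: count runs of digits using the precompiled regex."""
    return len(NUMBER_RE.findall(text))

print(count_numbers("a1 b22 c333"))
```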

Multiple matches: findall() returns all substrings of the input that match the pattern without overlapping.

>>> text = 'abbaaabbbbaaaaa'
>>> pattern = 'ab'
>>> for match in re.findall(pattern,text):
	print 'Found "%s"' % match

	
Found "ab"
Found "ab"
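
findall() returns only the matched strings. When positions are also needed, the related function re.finditer() yields match objects instead. The sketch below reuses the text and pattern from above, and adds a second, made-up input to show that matches do not overlap.

```python
import re

text = 'abbaaabbbbaaaaa'
pattern = 'ab'

# finditer() yields match objects, so start() and end() are available.
spans = [(m.start(), m.end()) for m in re.finditer(pattern, text)]
for s, e in spans:
    print('Found "%s" at %d:%d' % (text[s:e], s, e))

# Matches never overlap: 'aaaaa' holds only two non-overlapping 'aa'.
pairs = re.findall('aa', 'aaaaa')
print(pairs)
```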










Reference: String Services, standard library documentation




