Python Basics (6): String Processing


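This article covers the basic string-processing methods in Python: extracting substrings, concatenating, replacing, formatting, and splitting strings. It then shows how to use these methods for text filtering.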

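1. Basic string-processing methods
.Extracting a substring

# Method 1: slicing (the start index is included, the end index is excluded)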

>>> 'hello world'[2:8]
'llo wo'
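
Slices also accept omitted bounds, negative indices, and a step. The following is a minimal sketch; the variable names are illustrative and do not come from the original article:

```python
s = 'hello world'

first_word = s[:5]      # an omitted start defaults to 0
last_word = s[-5:]      # a negative index counts from the end
every_other = s[::2]    # the third field is the step
reversed_s = s[::-1]    # a step of -1 walks the string backwards

print(first_word, last_word, every_other, reversed_s)
```

Slicing never modifies `s`; each expression builds a new string.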

.Concatenating strings

# Method 1: the + operator

>>> 'hello ' + 'world'
'hello world'

# Method 2: the * operator (repeats a string n times)

>>> 'hello '*2
'hello hello '

# Method 3: join, which merges a sequence of strings with a separator

>>> '--'.join(['a','b','c'])
'a--b--c'
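
Two details of `join` are worth seeing in code: it accepts only strings, and it is the usual way to assemble a string from many pieces. A small sketch with made-up sample data:

```python
numbers = [1, 2, 3]

# join raises TypeError on non-string items, so convert each one first
joined = ','.join(str(n) for n in numbers)
print(joined)

# Collect the pieces in a list and join once at the end,
# instead of growing a string with + inside the loop.
parts = []
for word in ['a', 'b', 'c']:
    parts.append(word.upper())
result = ''.join(parts)
print(result)
```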

.Replacing part of a string

# Method 1: the replace method

>>> 'hello world'.replace('world', 'tom')
'hello tom'
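
`replace` returns a new string and leaves the original untouched, and an optional third argument limits the number of replacements. A short sketch with an illustrative sample string:

```python
original = 'hello world, hello tom'

# Replace only the first occurrence
once = original.replace('hello', 'bye', 1)

# Replace every occurrence (the default)
all_of_them = original.replace('hello', 'bye')

print(once)
print(all_of_them)
print(original)  # strings are immutable, so this is unchanged
```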

.Formatting strings

# Method 1: the % formatting operator

>>> 'hello %s %s' % ('world', 'haha')
'hello world haha'
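
For comparison, the same result can be produced with `str.format` and, from Python 3.6 on, with f-strings. The sketch below also shows a few width and precision specifiers for `%`; the variable names are illustrative:

```python
name, other = 'world', 'haha'

with_percent = 'hello %s %s' % (name, other)
with_format = 'hello {} {}'.format(name, other)
with_fstring = f'hello {name} {other}'   # requires Python 3.6+

# Width, precision, alignment, and zero padding with %
aligned = '%5.2f|%-6s|%03d' % (3.14159, 'ab', 7)

print(with_percent, with_format, with_fstring, aligned, sep='\n')
```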

# Method 2: a template whose placeholders are filled in with replace

>>> template = '--<html>--</html>'
>>> template = template.replace('<html>', 'start')
>>> template = template.replace('</html>', 'end')
>>> print(template)
--start--end

# Template, variant 2: named placeholders filled in from a dict

>>> template = "hello %(key1)s %(key2)s"
>>> vals={'key1':'value1', 'key2':'value2'}
>>> print(template % vals)
hello value1 value2
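
The standard library also provides `string.Template`, which uses `$name` placeholders. It is not used in the original article; the sketch below only shows it as an alternative to the `%(key)s` form:

```python
from string import Template

template = Template('hello $key1 $key2')
vals = {'key1': 'value1', 'key2': 'value2'}

# substitute raises KeyError if a placeholder has no value
full = template.substitute(vals)

# safe_substitute leaves unknown placeholders in place
partial = template.safe_substitute({'key1': 'value1'})

print(full)
print(partial)
```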

.Splitting strings

>>> 'a b c d'.split()  # basic split on whitespace
['a', 'b', 'c', 'd']

# Splitting on an explicit separator

>>> 'a+b+c+d'.split('+')
['a', 'b', 'c', 'd']
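
Splitting with no argument and splitting on an explicit separator behave differently when separators repeat. A short sketch of the differences, with illustrative sample strings:

```python
# No argument: runs of whitespace count as one separator
collapsed = 'a  b'.split()

# Explicit separator: every occurrence splits, so empty strings appear
literal = 'a  b'.split(' ')

# maxsplit limits the number of splits
key_value = 'k=v=w'.split('=', 1)

# splitlines breaks on line endings and drops them
lines = 'line1\nline2\n'.splitlines()

print(collapsed, literal, key_value, lines)
```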

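# Combining split and join: replace every tab with four dots

from sys import stdin, stdout
stdout.write(('.' * 4).join(stdin.read().split('\t')))

In practice (the input fields are separated by tabs, and stdin.read() returns at end of input):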

>>> stdout.write(('.' * 4).join(stdin.read().split('\t')))
aa bb cc dd
aa....bb....cc....dd
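
The same split-then-join idea can be wrapped in a function so it can be tried without typing on standard input. The name `tabs_to_dots` and the `width` parameter are hypothetical, chosen only for this sketch:

```python
def tabs_to_dots(text, width=4):
    """Replace each tab in text with `width` dots (hypothetical helper)."""
    return ('.' * width).join(text.split('\t'))


print(tabs_to_dots('aa\tbb\tcc\tdd'))
print(tabs_to_dots('x\ty', width=2))
```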

Application

# Text filtering

.Basic approach

# Condition function

def isCond(astr):
    'return True if astr contains the substring "root"'
    return astr.find('root') != -1

An anonymous function (lambda) works as well:

isCond = lambda astr: astr.find('root') != -1
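
A substring test is more commonly written with the `in` operator than with `find`. The sketch below is an equivalent form, not the article's version; the name `is_cond` and the `needle` parameter are hypothetical:

```python
def is_cond(line, needle='root'):
    """Return True if needle occurs anywhere in line."""
    return needle in line


print(is_cond('root:x:0:0'))
print(is_cond('daemon:x:1:1'))
print(is_cond('daemon:x:1:1', needle='daemon'))
```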

# Text filtering, version 1

def filter1(filename):
    'print every line read from filename that satisfies isCond'
    selected = []
    try:
        fp = open(filename)
    except IOError as e:
        print('could not open file:', e)
        return  # without this, fp would be undefined below

    with fp:  # close the file when the loop is done
        for line in fp:
            if isCond(line):
                selected.append(line)
    print(selected)

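# Text filtering, version 2, using the built-in filter function

def filter2(filename):
    'filter version 2'
    with open(filename) as fp:
        # in Python 3, filter returns a lazy iterator, so wrap it in list
        selected = list(filter(isCond, fp))
    print(selected)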
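
A list comprehension gives the same result as `filter` and is often easier to read. The sketch below uses an in-memory list of made-up lines instead of a file so it can be run directly; `isCond` is repeated from the article to keep the block self-contained:

```python
def isCond(astr):
    return astr.find('root') != -1


# Made-up sample data standing in for the lines of a file
lines = ['root:x:0:0\n', 'daemon:x:1:1\n', 'chroot:x:2:2\n']

by_filter = list(filter(isCond, lines))
by_comprehension = [line for line in lines if isCond(line)]

print(by_filter == by_comprehension)
print(by_comprehension)
```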

.Using the map function

# map applies isCond to every line and yields one boolean per line
list(map(isCond, open(filename).readlines()))
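
Unlike `filter`, `map` keeps every item and returns the result of the function for each one. The sketch below uses made-up sample lines and also shows what happens when `map` is given a single string rather than a list of lines; `isCond` is repeated from the article so the block runs on its own:

```python
def isCond(astr):
    return astr.find('root') != -1


# Made-up sample data standing in for the lines of a file
lines = ['root:x:0:0\n', 'daemon:x:1:1\n', 'chroot:x:2:2\n']

# One boolean per line
flags = list(map(isCond, lines))
print(flags)

# A single string is iterated character by character,
# so no individual character can contain 'root'.
per_char = list(map(isCond, 'root\n'))
print(per_char)
```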