Python Study Notes (2): Strings

半虹

Category: Python. Tags: Python, strings

Article link: https://blog.youkuaiyun.com/wsmrzx/article/details/84206472

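This article introduces Python strings, focusing on common string methods and formatting. A string is also a sequence type, so the general sequence operations apply to it.

1. General sequence operations

(1) Creating strings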

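Python lets us create a string with either single quotes '' or double quotes "", as long as the opening and closing quotes match.

>>> a = 'Hello World'
>>> b = "Hello Python"

We can also use the built-in str() to create a string or to convert a value of another type to a string.

>>> li = ['I', 'Love', 'Python']
>>> c = str(li)      # "['I', 'Love', 'Python']"
>>> # That is probably not what we wanted; more often we want to merge the elements of the sequence into one new string
>>> # The string method join does exactly that
>>> d = ' '.join(li) # 'I Love Python'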
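
One detail worth knowing is that join only accepts strings. The sketch below uses a made-up list (items is a hypothetical name) to show one common way to join mixed values by converting each element with str() first.

```python
# Hypothetical data: a list mixing a string with numbers
items = ['Python', 3, 12.5]

# ' '.join(items) would raise TypeError because 3 and 12.5 are not strings,
# so each element is converted with str() before joining
joined = ' '.join(str(item) for item in items)
print(joined)  # Python 3 12.5
```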

Escape sequences: a backslash \ in front of certain characters gives them a special meaning; for example, \n is a newline and \t is a tab.

>>> string = 'Hello\nMy\tfriend'
>>> print(string)
# Hello
# My	friend

Raw strings: prefixing the opening quote with r turns off escape processing for the string's contents.

>>> string = r'Hello\nMy\tfriend'
>>> print(string)
# Hello\nMy\tfriend
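
Raw strings are handy for text that contains many backslashes, such as Windows-style paths. The following is a small sketch with a made-up path that compares the two spellings.

```python
# Hypothetical path, used only to show the effect of the r prefix
normal = 'C:\new\table'   # \n and \t are interpreted as newline and tab
raw = r'C:\new\table'     # backslashes are kept literally

print(len(normal))  # 10
print(len(raw))     # 12

# A raw string is just another way to write the literal; the value is an ordinary str
print(raw == 'C:\\new\\table')  # True
```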

Long strings: triple quotes in place of ordinary quotes preserve the line breaks and layout of the text, which suits longer passages.

>>> string = '''To see a world in a grain of sand,
And a heaven in a wild flower,
Hold infinity in the palm of your hand,
And eternity in an hour.'''
>>> print(string)
# To see a world in a grain of sand,
# And a heaven in a wild flower,
# Hold infinity in the palm of your hand,
# And eternity in an hour.

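(2) Indexing and slicing

As with lists, [] can be used on a str to index a single character or to slice out a substring.

A slice yields a new string value rather than a view into the original, and because strings are immutable, slicing never affects the original string.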

>>> string = 'Never give up, Never lose hope.'
>>> string[13]  # ','
>>> string[:13] # 'Never give up'
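
Slices also accept negative indices and a step, just as lists do. A minimal sketch, using a short example word chosen here for illustration:

```python
word = 'Python'

print(word[-1])    # n       (last character)
print(word[-3:])   # hon     (last three characters)
print(word[::2])   # Pto     (every second character)
print(word[::-1])  # nohtyP  (reversed copy)

# The original string is untouched by any of the slices above
print(word)        # Python
```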

(3) Sequence operators

Use + to concatenate strings

>>> base = 'D:\\'
>>> path = 'Document'
>>> base + path # 'D:\\Document'

Use * to repeat a string

>>> item = 'wh'
>>> item * 5 # 'whwhwhwhwh'

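(4) Immutable sequences

Strings are immutable sequences: once created, they cannot be modified. Some readers may then wonder what is going on in the following concatenation.

>>> string = '123'
>>> string += '45'
>>> string # '12345'

It certainly looks as if we modified '123' and turned it into '12345', but that is not what happens.

In fact, '123' was not modified; a new string '12345' was created and the variable string was rebound to it.

In the example below, the two id values printed for string are different (the exact numbers vary from run to run).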

>>> string = '123'
>>> id(string)
# 58561952
>>> string += '45'
>>> id(string)
# 58561888
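
Immutability is easiest to see by trying to assign to a single character. The sketch below shows the resulting error and one common workaround (going through a list), which is only one of several possible approaches.

```python
string = '123'

# In-place item assignment is rejected for str
try:
    string[0] = '9'
except TypeError as error:
    message = str(error)
    print(message)  # 'str' object does not support item assignment

# Workaround: edit a list of characters, then build a new string from it
chars = list(string)
chars[0] = '9'
modified = ''.join(chars)

print(modified)  # 923
print(string)    # 123 (the original is unchanged)
```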

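2. String methods

Strings are one of the most commonly used types in Python and come with many built-in methods, which we go through one by one below.

It is worth typing out the example code yourself to understand where each method applies and what its limits are.

That said, it is perfectly normal not to remember every detail. There are many string methods, and in practice you can simply look them up in the documentation.

capitalize(): convert the first character of the string to uppercase and the rest to lowercase
title(): convert the first character of every word to uppercase and the rest to lowercase
upper(): convert all characters to uppercase
lower(): convert all characters to lowercase
swapcase(): swap the case of every character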

>>> string = 'while There is life There is hope.'

>>> string.capitalize()
# 'While there is life there is hope.'
>>> string.title()
# 'While There Is Life There Is Hope.'
>>> string.upper()
# 'WHILE THERE IS LIFE THERE IS HOPE.'
>>> string.lower()
# 'while there is life there is hope.'
>>> string.swapcase()
# 'WHILE tHERE IS LIFE tHERE IS HOPE.'

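center(width[, fillchar]): center the string and pad it with fillchar (default: space) to width
ljust(width[, fillchar]): left-justify the string and pad it with fillchar (default: space) to width
rjust(width[, fillchar]): right-justify the string and pad it with fillchar (default: space) to width

Note: if width is less than or equal to the length of the string, the string is returned unchanged; it is never truncated.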

>>> title = 'Title Here'
>>> title.center(50, '-')
# '--------------------Title Here--------------------'
>>> lpara = 'Para Start Here'
>>> lpara.ljust(50, '-')
# 'Para Start Here-----------------------------------'
>>> rpara = 'Para End Here'
>>> rpara.rjust(50, '-')
# '-------------------------------------Para End Here'
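
A typical use of these methods is lining up columns of text. The following sketch uses a made-up price list (rows is a hypothetical name) and also checks the no-truncation rule for a width that is too small.

```python
# Hypothetical price list used only for illustration
rows = [('Apple', '1.50'), ('Watermelon', '12.00')]

# Pad the name on the right and the price on the left so the columns align
lines = [name.ljust(12, '.') + price.rjust(8, '.') for name, price in rows]
for line in lines:
    print(line)
# Apple...........1.50
# Watermelon.....12.00

# A width smaller than the string length returns the string unchanged
short = 'Title Here'.center(4, '-')
print(short)  # Title Here
```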

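strip([chars]): remove leading and trailing characters that appear in chars (default: whitespace, including spaces, tabs, and newlines)
lstrip([chars]): remove leading characters that appear in chars (default: whitespace)
rstrip([chars]): remove trailing characters that appear in chars (default: whitespace)

Note: chars is treated as a set of characters, not as a prefix or suffix. These methods only affect the beginning and end of the string, never the middle.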

>>> string = '000000100010'

>>> string.strip('0')
# '10001'
>>> string.lstrip('0')
# '100010'
>>> string.rstrip('0')
# '00000010001'
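
Because the argument is a set of characters rather than a substring, strip keeps removing characters from each end until it meets one that is not in the set. A short sketch of that behavior:

```python
# Every leading/trailing character found in the set 'cmowz.' is removed
url = 'www.example.com'
print(url.strip('cmowz.'))  # example

# With no argument, all leading and trailing whitespace is removed
padded = '  \t hello world \n'
print(padded.strip())       # hello world

# Whitespace in the middle of the string is never touched
print(padded.strip() == 'hello world')  # True
```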

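split([sep [,maxsplit]]): split the string on sep (default: any run of whitespace, including spaces and newlines)
maxsplit sets the maximum number of splits; once it is reached, the rest of the string is left unsplit. The default is -1, meaning no limit
splitlines([keepends]): split the string at line boundaries (such as \r, \n, and \r\n)
keepends defaults to False, meaning the line breaks are dropped from the resulting substrings; if True, they are kept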

>>> string = 'do what you say \n say what you do'

>>> string.split()
# ['do', 'what', 'you', 'say', 'say', 'what', 'you', 'do']
>>> string.splitlines()
# ['do what you say ', ' say what you do']
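
The optional arguments change the results noticeably. The sketch below, with made-up input strings, shows maxsplit, keepends, and the difference between split() and split(' ').

```python
# maxsplit=1: split only at the first '=' and keep the rest intact
line = 'key = value = more'
print(line.split('=', 1))      # ['key ', ' value = more']

# keepends=True keeps the line breaks in the resulting pieces
text = 'a\nb\r\nc'
print(text.splitlines(True))   # ['a\n', 'b\r\n', 'c']

# split() collapses runs of whitespace; split(' ') does not
spaced = 'a  b'
print(spaced.split())          # ['a', 'b']
print(spaced.split(' '))       # ['a', '', 'b']
```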

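partition(sep): split the string at the first occurrence of sep, searching from left to right, and return a 3-tuple
The tuple holds the part before the separator, the separator itself, and the part after the separator
rpartition(sep): split the string at the last occurrence of sep (the first one found when searching from right to left) and return a 3-tuple
The tuple holds the part before the separator, the separator itself, and the part after the separator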

>>> string = 'do what you say \n say what you do'

>>> string.partition('what')
# ('do ', 'what', ' you say \n say what you do')
>>> string.rpartition('what')
# ('do what you say \n say ', 'what', ' you do')
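
Both methods always return three elements, even when the separator is missing, which makes them convenient for unpacking. A small sketch with made-up inputs:

```python
# Splitting a key/value pair
print('key=value'.partition('='))        # ('key', '=', 'value')

# Separator not found: partition puts the whole string first,
# rpartition puts it last
print('novalue'.partition('='))          # ('novalue', '', '')
print('novalue'.rpartition('='))         # ('', '', 'novalue')

# rpartition is a simple way to take the last extension of a file name
stem, dot, ext = 'archive.tar.gz'.rpartition('.')
print(stem, ext)                         # archive.tar gz
```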

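find(sub [,start [,end]]): search from left to right for sub within the range start to end of the string
Return the lowest index at which sub is found, or -1 if it is not found
rfind(sub [,start [,end]]): search from right to left for sub within the range start to end of the string
Return the highest index at which sub is found, or -1 if it is not found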

>>> string = 'abcabcdabcde'

>>> string.find('cd')
# 5
>>> string.rfind('cd')
# 9
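
The start argument can be used to continue a search past an earlier match, and the related method index() raises an exception instead of returning -1. A minimal sketch:

```python
string = 'abcabcdabcde'

# Continue searching after a previous match by passing start
print(string.find('abc'))      # 0
print(string.find('abc', 1))   # 3
print(string.find('abc', 4))   # 7

# find returns -1 when nothing matches; index raises ValueError instead
print(string.find('xyz'))      # -1
try:
    string.index('xyz')
    found = True
except ValueError:
    found = False
print(found)                   # False

# When only a yes/no answer is needed, the in operator is simpler
print('cd' in string)          # True
```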

startswith(sub [,start [,end]]): check whether the range start to end of the string begins with sub
endswith(sub [,start [,end]]): check whether the range start to end of the string ends with sub

>>> string = 'Whatever happens, happens for a reason.'

>>> string.startswith('What')
# True
>>> string.endswith('ending')
# False
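
Both methods also accept a tuple of candidates, which avoids chaining several checks with or. The sketch below uses a made-up file name alongside the sentence from the example above.

```python
# Hypothetical file name used only for illustration
filename = 'report.csv'

# A tuple matches if any one of its elements matches
print(filename.endswith(('.csv', '.tsv')))   # True
print(filename.startswith(('data', 'log')))  # False

# start shifts where the check begins: 'happens' starts at index 9
string = 'Whatever happens, happens for a reason.'
print(string.startswith('happens', 9))       # True
```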

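join(iterable): concatenate the strings in iterable, inserting this string between every two adjacent elements (a str argument is treated as a sequence of its characters)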

>>> string = 'LOVE'
>>> string.join('python')
# 'pLOVEyLOVEtLOVEhLOVEoLOVEn'

count(sub [,start [, end]]): count the non-overlapping occurrences of sub within the range start to end of the string

>>> string = 'Hello World'
>>> string.count('o')
# 2

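replace(old, new[, count]): replace occurrences of the substring old with new; if count is given, at most count occurrences are replaced, otherwise all of them are

>>> string = 'Hello Alice, Hello Bob.'
>>> string.replace('Hello', 'Goodbye')
# 'Goodbye Alice, Goodbye Bob.'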
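
The optional third argument limits how many occurrences are replaced, counting from the left. A short sketch of that, which also confirms that replace returns a new string:

```python
string = 'Hello Alice, Hello Bob.'

# Replace only the first occurrence
first_only = string.replace('Hello', 'Goodbye', 1)
print(first_only)  # Goodbye Alice, Hello Bob.

# The original string is left as it was
print(string)      # Hello Alice, Hello Bob.
```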

translate(table): replace characters according to the mapping in table, which is usually built with str.maketrans()

>>> string = 'This is an example'
>>> table = str.maketrans('aeiou', '12345')
>>> string.translate(table)
# 'Th3s 3s 1n 2x1mpl2'
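
str.maketrans() also takes an optional third argument listing characters to delete. The sketch below uses it to strip vowels from the same example sentence.

```python
string = 'This is an example'

# Third argument: every character listed here is removed by translate
delete_vowels = str.maketrans('', '', 'aeiou')
print(string.translate(delete_vowels))  # Ths s n xmpl
```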

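3. Formatting

(1) Format strings

For string formatting, Python's early solution was the % operator, which works much like the classic printf function in C.

Conversion specifiers in the format string mark the position, type, and format of each value to insert, and the values themselves follow the format string after %.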

>>> string = 'Hello %s. Hello %s.' % ('Alice', 'Bob')
>>> string
# 'Hello Alice. Hello Bob.'

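The %s in the format string above is called a conversion specifier; the s means the inserted value is formatted as a string. Commonly used conversion specifiers include:

Conversion specifier	Meaning
`%c`	Single character
`%s`	String
`%d`	Integer in decimal
`%e`	Floating-point number in scientific notation
`%f`	Floating-point number in fixed-point notation
`%g`	Scientific or fixed-point notation, chosen according to the magnitude of the number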

>>> string = '%s is %d years old' % ('Helen', 18)
>>> string
# 'Helen is 18 years old'
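
Conversion specifiers can also carry a width and a precision between the % and the type letter. A brief sketch with values picked here purely for illustration:

```python
# Precision: two digits after the decimal point
print('%.2f' % 3.14159)     # 3.14

# Width: pad to 5 characters, right-aligned by default, left-aligned with -
print('%5d|' % 42)          #    42|
print('%-5d|' % 42)         # 42   |

# The same number under %e and %g
print('%e' % 12345.678)     # 1.234568e+04
print('%g' % 12345.678)     # 12345.7

# %c accepts an integer code point as well as a single character
print('%c' % 65)            # A
```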

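(2) The format() method

Later, Python adopted a newer solution: the string method format().

Curly braces in the format string mark where a value is inserted and can carry a field name or index and a format; the values are passed as arguments to format().

# If the braces contain no name or index, positional arguments are used in order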
>>> string = 'Hello {}. Hello {}.'.format('Alice', 'Bob')
>>> string
# 'Hello Alice. Hello Bob.'

# A number in the braces selects a positional argument by index, so the order can be arbitrary
>>> string = '{3} {0} {2} {1} {3} {0}'.format('be', 'not', 'or', 'to')
>>> string
# 'to be or not to be'

# If the braces contain a name, the matching keyword argument is used
>>> string = '{name} is {age} years old'.format(name = 'Helen', age = 18)
>>> string
# 'Helen is 18 years old'
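
Replacement fields can also index into an argument, and a dictionary can supply the keyword arguments through ** unpacking. A minimal sketch, where person and point are hypothetical names introduced for the example:

```python
# Hypothetical data used only for illustration
person = {'name': 'Helen', 'age': 18}
point = (3, 4)

# Unpack a dict into keyword arguments
print('{name} is {age} years old'.format(**person))  # Helen is 18 years old

# Index into a positional argument from inside the braces
print('{0[name]} is {0[age]}'.format(person))        # Helen is 18
print('x={0[0]}, y={0[1]}'.format(point))            # x=3, y=4
```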

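① Conversion flag: a single character after an exclamation mark that says which conversion to apply to the value before it is formatted

Conversion flag	Meaning
`r`	Use the repr() representation of the value
`s`	Use the str() representation of the value
`a`	Use the ascii() representation of the value (non-ASCII characters are escaped)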

>>> print('{pi!r}\n{pi!s}\n{pi!a}'.format(pi = 'π'))
# 'π'
# π
# '\u03c0'

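② Format specifier: the expression after the colon, which controls the output format in detail

Type specifier	Meaning
`d`	Format an integer in decimal (the default for integers)
`b`	Format an integer in binary
`o`	Format an integer in octal
`x`	Format an integer in hexadecimal
`g`	Choose automatically between fixed-point and scientific notation (floats formatted without a type use a similar general format)
`e`	Format a floating-point number in scientific notation
`f`	Format a floating-point number in fixed-point notation
`s`	Leave the string as it is (the default for strings)
`c`	Interpret an integer as a Unicode code point and output the corresponding character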

>>> # Choose the presentation type
>>> print('in decimal: {0:d}\nin binary : {0:b}'.format(10))
# in decimal: 10
# in binary : 1010
>>> print('in fixed-point notation: {0:f}\nin scientific  notation: {0:e}'.format(0.25))
# in fixed-point notation: 0.250000
# in scientific  notation: 2.500000e-01

>>> # Set the precision
>>> print("{0:.2f}".format(1/3))
# 0.33

>>> # Set the width; numbers are right-aligned by default
>>> print("{0:10.2f}".format(1/4))
#       0.25

>>> # Set the alignment; padding uses spaces by default
>>> print("{0:<10.2f}\n{0:^10.2f}\n{0:>10.2f}".format(1/6))
# 0.17      
#    0.17   
#       0.17

>>> # Set the alignment and the fill character
>>> print("{0:*<10.2f}\n{0:*^10.2f}\n{0:*>10.2f}".format(1/7))
# 0.14******
# ***0.14***
# ******0.14
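
The format specifier supports a few more options that combine with the ones above, such as a sign, zero padding, a thousands separator, and a percent type. The sketch below shows some of them with values chosen only for illustration.

```python
# Thousands separator
print('{:,}'.format(1234567))       # 1,234,567

# Zero padding to a total width of 8, with 3 decimal places
print('{:08.3f}'.format(3.14159))   # 0003.142

# Always show the sign
print('{:+d}'.format(5))            # +5

# Percent type: multiply by 100 and append %
print('{:.1%}'.format(0.256))       # 25.6%

# Hexadecimal with the 0x prefix, and an integer rendered as a character
print('{:#x}'.format(255))          # 0xff
print('{:c}'.format(960))           # π

# Default alignment differs by type: numbers right, strings left
print('{:5}|'.format(42))           #    42|
print('{:5}|'.format('ab'))         # ab   |
```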