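Python Strings (str) in Depth, Made Simple
1. String definition
- An ordered sequence made up of individual characters; a collection of characters
- Written as a run of characters enclosed in single, double, or triple quotes
- Strings are immutable objects
- Since Python 3, str is a Unicode type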
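The immutability and Unicode points above can be seen in a minimal sketch. The variable names are illustrative only, and slicing is used here simply as one way to build a new string.

```python
s = 'tom'
try:
    s[0] = 'T'              # str does not support item assignment
except TypeError as exc:
    error = str(exc)        # message explaining the failed assignment

t = 'T' + s[1:]             # build a new string instead: 'Tom'
n = len('\u4e2d\u6587')     # 2: length counts Unicode characters, not bytes
```

The original string s is left unchanged; every "modification" produces a new object.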
2. Creating strings
print("Man")
# Man
print('\tname\t')
# name
print(str(1))
# 1
print("""This is a "String".""")
# This is a "String".
name = 'tom'
age = 18
f'{name}+++{age}'
# 'tom+++18'
3. Accessing string elements by index
- Strings support access by index
str1 = 'abcde'
str1[0], str1[-1], str1[4], str1[-5] # ('a', 'e', 'e', 'a')
- A string is an ordered collection of characters, a character sequence
str1 = 'abc'
for i in str1:
    print(i, type(i))
# a <class 'str'>
# b <class 'str'>
# c <class 'str'>
- Strings are iterable
list1 = list('abc')
list1 # ['a', 'b', 'c']
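4. Joining strings with join
- Joins the elements of an iterable into one string, using the string the method is called on as the separator
- Every element of the iterable must itself be a string
- Returns a new string
'string'.join(iterable, /) -> str
str.join(self, iterable, /) -> str
4.1 Examples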
lst = ['1', '2', '3']
print('\"'.join(lst)) # 1"2"3
print('+'.join(lst)) # 1+2+3
a = ' '.join(lst)
print(a, type(a)) # 1 2 3 <class 'str'>
print(str.join('+', lst)) # 1+2+3
lst = ['1', 'a', 'b', '3']
print("".join(lst)) # 1ab3
lst = ['1',['a', 'b'], '3']
print("".join(lst)) # TypeError: sequence item 1: expected str instance, list found
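Since join accepts only string elements, non-string items have to be converted first. The following is a small sketch of two common workarounds; the variable names are illustrative, and flattening one level of nesting is just one possible way to handle the list that raised the TypeError above.

```python
nums = [1, 2, 3]
joined = ','.join(map(str, nums))        # '1,2,3': convert each item with str()

nested = ['1', ['a', 'b'], '3']
# Join inner lists first, so every item handed to the outer join is a str
flat = ''.join(x if isinstance(x, str) else ''.join(x) for x in nested)  # '1ab3'
```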
5. Concatenating strings with +
- Joins two strings together
- Returns a new string
- + -> str
'123' + 'abc' # '123abc'
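A short sketch of two consequences of the rules above: + never changes its operands, and it only works between strings. The names are illustrative.

```python
a = 'abc'
b = a + 'd'                  # 'abcd'; a itself is still 'abc'

age = 18
try:
    'age: ' + age            # str + int is not allowed
    failed = False
except TypeError:
    failed = True

label = 'age: ' + str(age)   # 'age: 18' after an explicit conversion
```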
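6. Splitting strings
6.1 split
Splits a string into several substrings at a separator and returns them as a list.
6.1.1 split
split(sep=None, maxsplit=-1) -> list of strings
1. Scans from left to right
2. sep is the separator string; by default, runs of whitespace act as the separator
3. maxsplit is the maximum number of splits; -1 means no limit, so the whole string is processed
# Note the difference between the default separator (any whitespace) and a single space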
str1 = "I'm \ta super student."
print(str1) # I'm a super student.
str1.split() # ["I'm", 'a', 'super', 'student.']
str1.split('s') # ["I'm \ta ", 'uper ', 'tudent.']
str1.split('super') # ["I'm \ta ", ' student.']
str1.split(' ') # ["I'm", '\ta', 'super', 'student.']
str1.split(' ', maxsplit=1) # ["I'm", '\ta super student.']
str1.split('\t', maxsplit=2) # ["I'm ", 'a super student.']
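The difference between the default separator and an explicit single space is easiest to see with consecutive spaces and with an empty string. This is a minimal sketch; the sample strings are made up for illustration.

```python
line = 'a  b   c'                              # two spaces, then three
by_whitespace = line.split()                   # ['a', 'b', 'c']
by_space = line.split(' ')                     # ['a', '', 'b', '', '', 'c']

empty_default = ''.split()                     # []
empty_space = ''.split(' ')                    # ['']

pair = 'key=value=more'.split('=', maxsplit=1) # ['key', 'value=more']
```

With an explicit separator, every occurrence produces a split, so adjacent separators yield empty strings.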
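6.1.2 rsplit
rsplit(sep=None, maxsplit=-1) -> list of strings
1. Scans from right to left
2. sep is the separator string; by default, runs of whitespace act as the separator
3. maxsplit is the maximum number of splits; -1 means no limit, so the whole string is processed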
str1 = "I'm \ta super student."
print(str1) # I'm a super student.
str1.rsplit() # ["I'm", 'a', 'super', 'student.']
str1.rsplit('s') # ["I'm \ta ", 'uper ', 'tudent.']
str1.rsplit('super') # ["I'm \ta ", ' student.']
str1.rsplit(' ') # ["I'm", '\ta', 'super', 'student.']
str1.rsplit(' ', maxsplit=1) # ["I'm \ta super", 'student.']
str1.rsplit('\t', maxsplit=2) # ["I'm ", 'a super student.']
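split and rsplit only differ once maxsplit limits the number of cuts. A sketch with a made-up file name shows the direction of the scan:

```python
name = 'archive.tar.gz'
from_left = name.split('.', maxsplit=1)    # ['archive', 'tar.gz']
from_right = name.rsplit('.', maxsplit=1)  # ['archive.tar', 'gz']
unlimited = name.rsplit('.')               # ['archive', 'tar', 'gz'], same as split('.')
```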
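6.1.3 splitlines
splitlines([keepends]) -> list of strings
1. Splits the string at line boundaries
2. keepends controls whether the line separators are kept in the results (they are dropped by default)
3. Line separators include \n, \r\n, \r, and others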
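The source gives no example for splitlines, so here is a minimal sketch with an illustrative string that mixes the three separators listed above.

```python
text = 'line1\nline2\r\nline3\rline4'
lines = text.splitlines()                  # ['line1', 'line2', 'line3', 'line4']
kept = text.splitlines(keepends=True)      # ['line1\n', 'line2\r\n', 'line3\r', 'line4']

# Unlike split('\n'), a trailing newline does not produce an empty last item
no_extra = 'a\n'.splitlines()              # ['a']
with_extra = 'a\n'.split('\n')             # ['a', '']
```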
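6.2 partition
Splits a string into two parts at a separator and returns a tuple of those two parts and the separator.
str.partition(self, sep, /) -> (head, sep, tail)
1. Scans from left to right and splits the string in two at the first occurrence of the separator
2. Returns a 3-tuple of head, separator, and tail
3. If the separator is not found, returns the whole string followed by two empty strings: (string, '', '')
4. sep is the separator string and must be given
str.rpartition(self, sep, /) -> (head, sep, tail)
1. Scans from right to left and splits the string in two at the last occurrence of the separator
2. Returns a 3-tuple of head, separator, and tail
3. If the separator is not found, returns two empty strings followed by the whole string: ('', '', string)
4. sep is the separator string and must be given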
s1 = "I am a super student."
s1.partition('s') # ('I am a ', 's', 'uper student.')
s1.partition('stu') # ('I am a super ', 'stu', 'dent.')
s1.partition(' ') # ('I', ' ', 'am a super student.')
s1.partition('abc') # ('I am a super student.', '', '')
s1.partition('.') # ('I am a super student', '.', '')
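The examples above only exercise partition. As a complement, this sketch shows rpartition on the same sentence, plus one illustrative path string that is not from the original text.

```python
s1 = "I am a super student."
last_s = s1.rpartition('s')        # ('I am a super ', 's', 'tudent.')
not_found = s1.rpartition('abc')   # ('', '', 'I am a super student.')

# Typical use: peel off the last component of a path-like string
head, sep, tail = '/usr/local/bin/python'.rpartition('/')
# head == '/usr/local/bin', sep == '/', tail == 'python'
```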
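7. String case
upper() # convert to all uppercase
isupper() # whether all cased characters are uppercase
lower() # convert to all lowercase
islower() # whether all cased characters are lowercase
swapcase() # swap uppercase and lowercase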
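A quick sketch of the case methods listed above, using an illustrative sample string. Note that isupper and islower look only at cased characters and need at least one of them.

```python
s = 'Hello World'
up = s.upper()                    # 'HELLO WORLD'
low = s.lower()                   # 'hello world'
swapped = s.swapcase()            # 'hELLO wORLD'

mixed = s.isupper()               # False
with_digits = 'ABC 123'.isupper() # True: digits and spaces are ignored
no_letters = '123'.isupper()      # False: there is no cased character at all
```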
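8. String layout
# width: the total output width
# fillchar: the padding character
title() -> str # capitalize the first letter of every word
capitalize() -> str # capitalize the first character and lowercase the rest
center(width [,fillchar]) -> str # center within width
zfill(width) -> str # right-align, padding with 0 on the left
ljust(width [,fillchar]) -> str # left-align
rjust(width [,fillchar]) -> str # right-align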
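The layout methods above are easier to compare side by side. This is a minimal sketch with made-up inputs; the default fill character is a space.

```python
titled = 'hello world'.title()         # 'Hello World'
capped = 'hello WORLD'.capitalize()    # 'Hello world'

centered = 'abc'.center(9, '*')        # '***abc***'
left = 'abc'.ljust(6, '-')             # 'abc---'
right = 'abc'.rjust(6)                 # '   abc'

padded = '42'.zfill(5)                 # '00042'
signed = '-42'.zfill(5)                # '-0042': zeros go after the sign
too_narrow = 'abcdef'.center(3)        # 'abcdef': never truncated
```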
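9. Modifying strings
9.1 replace
str.replace(self, old, new, count=-1, /) -> str
1. Replaces each match of old with new and returns a new string
2. count limits the number of replacements; if it is not given, every match is replaced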
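A small sketch of replace with and without count, on an illustrative host name. Matches are found left to right and do not overlap.

```python
s = 'www.example.com'
all_w = s.replace('w', 'W')        # 'WWW.example.com'
two_w = s.replace('w', 'W', 2)     # 'WWw.example.com'
pairs = s.replace('ww', 'x')       # 'xw.example.com'
# s is unchanged: replace always returns a new string
```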
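9.2 strip
str.strip(self, chars=None, /)
1. Removes from both ends of the string every character that appears in the character set chars
2. If chars is not given, whitespace is removed from both ends
3. lstrip strips only the left end
4. rstrip strips only the right end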
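The key point above is that chars is treated as a set of characters, not as a prefix or suffix. A minimal sketch with illustrative strings:

```python
s = '  \t hello world \n'
both = s.strip()                             # 'hello world'
left_only = s.lstrip()                       # 'hello world \n'
right_only = s.rstrip()                      # '  \t hello world'

# Every leading/trailing character found in 'w.moc' is removed,
# regardless of the order in which the characters are listed
core = 'www.example.com'.strip('w.moc')      # 'example'
```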
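9. Searching strings
9.1 find
S.find(sub[, start[, end]]) -> int
1. Searches within the half-open range [start, end)
2. Looks for the substring from left to right
3. Returns the index if found, or -1 if not found
S.rfind(sub[, start[, end]]) -> int
1. Searches within the half-open range [start, end)
2. Looks for the substring from right to left
3. Returns the index if found, or -1 if not found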
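A sketch of find and rfind on an illustrative sentence. The last two lines show that end is exclusive: the match must fit entirely before index end.

```python
s = 'I am very very very sorry'
first = s.find('very')             # 5
second = s.find('very', 6)         # 10: the search starts at index 6
last = s.rfind('very')             # 15
missing = s.find('xyz')            # -1

cut_off = s.find('very', 10, 13)   # -1: s[10:13] is only 'ver'
fits = s.find('very', 10, 14)      # 10: s[10:14] is 'very'
```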
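9.2 index
S.index(sub[, start[, end]]) -> int
1. Searches within the half-open range [start, end)
2. Looks for the substring from left to right
3. Returns the index if found, or raises ValueError if not found
S.rindex(sub[, start[, end]]) -> int
1. Searches within the half-open range [start, end)
2. Looks for the substring from right to left
3. Returns the index if found, or raises ValueError if not found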
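Because index raises instead of returning -1, callers usually guard it. The sketch below uses a hypothetical helper, safe_index, which is not part of str and is only one way to handle the exception; the in operator is the simpler choice when only presence matters.

```python
s = 'I am very very very sorry'

def safe_index(text, sub):
    """Hypothetical helper: like index(), but returns None when sub is missing."""
    try:
        return text.index(sub)
    except ValueError:
        return None

first = s.index('very')            # 5
last = s.rindex('very')            # 15
missing = safe_index(s, 'xyz')     # None
present = 'very' in s              # True
```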
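9.3 count
S.count(sub[, start[, end]]) -> int
1. Searches within the half-open range [start, end)
2. Looks for the substring from left to right
3. Counts the number of occurrences of the substring sub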
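A brief sketch of count with and without a range, on illustrative strings. Occurrences are counted without overlapping.

```python
s = 'I am very very very sorry'
total = s.count('very')            # 3
tail = s.count('very', 6)          # 2: the first 'very' starts before index 6
window = s.count('very', 6, 13)    # 0: no complete match inside s[6:13]
overlap = 'aaaa'.count('aa')       # 2, not 3: matches do not overlap
```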
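9.4 Summary of string searching
- The index and count methods both have O(n) time complexity
- They become slower as the amount of data grows
- len(string) returns the length of the string, that is, the number of characters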
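10. String predicates
S.startswith(prefix[, start[, end]]) -> bool
1. Checks within the half-open range [start, end)
2. Tests whether the string starts with prefix
S.endswith(suffix[, start[, end]]) -> bool
1. Checks within the half-open range [start, end)
2. Tests whether the string ends with suffix
str.isalnum(self, /) # whether the string consists only of letters and digits
str.isalpha(self, /) # whether the string consists only of letters
str.isdecimal(self, /) # whether the string contains only decimal digits
str.isdigit(self, /) # whether every character is a digit
str.isidentifier(self, /) # whether the string starts with a letter or underscore and otherwise contains only letters, digits, and underscores
str.islower(self, /) # whether all cased characters are lowercase
str.isupper(self, /) # whether all cased characters are uppercase
str.isspace(self, /) # whether the string contains only whitespace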
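A sketch of a few predicates whose behaviour is easy to misjudge. The inputs are illustrative; the superscript two is written as an escape so the difference between isdigit and isdecimal is visible.

```python
ends = 'hello.py'.endswith(('.py', '.pyw'))   # True: a tuple of suffixes is accepted
offset = 'hello'.startswith('ell', 1)         # True: the check begins at index 1
clipped = 'hello'.startswith('hel', 0, 2)     # False: only 'he' is inside [0, 2)

sup_digit = '\u00b2'.isdigit()                # True: superscript two counts as a digit
sup_decimal = '\u00b2'.isdecimal()            # False: it is not a decimal digit

alnum = 'abc123'.isalnum()                    # True
spaced = 'abc 123'.isalnum()                  # False: the space is neither letter nor digit
ident = '_name1'.isidentifier()               # True
bad_ident = '1name'.isidentifier()            # False: cannot start with a digit
blank = ' \t\n'.isspace()                     # True
```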
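11. String formatting
11.1 Introduction to formatting
- String formatting is a way of assembling strings into a desired output layout; it is more flexible and convenient
- join can only concatenate with a separator, and it requires an iterable whose elements are all strings
- + concatenation is reasonably convenient, but non-string values must be converted to strings before they can be concatenated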
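11.2 printf style
- Placeholders consist of % followed by a format character, for example %s and %d
- s calls str() and r calls repr(); every object can be converted by these two
- Modifiers can be inserted into a placeholder; for example, %03d prints in a field three characters wide, padded with leading zeros when the number is shorter
- format % values: the format string and the values to be formatted are separated by %
- values must be a single object, a tuple with as many items as the format string has placeholders, or a dictionary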
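The printf-style rules above come without examples, so here is a minimal sketch. The sample values are made up; the last line shows why a tuple that should be formatted as one value needs to be wrapped in another tuple.

```python
basic = '%s is %d years old' % ('tom', 18)     # 'tom is 18 years old'
padded = '%03d' % 7                            # '007'
reprs = '%r and %s' % ('a', 'a')               # "'a' and a"
by_name = '%(name)s:%(port)d' % {'name': 'web', 'port': 80}  # 'web:80'
rounded = '%.2f' % 3.14159                     # '3.14'
percent = '%5.1f%%' % 99.5                     # ' 99.5%': %% prints a literal percent sign

one_tuple = '%s' % ((1, 2),)                   # '(1, 2)'
```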
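11.3 Formatting with the format method (recommended)
"{} {xxx}".format(*args, **kwargs) -> str
1. args collects the variable positional arguments into a tuple
2. kwargs collects the variable keyword arguments into a dictionary
3. Curly braces mark placeholders
4. {} takes the positional arguments in order; {n} takes the positional argument at index n
5. {xxx} looks up the keyword argument with the matching name
6. {{ and }} print literal curly braces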
- Positional arguments: the placeholders in the format string are replaced by the positional arguments in order
"{}:{}".format('192.168.1.1', 8888)
# '192.168.1.1:8888'
- Keyword (named) arguments: positional arguments are matched by index, keyword arguments by name
"{server} {1}:{0}".format(8888, '192.168.1.1', server='Web Server Info:')
# 'Web Server Info: 192.168.1.1:8888'
- Element access
"{0[0]}.{0[1]}.{1[0]}".format(('Lee', 'JCFive'), ('KissDa', ))
# 'Lee.JCFive.KissDa'
- Attribute access
from collections import namedtuple
Point = namedtuple('Point', 'x y')
p = Point(4, 6)
"{{{0.x}, {0.y}}}".format(p)
# '{4, 6}'
- Alignment
- Number bases
- Floating-point numbers (note that a value can overflow the given width)
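The last three bullets have no examples of their own, so the following is a rough sketch of each, using made-up values. The final line illustrates the note about width: the field width is a minimum, so a longer value simply overflows it.

```python
# Alignment, optionally with a fill character in front of the align flag
right = '{:>8}'.format('abc')            # '     abc'
filled = '{:*^9}'.format('abc')          # '***abc***'

# Number bases; '#' adds the base prefix
bases = '{0:b} {0:o} {0:x} {0:#x}'.format(31)   # '11111 37 1f 0x1f'

# Floats: width 8, three decimals
fits = '{:8.3f}'.format(3.14159)         # '   3.142'
# Width 3 is too small, so the result is longer than 3 characters
overflow = '{:3.2f}'.format(12345.678)   # '12345.68'
```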
11.4 string format method
#To demonstrate, we insert the number 8 to set the available space for the value to 8 characters.
#Use "<" to left-align the value:
txt = "We have {:<8} chickens."
print(txt.format(49))
# We have 49       chickens.
#To demonstrate, we insert the number 8 to set the available space for the value to 8 characters.
#Use ">" to right-align the value:
txt = "We have {:>8} chickens."
print(txt.format(49))
# We have       49 chickens.
#To demonstrate, we insert the number 8 to set the available space for the value to 8 characters.
#Use "^" to center-align the value:
txt = "We have {:^8} chickens."
print(txt.format(49))
# We have    49    chickens.
#To demonstrate, we insert the number 8 to specify the available space for the value.
#Use "=" to place the plus/minus sign at the left most position:
txt = "The temperature is {:=8} degrees celsius."
print(txt.format(-5))
# The temperature is -      5 degrees celsius.
#Use "-" to always indicate if the number is negative (positive numbers are displayed without any sign):
txt = "The temperature is between {:-} and {:-} degrees celsius."
print(txt.format(-3, 7))
# The temperature is between -3 and 7 degrees celsius.
#Use " " (a space) to insert a space before positive numbers and a minus sign before negative numbers:
txt = "The temperature is between {: } and {: } degrees celsius."
print(txt.format(-3, 7))
# The temperature is between -3 and  7 degrees celsius.
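For completeness, the remaining sign option is "+", which shows a sign on positive and negative numbers alike. This sketch reuses the sentence from the examples above; the variable name result is illustrative.

```python
# Use "+" to always show the sign, for positive and negative numbers
txt = "The temperature is between {:+} and {:+} degrees celsius."
result = txt.format(-3, 7)
print(result)
# The temperature is between -3 and +7 degrees celsius.
```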
#Use "," to add a comma as a thousand separator:
txt = "The universe is {:,} years old."
print(txt.format(13800000000))
# The universe is 13,800,000,000 years old.
#Use "_" to add an underscore character as a thousand separator:
txt = "The universe is {:_} years old."
print(txt.format(13800000000))
# The universe is 13_800_000_000 years old.
#Use "b" to convert the number into binary format:
txt = "The binary version of {0} is {0:b}"
print(txt.format(5))
# The binary version of 5 is 101
#Use "d" to convert a number, in this case a binary number, into decimal number format:
txt = "We have {:d} chickens."
print(txt.format(0b101))
# We have 5 chickens.
#Use "e" to convert a number into scientific number format (with a lower-case e):
txt = "We have {:e} chickens."
print(txt.format(5))
# We have 5.000000e+00 chickens.
#Use "E" to convert a number into scientific number format (with an upper-case E):
txt = "We have {:E} chickens."
print(txt.format(5))
# We have 5.000000E+00 chickens.
#Use "f" to convert a number into a fixed point number, default with 6 decimals, but use a period followed by a number to specify the number of decimals:
txt = "The price is {:.2f} dollars."
print(txt.format(45))
# The price is 45.00 dollars.
#without the ".2" inside the placeholder, this number will be displayed like this:
txt = "The price is {:f} dollars."
print(txt.format(45))
# The price is 45.000000 dollars.
#Use "F" to convert a number into a fixed point number, but display inf and nan as INF and NAN:
x = float('inf')
txt = "The price is {:F} dollars."
print(txt.format(x))
# The price is INF dollars.
#same example, but with a lower case f:
txt = "The price is {:f} dollars."
print(txt.format(x))
# The price is inf dollars.
#Use "o" to convert the number into octal format:
txt = "The octal version of {0} is {0:o}"
print(txt.format(10))
# The octal version of 10 is 12
#Use "x" to convert the number into Hex format:
txt = "The Hexadecimal version of {0} is {0:x}"
print(txt.format(255))
# The Hexadecimal version of 255 is ff
#Use "X" to convert the number into upper-case Hex format:
txt = "The Hexadecimal version of {0} is {0:X}"
print(txt.format(255))
# The Hexadecimal version of 255 is FF
#Use "%" to convert the number into a percentage format:
txt = "You scored {:%}"
print(txt.format(0.25)) # You scored 25.000000%
#Or, without any decimals:
txt = "You scored {:.0%}"
print(txt.format(0.25)) # You scored 25%