Python Strings (str) Explained in Depth, in Plain Terms

Lee木木

Last modified 2022-04-22 11:05:22


First published 2021-02-14 17:29:52

Article link: https://blog.youkuaiyun.com/weixin_44983653/article/details/113808027


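This article walks through Python string operations: definition, initialization, element access, joining, splitting, case conversion, layout, searching, and formatting. It stresses that strings are immutable and are Unicode, and it recommends formatting strings with the `format` function.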


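1. String definition

An ordered sequence of individual characters
A character sequence enclosed in single, double, or triple quotes
Strings are immutable objects
Since Python 3, strings are Unicode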
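
The two properties above, immutability and Unicode, can be checked directly. The following is a minimal sketch; the variable names and the sample word are made up for illustration.

```python
s = 'h\u00e9llo'          # 'héllo': five characters, one of them non-ASCII

try:
    s[0] = 'H'            # strings do not support item assignment
except TypeError as exc:
    error = exc           # keep the exception so it can be inspected

t = 'H' + s[1:]           # build a new string instead; s itself is unchanged

char_count = len(s)                   # 5, counted in characters
byte_count = len(s.encode('utf-8'))   # 6, because 'é' needs two bytes in UTF-8
```

Any operation that appears to modify a string, such as `+` or `replace`, actually returns a new string object.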

2. String initialization

print("Man")
# Man

print('\tname\t')
# 	name

print(str(1))
# 1

print("""This is a "String".""")
# This is a "String".

name = 'tom'
age = 18
f'{name}+++{age}'
# 'tom+++18'

3. Accessing string elements by index

Strings support access by index

str1 = 'abcde'
str1[0], str1[-1], str1[4], str1[-5]  # ('a', 'e', 'e', 'a')

A string is an ordered sequence of characters, and each element is itself a str

str1 = 'abc'
for i in str1:
    print(i, type(i))

a <class 'str'>
b <class 'str'>
c <class 'str'>

Strings are iterable

list1 = list('abc')
list1  # ['a', 'b', 'c']

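4. Joining strings with `join`

Joins the elements of an iterable, using the string as the separator
Every element of the iterable must itself be a string

Returns a new string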

'string'.join(iterable, /)   ->  str
str.join(self, iterable, /)  ->  str

4.1 Examples

lst = ['1', '2', '3']
print('\"'.join(lst))      # 1"2"3
print('+'.join(lst))       # 1+2+3
a = ' '.join(lst)
print(a, type(a))          # 1 2 3 <class 'str'>
print(str.join('+', lst))  # 1+2+3
lst = ['1', 'a', 'b', '3']
print("".join(lst))        # 1ab3
lst = ['1',['a', 'b'], '3']
print("".join(lst))        # TypeError: sequence item 1: expected str instance, list found

5. Concatenating strings with `+`

Joins two strings together
Returns a new string
+ -> str
```
'123' + 'abc'   # '123abc'
```

6. Splitting strings

6.1 Splitting with `split`

Splits a string on a separator into several strings and returns them as a list.

6.1.1 `split`

split(sep=None, maxsplit=-1)  ->  list of strings
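  1. Scans from left to right
  2. sep is the separator string; by default, runs of whitespace act as the separator
  3. maxsplit is the maximum number of splits; -1 means no limit

# Note the difference between the default (any run of whitespace) and a single space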
str1 = "I'm \ta super student." 
print(str1)  # I'm 	a super student.
str1.split()  # ["I'm", 'a', 'super', 'student.']
str1.split('s')  # ["I'm \ta ", 'uper ', 'tudent.']
str1.split('super')  # ["I'm \ta ", ' student.']
str1.split(' ')  # ["I'm", '\ta', 'super', 'student.']
str1.split(' ', maxsplit=1)  # ["I'm", '\ta super student.']
str1.split('\t', maxsplit=2)  # ["I'm ", 'a super student.']

6.1.2 `rsplit`

rsplit(sep=None, maxsplit=-1)  ->  list of strings
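  1. Scans from right to left
  2. sep is the separator string; by default, runs of whitespace act as the separator
  3. maxsplit is the maximum number of splits; -1 means no limit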

str1 = "I'm \ta super student." 
print(str1)  # I'm 	a super student.
str1.rsplit()  # ["I'm", 'a', 'super', 'student.']
str1.rsplit('s')  # ["I'm \ta ", 'uper ', 'tudent.']
str1.rsplit('super')  # ["I'm \ta ", ' student.']
str1.rsplit(' ')  # ["I'm", '\ta', 'super', 'student.']
str1.rsplit(' ', maxsplit=1)  # ["I'm \ta super", 'student.']
str1.rsplit('\t', maxsplit=2)  # ["I'm ", 'a super student.']

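6.1.3 `splitlines`

splitlines(keepends=False)  ->  list of strings
  1. Splits the string at line boundaries
  2. keepends controls whether the line separators are kept in the results
  3. Line separators include \n, \r\n, \r, and others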
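
As a small illustration of the rules above, the sketch below splits a made-up string that mixes the three common line separators. The variable names are only for this example.

```python
text = 'line one\nline two\r\nline three\rline four'

lines = text.splitlines()
# ['line one', 'line two', 'line three', 'line four']

kept = text.splitlines(True)   # keepends=True keeps each separator
# ['line one\n', 'line two\r\n', 'line three\r', 'line four']

# Unlike split('\n'), splitlines() does not produce a trailing empty string
with_splitlines = 'a\nb\n'.splitlines()   # ['a', 'b']
with_split = 'a\nb\n'.split('\n')         # ['a', 'b', '']
```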

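6.2 Splitting with `partition`

Splits the string into two parts at the separator and returns a tuple of the two parts and the separator.

str.partition(self, sep, /)  ->  (head, sep, tail)
  1. Scans from left to right and splits at the first occurrence of the separator
  2. Returns a 3-tuple of head, separator, and tail
  3. If the separator is not found, returns the original string followed by two empty strings
  4. sep is the separator string and must be given

str.rpartition(self, sep, /)  ->  (head, sep, tail)
  1. Scans from right to left and splits at the last occurrence of the separator
  2. Returns a 3-tuple of head, separator, and tail
  3. If the separator is not found, returns two empty strings followed by the original string
  4. sep is the separator string and must be given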

s1 = "I am a super student."
s1.partition('s')  # ('I am a ', 's', 'uper student.')
s1.partition('stu')  # ('I am a super ', 'stu', 'dent.')
s1.partition(' ')  # ('I', ' ', 'am a super student.')
s1.partition('abc')  # ('I am a super student.', '', '')
s1.partition('.')  # ('I am a super student', '.', '')
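
The examples above only use `partition`. For comparison, here is a short sketch of `rpartition` on the same string; the file name at the end is a made-up value.

```python
s1 = "I am a super student."

last_s = s1.rpartition('s')        # ('I am a super ', 's', 'tudent.')
last_space = s1.rpartition(' ')    # ('I am a super', ' ', 'student.')
missing = s1.rpartition('abc')     # ('', '', 'I am a super student.')

# A typical use: split off the final extension of a file name
name, dot, ext = 'archive.tar.gz'.rpartition('.')
# name == 'archive.tar', dot == '.', ext == 'gz'
```

When the separator is missing, the whole string lands in the last slot for `rpartition` and in the first slot for `partition`.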

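7. String case

upper()  # convert to all uppercase
isupper()  # whether the cased characters are all uppercase
lower()  # convert to all lowercase
islower()  # whether the cased characters are all lowercase
swapcase()  # swap uppercase and lowercase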
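
A quick sketch of these methods on a sample string (the string and variable names are arbitrary):

```python
s = 'Hello World'

upper = s.upper()        # 'HELLO WORLD'
lower = s.lower()        # 'hello world'
swapped = s.swapcase()   # 'hELLO wORLD'

mixed_is_upper = s.isupper()       # False, the string mixes cases
upper_is_upper = upper.isupper()   # True
lower_is_lower = lower.islower()   # True

# With no cased characters at all, both checks return False
digits_is_upper = '123'.isupper()  # False
digits_is_lower = '123'.islower()  # False
```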

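8. String layout

# width: total output width
# fillchar: the padding character
title() ->  str  # capitalize the first letter of every word
capitalize() ->  str  # uppercase the first character, lowercase the rest
center(width [,fillchar]) ->  str  # centered
zfill(width) ->  str  # right-aligned, padded on the left with 0
ljust(width [,fillchar]) ->  str  # left-aligned
rjust(width [,fillchar]) ->  str  # right-aligned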
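
The layout methods can be tried out as follows. This is only a sketch with made-up inputs; the fill characters were chosen so that the padding is visible.

```python
titled = 'hello python world'.title()      # 'Hello Python World'
capitalized = 'hELLO world'.capitalize()   # 'Hello world'

centered = 'abc'.center(9, '*')   # '***abc***'
left = 'abc'.ljust(6, '-')        # 'abc---'
right = 'abc'.rjust(6, '-')       # '---abc'

zero_padded = '42'.zfill(5)       # '00042'
signed = '-42'.zfill(5)           # '-0042', the sign stays in front

# A string already wider than width is returned unchanged
unchanged = 'abcdef'.center(3)    # 'abcdef'
```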

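9. Modifying strings

9.1 `replace`

str.replace(self, old, new, count=-1, /) -> str
  1. Finds matches of old in the string, replaces them with new, and returns a new string
  2. count is the maximum number of replacements; if omitted, all matches are replaced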
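
A minimal sketch of `replace`, using a made-up host name as input:

```python
s = 'www.example.com'

all_replaced = s.replace('w', 'W')       # 'WWW.example.com'
first_two = s.replace('w', 'W', 2)       # 'WWw.example.com'
whole_word = s.replace('www', 'ftp')     # 'ftp.example.com'
no_match = s.replace('xyz', '!')         # 'www.example.com', nothing to replace

# The original string is never modified
original = s                             # still 'www.example.com'
```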

9.2 `strip`

str.strip(self, chars=None, /)
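  1. Removes every leading and trailing character that appears in the character set chars
  2. If chars is not given, strips whitespace from both ends
  3. lstrip strips only the left end
  4. rstrip strips only the right end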
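
The sketch below shows the three variants on a made-up string. The last line is worth a look: chars is treated as a set of characters, not as a prefix or suffix.

```python
s = '  \t hello world \n'

both = s.strip()     # 'hello world'
left = s.lstrip()    # 'hello world \n'
right = s.rstrip()   # '  \t hello world'

# Every leading/trailing character found in 'w.com' is removed,
# so stripping stops at the first character outside that set ('e')
by_set = 'www.example.com'.strip('w.com')   # 'example'
```

To remove an exact prefix or suffix rather than a set of characters, `str.removeprefix` and `str.removesuffix` (Python 3.9+) are the better fit.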

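9. Searching strings

9.1 `find`

S.find(sub[, start[, end]]) -> int
  1. Searches within the half-open range [start, end)
  2. Scans from left to right for the substring
  3. Returns the index if found, or -1 if not found
S.rfind(sub[, start[, end]]) -> int
  1. Searches within the half-open range [start, end)
  2. Scans from right to left for the substring
  3. Returns the index if found, or -1 if not found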
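
A short sketch of `find` and `rfind`; the sample sentence is reused from the partition examples, and the variable names are arbitrary. In this string the two 's' characters sit at indexes 7 and 13.

```python
s = 'I am a super student.'

first = s.find('s')            # 7, the 's' in 'super'
last = s.rfind('s')            # 13, the 's' in 'student'
after_8 = s.find('s', 8)       # 13, search starts at index 8
excluded = s.find('s', 8, 13)  # -1, index 13 itself is outside [8, 13)
included = s.find('s', 8, 14)  # 13
missing = s.find('xyz')        # -1
```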

9.2 `index`

S.index(sub[, start[, end]]) -> int
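  1. Searches within the half-open range [start, end)
  2. Scans from left to right for the substring
  3. Returns the index if found, or raises ValueError if not found
S.rindex(sub[, start[, end]]) -> int
  1. Searches within the half-open range [start, end)
  2. Scans from right to left for the substring
  3. Returns the index if found, or raises ValueError if not found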
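
`index` behaves like `find` except for the failure case, which the following sketch makes explicit. The sample string and names are illustrative only.

```python
s = 'I am a super student.'

pos = s.index('super')     # 7
last_t = s.rindex('t')     # 19, the final 't' in 'student'

try:
    s.index('xyz')         # substring is absent
    found = True
except ValueError:
    found = False          # index raises instead of returning -1
```

Because -1 is also a valid index in Python, the exception from `index` can be safer than the return value of `find` when the result is used to slice the string.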

9.3 `count`

S.count(sub[, start[, end]]) -> int
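  1. Searches within the half-open range [start, end)
  2. Scans from left to right for the substring
  3. Counts the non-overlapping occurrences of the substring sub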
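
A brief sketch of `count`, again on the sample sentence; the last line shows that matches are counted without overlap.

```python
s = 'I am a super student.'

total_s = s.count('s')           # 2
from_8 = s.count('s', 8)         # 1, only the 's' in 'student'
in_range = s.count('s', 8, 13)   # 0, index 13 is outside [8, 13)
total_t = s.count('t')           # 2

non_overlapping = 'aaaa'.count('aa')   # 2, not 3
```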

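9.4 Summary of string searching

find, index, and count all scan the string, so each runs in O(n) time
They get slower as the string grows
len(string) returns the length of the string, that is, the number of characters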

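10. String predicates

S.startswith(prefix[, start[, end]]) -> bool
	1. Checks within the half-open range [start, end)
	2. Tests whether the string starts with prefix
S.endswith(suffix[, start[, end]]) -> bool
	1. Checks within the half-open range [start, end)
	2. Tests whether the string ends with suffix

str.isalnum(self, /)  # whether it consists only of letters and digits
str.isalpha(self, /)  # whether it consists only of letters
str.isdecimal(self, /)  # whether it contains only decimal digits
str.isdigit(self, /)  # whether every character is a digit
str.isidentifier(self, /)  # whether it starts with a letter or underscore and the rest are letters, digits, or underscores
str.islower(self, /)  # whether all cased characters are lowercase
str.isupper(self, /)  # whether all cased characters are uppercase
str.isspace(self, /)  # whether it contains only whitespace characters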
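
The predicates can be sketched as follows. The file name and identifiers are made-up inputs, and `'\u00b2'` (superscript two) is included to show one case where `isdigit` and `isdecimal` disagree.

```python
name = 'hello.py'

starts = name.startswith('hello')            # True
ends = name.endswith('.py')                  # True
starts_at_2 = name.startswith('llo', 2)      # True, checks name[2:]
ends_in_range = name.endswith('hello', 0, 5) # True, checks name[0:5]
any_suffix = name.endswith(('.py', '.txt'))  # True, a tuple of suffixes is accepted

alnum = 'abc123'.isalnum()        # True
alpha = 'abc123'.isalpha()        # False, it contains digits
decimal = '123'.isdecimal()       # True
sup_digit = '\u00b2'.isdigit()    # True
sup_decimal = '\u00b2'.isdecimal()  # False

good_id = '_name1'.isidentifier() # True
bad_id = '1name'.isidentifier()   # False, starts with a digit
space = ' \t\n'.isspace()         # True
empty_space = ''.isspace()        # False for the empty string
```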

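11. String formatting

11.1 Introduction to formatting

String formatting is a way to build output strings in a given style, and it is more flexible and convenient than concatenation
join can only insert a separator, and it requires an iterable whose elements are all strings
Concatenation with + is convenient enough, but non-string values must be converted to strings first

11.2 `printf style` formatting

Placeholders: a % followed by a format character, such as %s or %d
s calls str() and r calls repr(); every object can be converted by these two
Modifiers can be inserted into a placeholder; for example, %03d prints with a minimum width of 3, zero-padded on the left
format % values: the format string and the values to be formatted are separated by %
values must be a single object, a tuple with as many items as the format string has placeholders, or a dictionary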
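
The rules for printf-style formatting can be sketched like this. All values are invented for illustration.

```python
single = '%03d' % 7                               # '007'
pair = '%s is %d years old' % ('tom', 18)         # 'tom is 18 years old'
with_repr = '%r' % 'tom'                          # "'tom'", repr() keeps the quotes
from_dict = '%(name)s:%(port)d' % {'name': 'web', 'port': 8888}  # 'web:8888'
rounded = '%.2f' % 3.14159                        # '3.14'
left_aligned = '%-6s|' % 'ab'                     # 'ab    |'
percent = '%d%%' % 50                             # '50%', %% prints a literal %

# To format a tuple as one value, wrap it in a one-element tuple
tuple_value = '%s' % ((1, 2),)                    # '(1, 2)'
```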

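11.3 String formatting with the `format` function (recommended)

"{} {xxx}".format(*args, **kwargs) -> str
  1. args collects the positional arguments as a tuple
  2. kwargs collects the keyword arguments as a dictionary
  3. Curly braces mark a placeholder
  4. {} matches positional arguments in order; {n} takes the positional argument at index n
  5. {xxx} looks up the keyword argument with the same name
  6. {{ and }} print literal curly braces

Positional arguments: the placeholders in the format string are replaced by the positional arguments in order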
```
"{}:{}".format('192.168.1.1', 8888)
```
```
'192.168.1.1:8888'
```

Keyword (named) arguments: positional arguments are matched by index, keyword arguments by name

"{server} {1}:{0}".format(8888, '192.168.1.1', server='Web Server Info:')

'Web Server Info: 192.168.1.1:8888'

Accessing elements

"{0[0]}.{0[1]}.{1[0]}".format(('Lee', 'JCFive'), ('KissDa', ))

'Lee.JCFive.KissDa'

Accessing object attributes

from collections import namedtuple
Point = namedtuple('Point', 'x y')
p = Point(4, 6)
"{{{0.x}, {0.y}}}".format(p)

'{4, 6}'

Alignment
Number bases
Floating-point numbers (note that a value wider than the field simply overflows the width)
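
The three topics above can be sketched in a few lines. The inputs are arbitrary, and the last line shows the width being exceeded rather than the value being truncated.

```python
# Alignment: <, >, ^ with an optional fill character in front
right = '{:>8}'.format('abc')        # '     abc'
centered = '{:*^9}'.format('abc')    # '***abc***'

# Number bases: decimal, binary, octal, hex, and hex with the 0x prefix
bases = '{0:d} {0:b} {0:o} {0:x} {0:#x}'.format(255)
# '255 11111111 377 ff 0xff'

# Floats: width 8, three digits after the decimal point
pi = '{:8.3f}'.format(3.14159)       # '   3.142'

# The width is a minimum: a longer value overflows it
overflow = '{:3.1f}'.format(12345.678)   # '12345.7'
```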

11.4 `string format method`

#To demonstrate, we insert the number 8 to set the available space for the value to 8 characters.

#Use "<" to left-align the value:

txt = "We have {:<8} chickens."
print(txt.format(49))
# We have 49       chickens.

#To demonstrate, we insert the number 8 to set the available space for the value to 8 characters.

#Use ">" to right-align the value:

txt = "We have {:>8} chickens."
print(txt.format(49))
# We have       49 chickens.

#To demonstrate, we insert the number 8 to set the available space for the value to 8 characters.

#Use "^" to center-align the value:

txt = "We have {:^8} chickens."
print(txt.format(49))
# We have    49    chickens.

#To demonstrate, we insert the number 8 to specify the available space for the value.

#Use "=" to place the plus/minus sign at the left most position:

txt = "The temperature is {:=8} degrees celsius."

print(txt.format(-5))
# The temperature is -      5 degrees celsius.

#Use "-" to always indicate if the number is negative (positive numbers are displayed without any sign):

txt = "The temperature is between {:-} and {:-} degrees celsius."

print(txt.format(-3, 7))
# The temperature is between -3 and 7 degrees celsius.

#Use " " (a space) to insert a space before positive numbers and a minus sign before negative numbers:

txt = "The temperature is between {: } and {: } degrees celsius."

print(txt.format(-3, 7))
# The temperature is between -3 and  7 degrees celsius.

#Use "," to add a comma as a thousand separator:

txt = "The universe is {:,} years old."

print(txt.format(13800000000))
# The universe is 13,800,000,000 years old.

#Use "_" to add an underscore character as a thousand separator:

txt = "The universe is {:_} years old."

print(txt.format(13800000000))
# The universe is 13_800_000_000 years old.

#Use "b" to convert the number into binary format:

txt = "The binary version of {0} is {0:b}"

print(txt.format(5))
# The binary version of 5 is 101

#Use "d" to convert a number, in this case a binary number, into decimal number format:

txt = "We have {:d} chickens."
print(txt.format(0b101))
# We have 5 chickens.

#Use "e" to convert a number into scientific number format (with a lower-case e):

txt = "We have {:e} chickens."
print(txt.format(5))
# We have 5.000000e+00 chickens.

#Use "E" to convert a number into scientific number format (with an upper-case E):

txt = "We have {:E} chickens."
print(txt.format(5))
# We have 5.000000E+00 chickens.

#Use "f" to convert a number into a fixed point number, default with 6 decimals, but use a period followed by a number to specify the number of decimals:

txt = "The price is {:.2f} dollars."
print(txt.format(45))
# The price is 45.00 dollars.

#without the ".2" inside the placeholder, this number will be displayed like this:

txt = "The price is {:f} dollars."
print(txt.format(45))
# The price is 45.000000 dollars.

#Use "F" to convert a number into a fixed point number, but display inf and nan as INF and NAN:

x = float('inf')

txt = "The price is {:F} dollars."
print(txt.format(x))
# The price is INF dollars.

#same example, but with a lower case f:

txt = "The price is {:f} dollars."
print(txt.format(x))
# The price is inf dollars.

#Use "o" to convert the number into octal format:

txt = "The octal version of {0} is {0:o}"

print(txt.format(10))
# The octal version of 10 is 12

#Use "x" to convert the number into Hex format:

txt = "The Hexadecimal version of {0} is {0:x}"

print(txt.format(255))
# The Hexadecimal version of 255 is ff

#Use "X" to convert the number into upper-case Hex format:

txt = "The Hexadecimal version of {0} is {0:X}"

print(txt.format(255))  
# The Hexadecimal version of 255 is FF

#Use "%" to convert the number into a percentage format:

txt = "You scored {:%}"
print(txt.format(0.25)) # You scored 25.000000%

#Or, without any decimals:

txt = "You scored {:.0%}"
print(txt.format(0.25))  # You scored 25%