Python Full-Stack Development "66. Converting Between Data Types: Converting Strings and bytes Through Encoding and Decoding"... - 优快云 Blog

Article link: https://blog.youkuaiyun.com/weixin_41033105/article/details/143971307

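1) bytes is a binary data stream, that is, a sequence of bytes.
2) It is a special kind of string. (It looks almost identical to a string and supports almost all of the string built-in methods, so you can work with bytes much as you would with a string. It differs from a string only slightly in how it is written.)
3) Putting a b prefix in front of a string literal makes it a bytes value.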

bt = b'my name is dewei'
print(type(bt))

Output:

<class 'bytes'>

Code

Example 1:

# coding:utf-8

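a = 'hello xiaomu'
print(a, type(a))

b = b'hello xiaomu'
print(b, type(b))

# bytes supports most str methods; their arguments must also be bytes
print(b.capitalize())
print(b.replace(b'xiaomu', b'dewei'))
print(b[:3])          # slicing returns bytes
print(b.find(b'x'))   # index of the first match

# list every attribute and method of the bytes object
print(dir(b))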

Output:

/Users/llq/PycharmProjects/pythonlearn/pythonlearn/change/bin/python /Users/llq/PycharmProjects/pythonlearn/change/change_str_bytes.py 
hello xiaomu <class 'str'>
b'hello xiaomu' <class 'bytes'>
b'Hello xiaomu'
b'hello dewei'
b'hel'
6
['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'center', 'count', 'decode', 'endswith', 'expandtabs', 'find', 'fromhex', 'hex', 'index', 'isalnum', 'isalpha', 'isascii', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']

Process finished with exit code 0

The dir function lists the attributes and methods (functions) available on a variable's data type.
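As a rough illustration of how dir can be used to compare the two types, the sketch below collects the public method names of str and bytes and intersects them. The variable names are illustrative and do not come from the original example.

```python
# Compare the public methods of str and bytes using dir().
str_methods = {name for name in dir(str) if not name.startswith('_')}
bytes_methods = {name for name in dir(bytes) if not name.startswith('_')}

shared = sorted(str_methods & bytes_methods)      # available on both types
only_bytes = sorted(bytes_methods - str_methods)  # includes decode
only_str = sorted(str_methods - bytes_methods)    # includes encode

print(len(shared), 'methods are shared, for example:', shared[:5])
print('bytes only:', only_bytes)
print('str only:', only_str)
```

The exact lists depend on the Python version, but most method names should appear in the shared set.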

Example 2:

b = b'hello xiaomu'
print(b[3])

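Output:

108

bytes is a binary data stream, so indexing it does not return a character. Each index refers to a single byte, and Python returns that byte as an integer from 0 to 255. Here b[3] is the byte for the letter l, whose ASCII code is 108.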
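The difference between indexing and slicing a bytes value can be seen in a short sketch. The variable names byte_value and one_byte are made up for this illustration.

```python
b = b'hello xiaomu'

byte_value = b[3]       # indexing returns an int in the range 0..255
one_byte = b[3:4]       # slicing returns a bytes object of length 1

print(byte_value, type(byte_value))
print(one_byte, type(one_byte))
print(chr(byte_value))  # map the ASCII code back to a character
print(list(b[:3]))      # integer values of the first three bytes
```

If you want a one-byte bytes object rather than a number, slicing is usually the simpler choice.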

Example 3:

c = b'hello 小慕'
print(c)

Output:

/Users/llq/PycharmProjects/pythonlearn/pythonlearn/change/bin/python /Users/llq/PycharmProjects/pythonlearn/change/change_str_bytes.py 
  File "/Users/llq/PycharmProjects/pythonlearn/change/change_str_bytes.py", line 16
    c = b'hello 小慕'
        ^
SyntaxError: bytes can only contain ASCII literal characters.

Process finished with exit code 1

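A bytes literal may contain only ASCII characters, so in practice only English text can be typed directly inside b'...'. The restriction applies to the literal syntax: a bytes value itself can hold any byte from 0 to 255, and non-ASCII bytes are written as escapes such as \xe5 or produced by encoding a string.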
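As a minimal sketch of how non-ASCII bytes can still be written by hand, the snippet below uses hex escapes and the bytes constructor. The byte values are the UTF-8 bytes of 小慕 as printed later in this article; the names tail and same are illustrative.

```python
# Hex escapes are plain ASCII text, so they are legal in a bytes literal.
c = b'hello \xe5\xb0\x8f\xe6\x85\x95'

# The same value built from integers (each must be in the range 0..255).
tail = bytes([0xe5, 0xb0, 0x8f, 0xe6, 0x85, 0x95])
same = b'hello ' + tail

print(c == same)
print(c.decode('utf-8'))
```

Writing escapes by hand is error-prone, so encoding a string is normally the more readable route.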

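2. Converting a string to bytes: encode

2.1 Purpose

encode literally means "to encode". It is a built-in method of strings.

It converts a string to the bytes type.

2.2 Usage

string.encode(encoding='utf-8', errors='strict')

string: the string to convert to bytes.

encoding: the encoding standard to use. Defaults to utf-8.

errors: the error handling policy. The default, strict, raises an error as soon as a character cannot be encoded; ignore skips the characters that cannot be encoded.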
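The effect of the errors parameter is easiest to see with an encoding that cannot represent the text. The sketch below encodes a string containing Chinese as ascii; the replace handler is not mentioned above and is included only for comparison.

```python
text = 'hello 小慕'

# strict (the default): raise UnicodeEncodeError on unencodable characters
try:
    text.encode('ascii')
except UnicodeEncodeError as exc:
    print('strict failed:', exc.reason)

ignored = text.encode('ascii', errors='ignore')    # drop what cannot be encoded
replaced = text.encode('ascii', errors='replace')  # substitute b'?' instead
print(ignored)
print(replaced)
```

Note that ignore silently loses data, so it is probably best reserved for cases where the dropped characters truly do not matter.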

2.3 Code

str_data = 'my name is dewei'
byte_data = str_data.encode('utf-8')
print(byte_data)

Output:

b'my name is dewei'

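3. Converting bytes to a string: decode

3.1 Purpose

decode means "to decode".

It converts a bytes value to a string.

Strings have no decode method; it exists only on the bytes type.

Likewise, bytes has no encode method; it exists only on strings.

3.2 Usage

bytes.decode(encoding='utf-8', errors='strict')

bytes: the bytes value to convert to a string.

encoding: the encoding standard to decode with. Defaults to utf-8.

errors: the error handling policy. Defaults to strict, as with encode.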
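Decoding can also fail, for example when the data was cut off in the middle of a multi-byte character. The following sketch truncates UTF-8 data by one byte to provoke that case; the names truncated, kept, and marked are invented for the illustration.

```python
data = 'hello 小慕'.encode('utf-8')
truncated = data[:-1]  # cut the last byte, leaving an incomplete character

try:
    truncated.decode('utf-8')  # errors='strict' is the default
except UnicodeDecodeError as exc:
    print('strict failed:', exc.reason)

kept = truncated.decode('utf-8', errors='ignore')     # drops the broken bytes
marked = truncated.decode('utf-8', errors='replace')  # inserts U+FFFD instead
print(kept)
print(marked)
```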

3.3 Code

byte_data = b'python is a good code'
str_data = byte_data.decode('utf-8')
print(str_data)

Output:

python is a good code

4. Code

c = 'hello 小慕'
d = c.encode('utf-8')
print(d,type(d))
print(d.decode('utf-8'))

Output:

/Users/llq/PycharmProjects/pythonlearn/pythonlearn/change/bin/python /Users/llq/PycharmProjects/pythonlearn/change/change_str_bytes.py 
b'hello \xe5\xb0\x8f\xe6\x85\x95' <class 'bytes'>
hello 小慕

Process finished with exit code 0

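A bytes value containing Chinese cannot be written directly with the b prefix. Instead, define a string that contains the Chinese text first, then convert it with encode.

In the output, the two characters 小慕 appear as escape sequences such as \xe5. This is not corrupted text: encode has converted each Chinese character into its UTF-8 byte sequence (three bytes per character here), and Python prints non-ASCII bytes as \x hex escapes. The value is now of type bytes.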
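To make the relationship between characters and bytes concrete, the sketch below compares lengths before and after encoding, reusing the variables c and d from the example above.

```python
c = 'hello 小慕'
d = c.encode('utf-8')

print(len(c))   # number of characters in the string
print(len(d))   # number of bytes after UTF-8 encoding
print(d.hex())  # the same bytes written as hexadecimal digits
print('小'.encode('utf-8'), '慕'.encode('utf-8'))
```

The six ASCII characters take one byte each, while each of the two Chinese characters takes three, which is why the byte count is larger than the character count.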

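Note: use the same encoding standard for encoding and decoding whenever possible. Decoding with a different encoding than the one used to encode either raises an error or yields garbled text.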
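A small sketch of what a mismatch looks like, assuming latin-1 and ascii as the "wrong" codecs purely for demonstration:

```python
text = 'hello 小慕'
data = text.encode('utf-8')

same_codec = data.decode('utf-8')    # round trip restores the text
mismatched = data.decode('latin-1')  # no error, but the result is garbled

print(same_codec == text)
print(mismatched == text)

try:
    data.decode('ascii')             # ascii cannot decode bytes above 127
except UnicodeDecodeError as exc:
    print('decode failed:', exc.reason)
```

The silent case is arguably the more dangerous one, because the garbled string can travel further through a program before anyone notices.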