Python Beginner Tutorial, From Getting Started to Getting Hooked: Lesson 7, String Encoding and Common String Operations (Syntax Basics)

Article link: https://blog.youkuaiyun.com/qq_64441210/article/details/142963435

I. String Encoding

1.1 A Quick Look at String Encoding

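Common encodings at a glance
Encoding	Introduced	Purpose	Bytes per character
ASCII	1967	English letters, digits, punctuation, and control characters	7 bits, stored in 1 byte
GB2312	1980	China's national Simplified Chinese character set, ASCII-compatible	2 bytes per Chinese character (1 byte for ASCII)
Unicode	1991	Unified character set maintained by international standards bodies	Depends on the encoding form (the early UCS-2 form used 2 bytes)
GBK	1995	Extension of GB2312 that adds Traditional Chinese characters, compatible with GB2312	2 bytes per Chinese character (1 byte for ASCII)
UTF-8	1992	Variable-length encoding of Unicode	1-4 bytes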
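The byte counts in the table can be checked directly in Python. The following is a minimal sketch; the helper name `encoded_length` and the sample characters are chosen for illustration and are not part of the original lesson.

```python
# Hypothetical helper for illustration: count the bytes a codec produces
# for a single character.
def encoded_length(ch: str, codec: str) -> int:
    return len(ch.encode(codec))

samples = ["A", "中"]
codecs = ["gb2312", "gbk", "utf-8"]

for ch in samples:
    for codec in codecs:
        print(ch, codec, encoded_length(ch, codec))

# A character outside the Basic Multilingual Plane needs 4 bytes in UTF-8.
print("😀", "utf-8", encoded_length("😀", "utf-8"))

# ASCII only covers code points 0-127, so it cannot encode Chinese text.
try:
    "中".encode("ascii")
except UnicodeEncodeError as exc:
    print("ascii cannot encode:", exc.object)
```

An ASCII letter takes 1 byte in all three codecs, while the Chinese character takes 2 bytes in GB2312 and GBK and 3 bytes in UTF-8.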

String encoding is essentially a one-to-one mapping between binary data and the characters of written language

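Fixed-width Unicode (for example UCS-2): every character takes 2 bytes

Advantage: converting between characters and numbers is somewhat faster

Disadvantage: takes up more space

UTF-8: precise, using a different number of bytes for different characters

Advantage: saves space

Disadvantage: converting between characters and numbers is slower, because the number of bytes each character needs must be worked out every time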
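To make the space versus speed trade-off concrete, here is a small sketch comparing a fixed-width encoding with UTF-8. It assumes UTF-32 (4 bytes per character) as the fixed-width example, since Python exposes it as the `utf-32-le` codec; the sample text is made up for illustration.

```python
text = "hi你好"  # 2 ASCII letters + 2 Chinese characters

fixed = text.encode("utf-32-le")  # fixed width: 4 bytes per character
variable = text.encode("utf-8")   # variable width: 1 byte per ASCII letter,
                                  # 3 bytes per Chinese character here

print(len(text), len(fixed), len(variable))

# With a fixed width, the n-th character sits at a predictable byte offset,
# so it can be sliced out without scanning the preceding characters.
n = 2
third_char = fixed[n * 4:(n + 1) * 4].decode("utf-32-le")
print(third_char)
```

The fixed-width form takes 16 bytes against 8 for UTF-8, but finding a character by position in UTF-8 would require walking through the bytes from the start.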

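1.2 Converting Between String Encodings

Encoding: encode()

Converts a str (Unicode text) into bytes in the specified encoding (UTF-8 by default)

Decoding: decode()

Converts bytes in the specified encoding back into a str (Unicode text)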

eg1: Encoding and decoding

a = 'hello'
print(a, type(a))    # str: processed one character at a time
# Output: hello <class 'str'>

a1 = a.encode()      # encode (UTF-8 by default)
print("After encoding:", a1)
# Output: After encoding: b'hello'

print(a1, type(a1))  # bytes: processed one byte at a time
# Output: b'hello' <class 'bytes'>

a2 = a1.decode()     # decode (UTF-8 by default)
print(a2, type(a2))
# Output: hello <class 'str'>
# Note: for bytes, all you need to know for now is how to convert between it and str
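The difference between "character by character" and "byte by byte" is easier to see with non-ASCII text. A minimal sketch, using a sample string that is not from the original lesson:

```python
s = "你好"
b = s.encode("utf-8")

print(len(s))   # len() on a str counts characters
print(len(b))   # len() on bytes counts bytes

print(s[0])     # indexing a str yields a one-character str
print(b[0])     # indexing bytes yields an int in range(256)

# Each Chinese character here became 3 bytes.
print(list(b))
```

The same text has length 2 as a str and length 6 as UTF-8 bytes, which is why the two types must not be mixed without an explicit encode() or decode().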

eg2: Specifying the encoding when encoding and decoding

Lq = "奋斗"
Lq1 = Lq.encode("utf-8")
print(Lq1, type(Lq1))
# Output: b'\xe5\xa5\x8b\xe6\x96\x97' <class 'bytes'>
Lq2 = Lq1.decode("utf-8")
print(Lq2, type(Lq2))
# Output: 奋斗 <class 'str'>
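Encoding and decoding must use the same codec. The sketch below shows what can happen when they do not match; the variable names and the choice of mismatched codecs are illustrative assumptions.

```python
data = "奋斗".encode("utf-8")

# Decoding with the codec that produced the bytes recovers the text.
print(data.decode("utf-8"))

# Decoding with a codec that cannot interpret the bytes raises an error.
try:
    print(data.decode("ascii"))
except UnicodeDecodeError as exc:
    print("decode failed:", exc.reason)

# errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
lossy = data.decode("ascii", errors="replace")
print(lossy)

# A different multi-byte codec may not fail at all and silently give wrong text.
garbled = data.decode("gbk", errors="replace")
print(garbled != "奋斗")
```

The last case is the one to watch for in practice: no exception is raised, yet the resulting text is not the original.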
