Python Study Notes 02

weixin_30879833

License: CC 4.0 BY-SA

Tags: python

Original link: http://www.cnblogs.com/yunhaoguo/articles/9290995.html

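This article explains the differences between ASCII, Unicode, UTF-8 and GBK, discusses the garbled text that results from mixing encodings, and uses Python examples to show how to convert between the str and bytes types.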

Encoding

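An illustrative codebook in which each 8-bit pattern stands for one character (the patterns for the Chinese characters are only for illustration):
01010100 新
11010000 开
11010100 一
01100000 家
11000000 看
11000000 看

01010100011101110101011110110
A B C
01000001 01000010 01000011
Telegraphs and computers transmit and store everything as 0s and 1s.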
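
The A B C bit patterns above are the real ASCII codes, which can be checked with Python's built-in `ord` and `format`. A minimal sketch; the helper name `to_bits` is made up for this illustration:

```python
def to_bits(ch):
    """Return the 8-bit binary string of a single ASCII character."""
    return format(ord(ch), '08b')

bits = [to_bits(ch) for ch in 'ABC']
print(bits)  # ['01000001', '01000010', '01000011']
```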

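The earliest "codebook" was ASCII, which covers upper- and lower-case English letters, special characters and digits.
01010101
ASCII proper is a 7-bit code with 128 characters, and even a full 8-bit byte allows only 256 values, which is far too few,
so Unicode (the "universal code") was created.
16 bits per character was not enough, so the fixed-width form uses 32 bits per character.
A 00000000 00000000 00000000 01000001
B 00000000 00000000 00000000 01000010
我 00000000 00000000 01100010 00010001
Unicode was then refined into the encodings UTF-8, UTF-16 and UTF-32.
8 bits = 1 byte
UTF-8 uses at least 8 bits per character: English letters take 8 bits (1 byte),
European letters outside ASCII take 16 bits (2 bytes),
common Chinese characters take 24 bits (3 bytes).
UTF-16 uses at least 16 bits per character.

GBK was developed in China; a Chinese character takes 2 bytes (16 bits).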
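
The byte counts listed above can be measured directly by encoding a character and taking the length of the result. This is only a sketch: the helper `byte_len` is a made-up name, and the `-le` codec variants are chosen here so that Python does not prepend a byte-order mark, which would inflate the UTF-16 and UTF-32 counts.

```python
def byte_len(ch, encoding):
    """Number of bytes one character occupies in the given encoding."""
    return len(ch.encode(encoding))

for ch in ['A', 'é', '中']:
    print(ch, byte_len(ch, 'utf-8'), byte_len(ch, 'utf-16-le'), byte_len(ch, 'utf-32-le'))
# A 1 2 4
# é 2 2 4
# 中 3 2 4

print(byte_len('A', 'gbk'), byte_len('中', 'gbk'))  # 1 2
```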

11000000

8 bits = 1 byte
1024 bytes = 1 KB
1024 KB = 1 MB
1024 MB = 1 GB
1024 GB = 1 TB

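'''
ascii
    A : 01000001 8 bits, 1 byte
unicode (fixed-width 32-bit form)
    A : 00000000 00000000 00000000 01000001 32 bits, 4 bytes
    中: 00000000 00000000 01001110 00101101 32 bits, 4 bytes
utf-8:
    A : 01000001 8 bits, 1 byte
    中: 11100100 10111000 10101101 24 bits, 3 bytes
gbk
    A : 01000001 8 bits, 1 byte
    中: 11010110 11010000 16 bits, 2 bytes
1. Bytes written in one encoding cannot be read with another; decoding them with the wrong encoding produces garbled text.
2. Files cannot be stored or transmitted as abstract Unicode; they must use a concrete byte encoding such as utf-8, utf-16, gbk or ascii.

python3:
    str is held in memory as Unicode
    bytes is not Unicode text, so a str must be converted to bytes before it is stored or transmitted
    For English text:
        str : literal form: s = 'alex'
              encoding: 010101001... unicode
        bytes: literal form: s = b'alex'
              encoding: 00010010... utf-8, gbk, etc.
    For Chinese text:
        str : literal form: s = '中国'
              encoding: 010101001... unicode
        bytes: literal form: s = b'\xe4\xb8\xad\xe5\x9b\xbd' (utf-8); each escape such as \xe4 is one byte
              encoding: 00010010... utf-8, gbk, etc.
'''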
# Encoding
s = 'abc'
s1 = b'abc'
print(s, type(s))
print(s1, type(s1))
# encode str -> bytes
s = 'alex'
s1 = s.encode('utf-8')
print(s1)
s = '郭云皓'
s2 = s.encode('utf-8')
print(s2)
# decode bytes -> str
s = s2.decode('utf-8')
print(s)
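
The point that bytes from one encoding cannot be read with another can be demonstrated in a few lines. The sketch below is an illustration, not part of the original notes; it assumes the two codecs in question are `utf-8` and `gbk`, and shows both failure modes plus the correct conversion, which always goes through str.

```python
text = '中国'
utf8_bytes = text.encode('utf-8')   # 3 bytes per character for this text
gbk_bytes = text.encode('gbk')      # 2 bytes per character for this text

# With errors='replace' this never raises; it just yields the wrong characters.
garbled = utf8_bytes.decode('gbk', errors='replace')
print(garbled == text)  # False

# Decoding GBK bytes as UTF-8 raises, because these bytes are not valid UTF-8.
try:
    gbk_bytes.decode('utf-8')
    raised = False
except UnicodeDecodeError:
    raised = True
print(raised)  # True

# Correct conversion: bytes -> str (decode with the real encoding) -> bytes
converted = utf8_bytes.decode('utf-8').encode('gbk')
print(converted == gbk_bytes)  # True
```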

Logical Operators

# Logical operators
# # and or not
# # Precedence: () > not > and > or

print(2 > 1 and 1 < 4 or 2 < 4 and 3 < 2)

a = not 2 > 1
print(a)

print()
print("or")
# or: x or y returns x if x is truthy, otherwise y
print(1 or 2) 
print(5 or 3) 
print(1 or 0) 
print(0 or 100) 
print(0 or -1)
print(-1 or 2)
# and: x and y returns y if x is truthy, otherwise x
print(1 and 2)
print(0 and 2)
print("test", 0 or 4 and 3 or 2)
print("test2", 1 > 2 and 3 or 4 and 3 < 2)
print("test3", 2 or 1 < 3 and 2)

# Output of the or/and prints: 1 5 1 100 -1 -1 2 0
#test 3
#test2 False
#test3 2
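
Besides returning one of their operands, `and` and `or` also stop evaluating as soon as the result is known. The sketch below re-runs the "test" expression from above with a made-up helper, `probe`, that records which operands are actually evaluated.

```python
calls = []

def probe(value):
    """Record that this operand was evaluated, then return it unchanged."""
    calls.append(value)
    return value

# Same shape as: 0 or 4 and 3 or 2, i.e. 0 or (4 and 3) or 2
result = probe(0) or probe(4) and probe(3) or probe(2)
print(result)  # 3
print(calls)   # [0, 4, 3]  (the final operand 2 is never evaluated)
```

A common use of this behaviour is supplying a fallback, for example `nickname or "anonymous"`, which yields the fallback only when the left side is falsy.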

String Formatting

# String formatting

name = "guo"
age = 23
height = 186

info = "My name is %s, age %d, height %d" % (name, age, height)

print(info)


job = "Programmer"
hobbie = "Basketball"
info2 = '''
------------ info of %s -----------
Name  : %s
Age   : %d
job   : %s
Hobbie: %s
------------- end -----------------
''' % (name, name, age, job, hobbie)

print(info2)


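# No % operator is applied here, so nothing is formatted and both percent signs are printed
msg = "%%"
print(msg)

# Once the % operator is applied, a literal percent sign must be written as %%
msg2 = "My name is %s, age %d, height %d, study progress 4%%" % (name, age, height)
print(msg2)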

'''
Advanced version: str.format
'''
print("Advanced version")
# The variable is named s rather than str so the built-in str type is not shadowed
s = '{} {} {}'.format('guo', 22, 'male')
print(s)
s = '{0} {1} {0}'.format('guo', 19, 'male')
print(s)
s = '{name} {age} {gender}'.format(name='guo', age='19', gender='male')
print(s)
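
The two styles in this section escape different characters. As a small side-by-side sketch (the variable names are chosen only for this illustration): with the `%` operator a literal percent sign is doubled, while with `str.format` the percent sign is ordinary and it is literal braces that are doubled.

```python
name, age = "guo", 23

# %-formatting: a literal percent sign is written as %%
with_percent = "%s is %d, progress 4%%" % (name, age)

# str.format: % is an ordinary character, literal braces are written as {{ and }}
with_format = "{} is {}, progress 4%".format(name, age)
with_braces = "{{{}}}".format(name)

print(with_percent)  # guo is 23, progress 4%
print(with_format)   # guo is 23, progress 4%
print(with_braces)   # {guo}
```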

Miscellaneous

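# ** is exponentiation, // is floor division
print(2 ** 3)   # 8
print(7 // 2)   # 3
# Divisibility is tested with %, not //: 4 // 2 is 2, which is merely truthy
if 4 % 2 == 0:
    print("divisible")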

# How to stop print from adding a newline automatically
count = 0
while count < 5:
    print(count, end=" ")
    count = count + 1

print("\n")
count = 0
while count < 5:
    print(count, end="\n")
    count = count + 1


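# pass does nothing and execution carries on with the code below it, whereas continue skips the rest of the loop body and starts the next iteration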
count = 0
while count < 10:
    count += 1
    if count == 7:
        pass
    else:
        print(count)

print("bool <-> int")
print(int(True))
print(int(False))
print(bool(1))
print(bool(-3))
print(bool(0))

# Output: 1 0 True True False
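
The pass/continue comment earlier in this section is only demonstrated for `pass`. The following sketch puts the two side by side; the list names are made up for the illustration. With `pass` the statement after the `if` still runs, so nothing is skipped, while `continue` jumps straight to the next iteration.

```python
with_pass = []
for n in range(1, 6):
    if n == 3:
        pass                 # no-op: falls through to the append below
    with_pass.append(n)

with_continue = []
for n in range(1, 6):
    if n == 3:
        continue             # skips the append for this iteration only
    with_continue.append(n)

print(with_pass)      # [1, 2, 3, 4, 5]
print(with_continue)  # [1, 2, 4, 5]
```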

Reposted from: https://www.cnblogs.com/yunhaoguo/articles/9290995.html