Python Self-Study Notes D10: Common Built-in Modules (datetime, collections, base64, struct, hashlib, hmac)

Latest recommended article published on 2023-03-24 20:13:17


License: CC 4.0 BY-SA


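This post introduces several commonly used built-in Python modules: datetime for handling dates and times; the namedtuple, deque, defaultdict, OrderedDict, and Counter data structures from collections; base64 for encoding binary data as text; struct for working with byte data; and hashlib and hmac for digest algorithms and keyed hashing. It explains what each module does and how to use it, with examples.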


Common built-in modules

datetime: dates and times

Working with dates and times.

from datetime import datetime
now = datetime.now()  # get the current datetime
print(now)
print(type(now))
dt = datetime(2015, 4, 19, 12, 20)  # create a datetime for a specific date and time
print(dt)
# Output
2020-06-26 14:40:42.509937
<class 'datetime.datetime'>
2015-04-19 12:20:00

timestamp

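The current time expressed as the number of seconds elapsed since the epoch (1970-01-01 00:00:00 UTC) is called a timestamp. Its value is the same worldwide, regardless of time zone.

To convert a datetime to a timestamp, simply call its timestamp() method:
dt.timestamp()

To convert a timestamp back to a datetime, use the fromtimestamp() method provided by datetime:

datetime.fromtimestamp(t)  # local time
datetime.utcfromtimestamp(t)  # UTC time; deprecated since Python 3.12 in favor of datetime.fromtimestamp(t, tz=timezone.utc)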
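As a minimal sketch of the round trip described above, the snippet below converts a time-zone-aware datetime to a timestamp and back. The date is only an example value, and pinning the time zone to UTC is an assumption made here so that the result does not depend on the machine's local time zone.

```python
from datetime import datetime, timezone

# An aware datetime: 2015-04-19 12:20:00 UTC (example value)
dt = datetime(2015, 4, 19, 12, 20, tzinfo=timezone.utc)

ts = dt.timestamp()                                 # seconds since the epoch, as a float
back = datetime.fromtimestamp(ts, tz=timezone.utc)  # timestamp -> aware datetime in UTC

print(ts)          # 1429446000.0
print(back)        # 2015-04-19 12:20:00+00:00
print(back == dt)  # True
```

With a naive datetime (no tzinfo), timestamp() interprets the value as local time, so the number printed would vary from machine to machine.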

Converting between str and datetime

cday = datetime.strptime('2015-6-1 18:19:59', '%Y-%m-%d %H:%M:%S')  # str -> datetime
now = datetime.now()
print(now.strftime('%a, %b %d %H:%M'))  # datetime -> str

datetime arithmetic

>>> from datetime import datetime, timedelta
>>> now = datetime.now()
>>> now
datetime.datetime(2015, 5, 18, 16, 57, 3, 540997)
>>> now + timedelta(hours=10)
datetime.datetime(2015, 5, 19, 2, 57, 3, 540997)

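Time zone conversion

from datetime import datetime, timedelta, timezone
# Get the current UTC time and explicitly set its time zone to UTC+0:00
# (utcnow() is deprecated since Python 3.12; datetime.now(timezone.utc) gives the same aware result):
utc_dt = datetime.utcnow().replace(tzinfo=timezone.utc)
print(utc_dt)
# Output: 2015-05-18 09:05:12.377316+00:00
# astimezone() converts the time zone to Beijing time (UTC+8:00):
bj_dt = utc_dt.astimezone(timezone(timedelta(hours=8)))
print(bj_dt)
# Output: 2015-05-18 17:05:12.377316+08:00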

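Exercise: given a time string and a time zone string such as 'UTC+7:00', convert the time to a timestamp.

import re
from datetime import datetime, timezone, timedelta
def to_timestamp(dt_str, tz_str):
    # Parse the naive time, e.g. '2015-6-1 08:10:30'
    dt = datetime.strptime(dt_str, '%Y-%m-%d %H:%M:%S')
    # Extract the signed hour offset from e.g. 'UTC+7:00' (the minutes part is ignored)
    hour = re.match(r'\w+([\+-]\d+)\:\d+', tz_str).group(1)
    tz = timezone(timedelta(hours=int(hour)))
    # Attach the time zone, then convert to a timestamp
    dt = dt.replace(tzinfo=tz)
    return dt.timestamp()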

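collections: useful container types

namedtuple

namedtuple gives a tuple type a name, names each of its fields, and fixes the number of elements.
In effect, it defines a new data type.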

>>> from collections import namedtuple
>>> Point = namedtuple('Point', ['x', 'y'])
>>> p = Point(1, 2)
>>> p.x
1

deque
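A list gives fast access by index, but inserting or deleting elements is slow: a list is stored contiguously, so an insertion or deletion anywhere but the end has to shift every later element, which becomes costly as the list grows.

deque is a double-ended queue designed for efficient insertion and deletion at both ends, which makes it a good fit for queues and stacks:

append() adds at the right end; its counterpart is pop()
appendleft() adds at the left end; its counterpart is popleft()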
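The four deque methods listed above can be sketched as follows; the element values are arbitrary placeholders chosen for illustration.

```python
from collections import deque

q = deque(['b', 'c'])
q.append('d')        # add at the right end
q.appendleft('a')    # add at the left end
print(q)             # deque(['a', 'b', 'c', 'd'])

first = q.popleft()  # remove from the left end
last = q.pop()       # remove from the right end
print(first, last, list(q))  # a d ['b', 'c']
```

Using append() with popleft() gives queue (FIFO) behavior, while append() with pop() gives stack (LIFO) behavior.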

defaultdict
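Behaves like a dict, except that looking up a missing key does not raise KeyError: it calls the factory function given at construction and stores and returns its result.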
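A minimal sketch of the missing-key behavior, assuming list is used as the factory; the words being grouped are made-up sample data.

```python
from collections import defaultdict

# Group words by their first letter; list() supplies an empty list for new keys
groups = defaultdict(list)
for word in ['apple', 'avocado', 'banana']:
    groups[word[0]].append(word)

print(groups['a'])    # ['apple', 'avocado']

# Merely reading a missing key calls the factory and stores the result
missing = groups['z']
print(missing)        # []
print('z' in groups)  # True
```

Note the side effect in the last lines: a lookup on a defaultdict inserts the key, unlike dict.get().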

OrderedDict

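OrderedDict keeps its keys in insertion order. Before Python 3.7 a plain dict made no ordering guarantee; since 3.7 a plain dict also preserves insertion order, but OrderedDict still offers extras such as popitem(last=False) and move_to_end().

The following is a first-in, first-out (FIFO) dict: once the capacity is reached, adding a new key evicts the oldest one.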

from collections import OrderedDict

class LastUpdatedOrderedDict(OrderedDict):

    def __init__(self, capacity):
        super(LastUpdatedOrderedDict, self).__init__()
        self._capacity = capacity

    def __setitem__(self, key, value):
        containsKey = 1 if key in self else 0
        if len(self) - containsKey >= self._capacity:
            last = self.popitem(last=False)
            print('remove:', last)
        if containsKey:
            del self[key]
            print('set:', (key, value))
        else:
            print('add:', (key, value))
        OrderedDict.__setitem__(self, key, value)

ChainMap

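ChainMap links a group of dicts together into a single logical dict. A ChainMap behaves like a dict itself, but a lookup searches the underlying dicts one by one, in order.

Applications often need parameters, which may come from the command line, from environment variables, or from defaults. ChainMap can implement a prioritized lookup: check the command-line arguments first, then the environment variables if nothing was passed, and finally fall back to the defaults.

The following walks through how ChainMap behaves: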

def collection_test2():
    from collections import ChainMap
    a = {"name": "leng"}
    b = {"age": 24}
    c = {"wife": "qian"}
    pylookup = ChainMap(a,b,c)
    print(pylookup)
    print(pylookup['age'])
    print(pylookup.maps)
    pylookup.update({"age": 25})
    print(pylookup)
    b['age'] = 26
    print(pylookup)
    print(type(pylookup.maps))
    pylookup.maps[0]['age']=20
    pylookup.maps[1]['age']=22
    print(pylookup)
    print("-----------")
    d = {"name": "leng"}
    e = {"name":"123"}
    cm = ChainMap(d,e)
    print(cm)
    print(cm['name'])
collection_test2()
# Output
ChainMap({'name': 'leng'}, {'age': 24}, {'wife': 'qian'})
24
[{'name': 'leng'}, {'age': 24}, {'wife': 'qian'}]
ChainMap({'name': 'leng', 'age': 25}, {'age': 24}, {'wife': 'qian'})
ChainMap({'name': 'leng', 'age': 25}, {'age': 26}, {'wife': 'qian'})
<class 'list'>
ChainMap({'name': 'leng', 'age': 20}, {'age': 22}, {'wife': 'qian'})
-----------
ChainMap({'name': 'leng'}, {'name': '123'})
leng  # when the same key appears in several dicts, the first dict wins

For a concrete application, see:
https://blog.youkuaiyun.com/langb2014/article/details/100122209
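The prioritized parameter lookup described earlier (command line first, then environment, then defaults) might look like the sketch below. The three dicts are hypothetical stand-ins: in a real program cli_args would come from an argument parser and env_vars from os.environ.

```python
from collections import ChainMap

# Hypothetical parameter sources, listed from lowest to highest priority
defaults = {'color': 'red', 'user': 'guest'}
env_vars = {'user': 'alice'}    # stands in for os.environ
cli_args = {'color': 'green'}   # stands in for parsed command-line options

# Lookups search cli_args first, then env_vars, then defaults
settings = ChainMap(cli_args, env_vars, defaults)

print(settings['color'])  # green  (from cli_args)
print(settings['user'])   # alice  (from env_vars)
```

Because ChainMap only references the underlying dicts, later changes to any of them are visible through settings without rebuilding it.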

Counter
A simple counter: a dict subclass that tallies how many times each element occurs.
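A short sketch of counting with Counter; the input list is arbitrary sample data.

```python
from collections import Counter

c = Counter(['a', 'b', 'a', 'c', 'a', 'b'])

print(c['a'])            # 3
print(c['z'])            # 0  (missing elements count as zero, no KeyError)
print(c.most_common(2))  # [('a', 3), ('b', 2)]
```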

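base64: encoding binary data as text

Base64 is a way of representing arbitrary binary data using 64 printable characters.

Base64 encodes every 3 bytes of binary data as 4 bytes of text, so the length grows by about 33%. The benefit is that the encoded text can be shown directly in places such as email bodies and web pages.

What if the length of the binary data is not a multiple of 3, leaving 1 or 2 bytes at the end? Base64 pads the end with \x00 bytes and then appends 1 or 2 = signs to the encoded text to indicate how many bytes were padded; decoding strips them automatically.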
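The padding rule can be observed directly with a small sketch: inputs of 3, 4, and 5 bytes (arbitrary sample values) yield zero, two, and one = sign respectively.

```python
import base64

encoded = {}
for raw in (b'abc', b'abcd', b'abcde'):
    encoded[raw] = base64.b64encode(raw)
    print(raw, '->', encoded[raw])

# b'abc'   -> b'YWJj'      (3 bytes, no padding)
# b'abcd'  -> b'YWJjZA=='  (1 leftover byte, two '=')
# b'abcde' -> b'YWJjZGU='  (2 leftover bytes, one '=')

# Decoding removes the padding and restores the original bytes
print(base64.b64decode(b'YWJjZA=='))  # b'abcd'
```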

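However, because = is ambiguous in URLs and similar contexts, the trailing = signs are often stripped after encoding.
Exercise: write a decoding function that can handle input whose = padding has been removed.

import base64
def safe_base64_decode(s):
    # Accept str or bytes
    if isinstance(s, str):
        s = s.encode('ascii')
    # Restore the stripped '=' so that the length is a multiple of 4
    s += b'=' * (-len(s) % 4)
    return base64.b64decode(s)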

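struct: working with byte data

struct's pack function converts values of basic types, such as integers, into bytes according to a format string:

import struct
struct.pack('>I', 10240099)  # '>' means big-endian, 'I' means a 4-byte unsigned integer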
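As a small sketch of how pack and its inverse unpack fit together, the snippet below packs the same integer and reads it back, then unpacks a 6-byte sample (an arbitrary value chosen for illustration) with a two-field format.

```python
import struct

# '>I': big-endian, 4-byte unsigned integer
packed = struct.pack('>I', 10240099)
print(packed)  # b'\x00\x9c@c'

# unpack always returns a tuple, even for a single field
(value,) = struct.unpack('>I', packed)
print(value)   # 10240099

# '>IH': a 4-byte unsigned int followed by a 2-byte unsigned short
pair = struct.unpack('>IH', b'\xf0\xf0\xf0\xf0\x80\x80')
print(pair)    # (4042322160, 32896)
```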

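Exercise: check whether a file is a bitmap and, if so, report its width, height, and color depth.

import struct

def bmp_info(data):
    # The first 30 bytes of a BMP header: two signature bytes, six 4-byte
    # integers, and two 2-byte integers, all little-endian
    d = struct.unpack('<ccIIIIIIHH', data[:30])
    # Compare the signature against each allowed value explicitly;
    # "x == b'BM' or b'BA'" would always be true
    if d[0] + d[1] in (b'BM', b'BA'):
        return {
            'width': d[6],
            'height': d[7],
            'color': d[9]
        }
    else:
        raise TypeError('file is bad')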

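hashlib: digest algorithms

A digest algorithm is also known as a hash algorithm. It uses a function to convert data of any length into a fixed-length string (usually written in hexadecimal).

A digest algorithm applies a digest function f() to data of arbitrary length to compute a fixed-length digest; the purpose is to detect whether the original data has been tampered with. (Recovering the data from its digest is extremely difficult.)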
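A brief sketch of computing digests with hashlib: the message text is an arbitrary example, and the snippet only illustrates the fixed-length and tamper-detection properties described above.

```python
import hashlib

# Data can be fed in several chunks; the digest equals that of the whole message
md5 = hashlib.md5()
md5.update(b'how to use md5 in ')
md5.update(b'python hashlib?')
chunked = md5.hexdigest()

whole = hashlib.md5(b'how to use md5 in python hashlib?').hexdigest()
print(chunked == whole)  # True

# The digest length is fixed regardless of input size
print(len(whole))                               # 32 hex characters (128 bits)
print(len(hashlib.sha1(b'abc').hexdigest()))    # 40 hex characters (160 bits)

# Changing a single character yields a different digest
tampered = hashlib.md5(b'how to use md5 in python hashlib!').hexdigest()
print(tampered == whole)  # False
```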

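MD5 and SHA-1 are the most familiar digest algorithms. These notes use MD5 to store passwords and to check them at login; both algorithms are now considered weak, and hashlib.pbkdf2_hmac or hashlib.scrypt are better suited to real password storage.

import hashlib

# db is assumed to map each username to the MD5 hex digest of its password
def login(user, password):
    md5 = hashlib.md5()
    md5.update(password.encode('utf-8'))
    return db[user] == md5.hexdigest()  # raises KeyError for an unknown user

# Second version: also returns False for an unknown user
def login(user, password):
    return user in db and db[user] == hashlib.md5(password.encode('utf-8')).hexdigest()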

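Because the MD5 values of common passwords are easy to precompute, the stored digest must not be the plain MD5 of a common password. This is achieved by appending a complex string to the original password before hashing, commonly called "adding salt":

import hashlib
db = {}
def get_md5(s):
    return hashlib.md5(s.encode('utf-8')).hexdigest()
def register(username, password):
    db[username] = get_md5(password + username + 'the-Salt')

In general, the MD5 is then computed as md5(message + salt).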

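hmac: mixing a key into the hash

HMAC computes the hash of a message together with a key, so different keys yield different hashes for the same message. To verify a hash value, the correct key must be supplied as well.

Unlike our home-made salt scheme, the HMAC algorithm works with any hash algorithm, whether MD5 or SHA-1. Replacing the custom salt scheme with HMAC makes the program more standard and more secure.

Note that the key and message passed in must both be bytes; a str has to be encoded to bytes first.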
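A minimal sketch of the idea, using made-up key and message values: the same message produces different MACs under different keys, and a MAC can only be reproduced with the right key. hmac.compare_digest is used for the comparison here as a design choice, since it takes the same time whether or not the values match.

```python
import hmac

message = b'Hello, world!'   # example message
key = b'secret'              # example key
other_key = b'not-the-key'   # a different, wrong key

mac = hmac.new(key, message, digestmod='MD5').hexdigest()
print(len(mac))  # 32 hex characters, same length as a plain MD5 digest

# Recomputing with the correct key reproduces the MAC
same = hmac.new(key, message, digestmod='MD5').hexdigest()
print(hmac.compare_digest(mac, same))   # True

# A different key gives a different MAC for the same message
wrong = hmac.new(other_key, message, digestmod='MD5').hexdigest()
print(hmac.compare_digest(mac, wrong))  # False
```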

Using the standard HMAC algorithm:

import hmac, random

def hmac_md5(key, s):
    return hmac.new(key.encode('utf-8'), s.encode('utf-8'), 'MD5').hexdigest()

class User(object):
    def __init__(self, username, password):
        self.username = username
        self.key = ''.join([chr(random.randint(48, 122)) for i in range(20)])
        self.password = hmac_md5(self.key, password)

db = {
    'michael': User('michael', '123456'),
    'bob': User('bob', 'abc999'),
    'alice': User('alice', 'alice2008')
}
def login(username, password):
    user = db[username]
    return user.password == hmac_md5(user.key, password)
# Verify
assert login('michael', '123456')
assert login('bob', 'abc999')
assert login('alice', 'alice2008')
assert not login('michael', '1234567')
assert not login('bob', '123456')
assert not login('alice', 'Alice2008')
print('ok')