1. Configuring hyperparameters with EasyDict
from easydict import EasyDict
config = EasyDict()
config.learn_rate = 0.01
config.epoch = 100
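EasyDict converts nested dicts recursively and supports both attribute-style and key-style access; a quick illustration:
config = EasyDict({'model': {'hidden': 64}})
print(config.model.hidden)        # 64 -- nested dicts become EasyDicts too
print(config['model']['hidden'])  # plain dict indexing still works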
2. Training neural networks with skorch (a scikit-learn compatible wrapper for PyTorch)
https://skorch.readthedocs.io/en/stable/
import numpy as np
from sklearn.datasets import make_classification
from torch import nn
from skorch import NeuralNetClassifier
X, y = make_classification(1000, 20, n_informative=10, random_state=0)
X = X.astype(np.float32)
y = y.astype(np.int64)
class MyModule(nn.Module):
    def __init__(self, num_units=10, nonlin=nn.ReLU()):
        super().__init__()
        self.dense0 = nn.Linear(20, num_units)
        self.nonlin = nonlin
        self.dropout = nn.Dropout(0.5)
        self.dense1 = nn.Linear(num_units, num_units)
        self.output = nn.Linear(num_units, 2)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = self.dropout(X)
        X = self.nonlin(self.dense1(X))
        X = self.output(X)
        return X
net = NeuralNetClassifier(
    MyModule,
    max_epochs=10,
    criterion=nn.CrossEntropyLoss,  # pass the criterion class; skorch instantiates it
    lr=0.1,
    # Shuffle training data on each epoch
    iterator_train__shuffle=True,
)
net.fit(X, y)
y_proba = net.predict_proba(X)
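Because the skorch net follows the scikit-learn estimator API, it slots directly into sklearn utilities such as Pipeline and GridSearchCV; the following is adapted from the skorch README:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import GridSearchCV

pipe = Pipeline([
    ('scale', StandardScaler()),
    ('net', net),
])
pipe.fit(X, y)
y_proba = pipe.predict_proba(X)

# the module__ prefix routes parameters to MyModule's __init__
params = {
    'lr': [0.05, 0.1],
    'module__num_units': [10, 20],
}
gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy')
gs.fit(X, y)
print(gs.best_score_, gs.best_params_)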
skorch also ships helpers for training on top of pretrained NLP models, e.g. a scikit-learn compatible wrapper around Hugging Face tokenizers:
from skorch.hf import HuggingfacePretrainedTokenizer
# pass the model name to be downloaded
hf_tokenizer = HuggingfacePretrainedTokenizer('bert-base-uncased')
data = ['hello there', 'this is a text']
hf_tokenizer.fit(data) # only loads the model
hf_tokenizer.transform(data)
# use hyper params from pretrained tokenizer to fit on own data
hf_tokenizer = HuggingfacePretrainedTokenizer(
    'bert-base-uncased', train=True, vocab_size=12345)
data = ...
hf_tokenizer.fit(data) # fits new tokenizer on data
hf_tokenizer.transform(data)
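Since the tokenizer is itself an sklearn transformer, it can sit in front of a skorch net inside a Pipeline. A minimal sketch, assuming a hypothetical BertClassifier nn.Module that consumes the tokenizer's output:
from sklearn.pipeline import Pipeline

pipe = Pipeline([
    ('tokenize', HuggingfacePretrainedTokenizer('bert-base-uncased')),
    ('net', NeuralNetClassifier(BertClassifier)),  # BertClassifier is hypothetical, not part of skorch
])
pipe.fit(data, labels)  # data: list of strings, labels: int64 array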
3. Parsing command-line arguments with argparse
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--dataset', type=str, default='Cora')
parser.add_argument('--hidden_channels', type=int, default=8)
parser.add_argument('--heads', type=int, default=8)
parser.add_argument('--lr', type=float, default=0.005)
parser.add_argument('--epochs', type=int, default=200)
parser.add_argument('--wandb', action='store_true', help='Track experiment')
args = parser.parse_args()
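The script can then be run with any of these flags overriding the defaults (train.py is a placeholder name):
# python train.py --dataset Cora --lr 0.01 --wandb
print(args.lr)     # 0.01
print(args.wandb)  # True if --wandb was passed, else False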