Personal Notes on PyTorch Code

Licensed under CC 4.0 BY-SA

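This post explains how the cross-entropy loss works in PyTorch, including the difference between log_softmax and nll_loss, and how to apply a cross-entropy loss without the built-in softmax step. It also covers how torch.nn.functional and the torch.nn.CrossEntropyLoss class are used, along with some tips on handling Variable and computing gradients with autograd.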

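The nn.CrossEntropyLoss class combines two steps: log_softmax and nll_loss (negative log-likelihood loss). The latter has no log step of its own; it expects log-probabilities as input.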
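The two-step decomposition can be checked numerically. The following is a minimal sketch; the random logits and the target tensor are made-up illustration data, not taken from any real model.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up data: 4 samples, 3 classes.
logits = torch.randn(4, 3)
target = torch.tensor([0, 2, 1, 2])

# Step 1: log_softmax turns raw scores into log-probabilities.
log_probs = F.log_softmax(logits, dim=1)

# Step 2: nll_loss picks out -log p(target) and averages; it takes no log itself.
two_step = F.nll_loss(log_probs, target)

# cross_entropy performs both steps in one call.
one_step = F.cross_entropy(logits, target)

# The same quantity written out by hand.
manual = -log_probs[torch.arange(4), target].mean()

print(two_step.item(), one_step.item(), manual.item())
```

All three printed values should agree up to floating-point error.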

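If you want the loss itself to compute only the cross-entropy, without the softmax step, add an nn.Softmax layer at the end of the network, apply torch.log() to its output, and train with the nn.NLLLoss class. (nn.LogSoftmax performs both of these steps in one numerically more stable operation.)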
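A minimal sketch of this arrangement is shown here. The network `SoftmaxLogNet`, its layer sizes, and the input data are hypothetical and only serve to illustrate the wiring.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

class SoftmaxLogNet(nn.Module):
    """Hypothetical toy network whose output is log(softmax(scores))."""

    def __init__(self, in_features=5, num_classes=3):
        super().__init__()
        self.fc = nn.Linear(in_features, num_classes)
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        # Softmax layer at the end of the network, followed by torch.log().
        return torch.log(self.softmax(self.fc(x)))

model = SoftmaxLogNet()
x = torch.randn(4, 5)                  # made-up inputs
target = torch.tensor([0, 2, 1, 2])    # made-up labels

# The network already outputs log-probabilities, so NLLLoss is the right loss.
loss = nn.NLLLoss()(model(x), target)

# For comparison: CrossEntropyLoss applied to the raw scores.
reference = nn.CrossEntropyLoss()(model.fc(x), target)

print(loss.item(), reference.item())
```

The two losses should match up to rounding, since both compute the same quantity by different routes.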

Operations such as torch.log() need no extra parameters, so they can simply be called directly.

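torch.nn.functional contains plain functions. Besides the usual input and target, they also need every parameter used in the computation (such as weight) passed in on each call, so they are less convenient to call directly. The definition of cross_entropy, as it appears in the PyTorch 0.4.x source (newer releases name the default reduction 'mean'), is: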

def cross_entropy(input, target, weight=None, size_average=None, ignore_index=-100,
                  reduce=None, reduction='elementwise_mean'):
    if size_average is not None or reduce is not None:
        reduction = _Reduction.legacy_get_string(size_average, reduce)
    return nll_loss(log_softmax(input, 1), target, weight, None, ignore_index, None, reduction)

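To make torch.nn.functional more convenient to use, there is the class torch.nn.CrossEntropyLoss. Its __init__ defines and stores the required parameters, and its forward calls the torch.nn.functional function. This neatly solves the problem of keeping the parameters around and makes the training loop simpler.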

class CrossEntropyLoss(_WeightedLoss):
    def __init__(self, weight=None, size_average=None, ignore_index=-100,
                 reduce=None, reduction='elementwise_mean'):
        super(CrossEntropyLoss, self).__init__(weight, size_average, reduce, reduction)
        self.ignore_index = ignore_index

    def forward(self, input, target):
        return F.cross_entropy(input, target, weight=self.weight,
                               ignore_index=self.ignore_index, reduction=self.reduction)
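The difference between the two calling styles can be sketched as follows. The logits, labels, and class weights are made-up values chosen only for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Made-up data: 4 samples, 3 classes; the third label is marked as ignored.
logits = torch.randn(4, 3)
target = torch.tensor([0, 2, -100, 1])
class_weight = torch.tensor([1.0, 2.0, 0.5])

# Functional style: every parameter must be supplied on each call.
functional_loss = F.cross_entropy(
    logits, target, weight=class_weight, ignore_index=-100
)

# Class style: the parameters are stored once at construction time ...
criterion = nn.CrossEntropyLoss(weight=class_weight, ignore_index=-100)

# ... and each later call only needs the input and the target.
module_loss = criterion(logits, target)

print(functional_loss.item(), module_loss.item())
```

Both calls go through the same functional code path, so the results should be identical.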

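In version 3.0.1 (presumably meaning a 0.3.x release), a Variable is best handled as a whole rather than split into pieces; otherwise the gradient computation is easily affected.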

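For example, to scale individual components of an array variable s, create a separate variable mask with the same shape as s, initialize it to 1, set the entries of mask at the positions of s to be processed to the desired scale factors, and then take the elementwise product s * mask. With this formulation, matrix calculus directly gives the derivative of this step with respect to mask as s (and, symmetrically, the derivative with respect to s as mask).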
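The mask trick can be sketched with current PyTorch, where Variable has been merged into Tensor (since 0.4) and `requires_grad=True` plays the same role. The values in `s` and the chosen scale factors are made up for illustration.

```python
import torch

# Made-up values; s stands in for the array variable to be partially scaled.
s = torch.tensor([1.0, 2.0, 3.0, 4.0], requires_grad=True)

# Mask of the same shape as s, initialized to 1.
mask = torch.ones_like(s)
mask[1] = 10.0   # scale component 1 by 10
mask[3] = 0.5    # scale component 3 by 0.5
mask.requires_grad_(True)  # only needed to inspect d(out)/d(mask)

# Elementwise product on the whole tensor; s itself is never split or edited.
scaled = s * mask
out = scaled.sum()
out.backward()

print(s.grad)     # equals mask
print(mask.grad)  # equals s
```

Because `s` is only ever used as a whole, autograd sees a single elementwise multiplication and the gradients follow directly.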

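PyTorch's autograd does not keep the gradients of intermediate values (such as x); it only keeps gradients for leaf variables, such as the parameters of layers defined in the class's __init__ (for example, the parameters of self.conv).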
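This behaviour can be observed directly. The following is a minimal sketch; the convolution layer and the input are made up, and `retain_grad()` is shown as one way to opt in to keeping an intermediate gradient.

```python
import warnings

import torch
import torch.nn as nn

torch.manual_seed(0)

conv = nn.Conv2d(1, 2, kernel_size=3)   # its parameters are leaf tensors
inp = torch.randn(1, 1, 5, 5)           # made-up input

x = conv(inp)        # intermediate value: gradient is not kept by default
y = torch.relu(x)    # another intermediate value
y.retain_grad()      # explicitly ask autograd to keep this one

y.sum().backward()

with warnings.catch_warnings():
    # Reading .grad on a non-leaf tensor emits a warning; silence it here.
    warnings.simplefilter("ignore")
    x_grad = x.grad

print(x_grad)                   # None: not stored for the intermediate x
print(y.grad.shape)             # stored because of retain_grad()
print(conv.weight.grad.shape)   # stored: conv.weight is a leaf parameter
```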

Hook functions, on the other hand, can record any intermediate value, such as a feature map inside a CNN. Grad-CAM relies on exactly this mechanism.
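Capturing a feature map and its gradient with hooks might look like the sketch below. The tiny model, the `captured` dictionary, and the hook names are hypothetical; this illustrates the mechanism only and is not a Grad-CAM implementation.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical toy CNN: conv -> relu -> flatten -> linear.
model = nn.Sequential(
    nn.Conv2d(1, 4, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(4 * 8 * 8, 3),
)

captured = {}

def save_gradient(grad):
    # Tensor hook: called during backward with the gradient of the feature map.
    captured["gradient"] = grad.detach()

def save_feature_map(module, inputs, output):
    # Forward hook: called right after the module's forward pass.
    captured["feature_map"] = output.detach()
    output.register_hook(save_gradient)

# Watch the output of the ReLU layer (index 1 in the Sequential).
handle = model[1].register_forward_hook(save_feature_map)

logits = model(torch.randn(2, 1, 8, 8))   # made-up input batch
logits[:, 0].sum().backward()             # backpropagate the class-0 score

handle.remove()  # detach the hook once it is no longer needed

print(captured["feature_map"].shape, captured["gradient"].shape)
```

After the backward pass, `captured` holds both the intermediate feature map and its gradient, which are the two ingredients a Grad-CAM style visualization combines.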

See also: "A Detailed Explanation of autograd and Hook Functions in PyTorch" (Pytorch中autograd以及hook函数详解).