This example stacks two layers of y = a*x + b. The preconditions are the values of the input x and the target y, plus two initial sets of a and b; the loss function is MSE loss. Running just a single iteration is enough to see how each layer's parameters are updated during back-propagation. It is recommended to set a breakpoint inside the backward function and compare the program's execution against the figure as you step through it.

# -*- coding:utf-8 -*-
# reference: https://pytorch.org/docs/stable/notes/extending.html
import torch
from torch import nn
from torch.autograd import Function, Variable
import numpy as np
from collections import OrderedDict


class LinearFunction2(Function):
    # Note that both forward and backward are @staticmethods
    @staticmethod
    # bias is an optional argument
    def forward(ctx, input, weight, bias=None):
        ctx.save_for_backward(input, weight, bias)
        output = input.mm(weight.t())
        if bias is not None:
            output += bias.unsqueeze(0).expand_as(output)
        return output

    # This function has only a single output, so it gets only one gradient
    @staticmethod
    def backward(ctx, grad_output):
        # This is a pattern that is very convenient - at the top of backward
        # unpack saved_tensors and initialize all gradients w.r.t. inputs to
        # None. Thanks to the fact that additional trailing Nones are
        # ignored, the return statement is simple even when the function has
        # optional inputs.
        # The incoming grad_output is the gradient of the loss w.r.t. this
        # layer's output. For MSE loss that gradient is 2/n * (y - y'),
        # where n is the number of samples in the batch, so grad_output is
        # an n x 1 matrix.
        input, weight, bias = ctx.saved_tensors
        grad_input = grad_weight = grad_bias = None
        if ctx.needs_input_grad[0]:
            grad_input = grad_output.mm(weight)
        if ctx.needs_input_grad[1]:
            grad_weight = grad_output.t().mm(input)
        if bias is not None and ctx.needs_input_grad[2]:
            grad_bias = grad_output.sum(0)
        return grad_input, grad_weight, grad_bias
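
To make the walk-through concrete, here is a minimal driver sketch for the one-iteration run described above: two stacked y = a*x + b layers built from LinearFunction2, an MSE loss, and a single backward pass. The concrete values of x, y and the initial a/b are illustrative placeholders, not the ones used in the original post; set a breakpoint inside LinearFunction2.backward before loss.backward() to step through the gradient flow.

# A minimal one-iteration driver (illustrative values, not the post's originals).
x = torch.tensor([[1.0], [2.0], [3.0], [4.0]])   # input, n = 4 samples
y = torch.tensor([[2.0], [4.0], [6.0], [8.0]])   # target

# two initial sets of a and b, one per layer
a1 = torch.tensor([[1.5]], requires_grad=True)
b1 = torch.tensor([0.5], requires_grad=True)
a2 = torch.tensor([[0.8]], requires_grad=True)
b2 = torch.tensor([-0.2], requires_grad=True)

# forward: y1 = a1*x + b1, y2 = a2*y1 + b2
y1 = LinearFunction2.apply(x, a1, b1)
y2 = LinearFunction2.apply(y1, a2, b2)

loss = nn.functional.mse_loss(y2, y)

# set a breakpoint inside LinearFunction2.backward before this call
loss.backward()

n = x.shape[0]
# grad_output entering the last layer's backward is 2/n * (y2 - y)
print((2.0 / n) * (y2 - y).detach())
print(a2.grad, b2.grad)   # gradients for the second layer's a and b
print(a1.grad, b1.grad)   # gradients for the first layer's a and b

With the default mean reduction, the grad_output that reaches the last layer's backward equals 2/n * (y2 - y), which is exactly the quantity discussed in the comment inside backward; the earlier layer then receives grad_input propagated through the second layer's weight.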
