PyTorch: nn model-layer containers

Article link: https://blog.youkuaiyun.com/pipisorry/article/details/109192065

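This article describes the container classes in PyTorch, covering how to use nn.Module, nn.ModuleList, nn.Sequential, and nn.ModuleDict and how they differ. Examples show how to build more complex neural network structures with them.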


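Overview

How model layers are named

Parameter names follow the pattern {attribute name, e.g. bn_layers}.0.weight, where the number is the layer's index inside the container.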

if self.use_bn:
    self.bn_layers = nn.ModuleList(
        [nn.BatchNorm1d(hidden_units[i + 1]) for i in range(len(hidden_units) - 1)])
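
To see this naming pattern in practice, here is a minimal sketch that wraps the fragment above in a complete module and lists its parameter names. The class name Net and the default hidden_units values are assumptions made up for illustration.

```python
import torch.nn as nn

class Net(nn.Module):
    # hidden_units and use_bn are illustrative defaults, not from the original article
    def __init__(self, hidden_units=(8, 4, 2), use_bn=True):
        super().__init__()
        self.use_bn = use_bn
        if self.use_bn:
            self.bn_layers = nn.ModuleList(
                [nn.BatchNorm1d(hidden_units[i + 1]) for i in range(len(hidden_units) - 1)])

net = Net()
# named_parameters() prefixes each parameter with the attribute name and list index
param_names = [name for name, _ in net.named_parameters()]
print(param_names)
# ['bn_layers.0.weight', 'bn_layers.0.bias', 'bn_layers.1.weight', 'bn_layers.1.bias']
```

The prefix comes from the attribute name (bn_layers), and the number is the position of the layer inside the ModuleList.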

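Inspecting the output of each layer

Keras has a concise API for viewing the output size of every layer in a model, which is very useful when debugging a network.

The same can be done in PyTorch with the third-party torchsummary package.
Usage is simple:
from torchsummary import summary
summary(your_model, input_size=(channels, H, W))
Set input_size to the input size of your own network, without the batch dimension.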
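
If torchsummary is not installed, a similar per-layer view can be approximated with forward hooks from core PyTorch. The sketch below is not how torchsummary is implemented; it is a simplified stand-in, and the helper name collect_output_shapes is hypothetical.

```python
import torch
import torch.nn as nn

def collect_output_shapes(model, input_size):
    """Run one dummy forward pass and record the output shape of each leaf layer."""
    shapes, hooks = [], []

    def make_hook(name):
        def hook(module, inputs, output):
            shapes.append((name, tuple(output.shape)))
        return hook

    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # only leaf layers, skip containers
            hooks.append(module.register_forward_hook(make_hook(name)))

    was_training = model.training
    model.eval()
    with torch.no_grad():
        model(torch.zeros(1, *input_size))  # batch size 1 is an arbitrary choice
    model.train(was_training)               # restore the original mode
    for h in hooks:
        h.remove()
    return shapes

model = nn.Sequential(nn.Conv2d(1, 20, 5), nn.ReLU(), nn.Conv2d(20, 64, 5), nn.ReLU())
shapes = collect_output_shapes(model, (1, 28, 28))
for name, shape in shapes:
    print(name, shape)
```

This only handles layers whose output is a single tensor, which is enough for quick debugging of simple models.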

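Container classes and their differences

nn.Module, nn.ModuleList, and nn.Sequential are called containers because modules can be added to them.

If you are sure the order inside nn.Sequential is the one you want, and you do not need any extra processing (such as functions from nn.functional), you can use nn.Sequential directly. The price is some lost flexibility, because you can no longer customize what happens inside the forward function.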
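
As a rough illustration of this trade-off, the sketch below puts a plain nn.Sequential next to a hand-written forward that adds a skip connection, something a purely sequential chain cannot express. The class name ResidualBlock and the layer sizes are assumptions chosen for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Fixed order and no extra processing: nn.Sequential is enough.
plain = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 8))

class ResidualBlock(nn.Module):
    """Needs a custom forward because the input is reused after two layers."""
    def __init__(self, dim):
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)
        self.fc2 = nn.Linear(dim, dim)

    def forward(self, x):
        h = F.relu(self.fc1(x))  # functional call inside forward
        return x + self.fc2(h)   # skip connection

x = torch.randn(4, 8)
plain_out = plain(x)
res_block = ResidualBlock(8)
res_out = res_block(x)
print(plain_out.shape, res_out.shape)
```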

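Use Module when you have a large block made up of several smaller blocks.
Use Sequential when you want to build a small block out of layers.
Use ModuleList when you need to iterate over layers or building blocks and do something with them.
Use ModuleDict when you need to parameterize some part of the model, for example the activation function.

[ModuleList and Sequential in PyTorch: differences and use cases]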

Module: the main building block

Module is the main building block. It is the base class for all neural network modules.

CLASS torch.nn.Module(*args, **kwargs)

Methods

apply(fn)

Applies fn recursively to every submodule (as returned by .children()) as well as self. Typical use includes initializing the parameters of a model (see also torch.nn.init).

>>> @torch.no_grad()
>>> def init_weights(m):
>>>     if type(m) == nn.Linear:
>>>         m.weight.fill_(1.0)
>>> net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
>>> net.apply(init_weights)
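
To make the recursion visible, here is a small runnable variant of the snippet above that also records which modules apply visits. The visited list is an addition for illustration only.

```python
import torch
import torch.nn as nn

visited = []  # illustrative bookkeeping, not part of the apply API

@torch.no_grad()
def init_weights(m):
    visited.append(type(m).__name__)
    if isinstance(m, nn.Linear):
        m.weight.fill_(1.0)

net = nn.Sequential(nn.Linear(2, 2), nn.Linear(2, 2))
net.apply(init_weights)

print(visited)        # ['Linear', 'Linear', 'Sequential']
print(net[0].weight)  # every entry is 1.0
```

The children are visited first and the container itself last, so fn must be prepared to receive container modules as well as leaf layers.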

Example

import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

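nn.Sequential: stacking and merging layers

Sequential is a container of modules that are stacked together and run one after another with a single call.

It implements its own forward function, and the modules inside it are executed strictly in the order they were added, so we must make sure that the output size of each module matches the input size of the next one.

Example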

class net5(nn.Module):
    def __init__(self):
        super(net5, self).__init__()
        self.block = nn.Sequential(nn.Conv2d(1,20,5),
                                    nn.ReLU(),
                                    nn.Conv2d(20,64,5),
                                    nn.ReLU())
    def forward(self, x):
        x = self.block(x)
        return x

net = net5()
print(net)
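
The size-matching requirement is easy to check by hand. The following sketch uses linear layers with made-up sizes: one chain where the sizes line up, and one where they do not, which fails at call time rather than at construction time.

```python
import torch
import torch.nn as nn

# Output size of each layer matches the input size of the next one.
good = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 5))
out = good(torch.zeros(3, 10))
print(out.shape)  # torch.Size([3, 5])

# The second Linear expects 30 features but receives 20.
bad = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(30, 5))
try:
    bad(torch.zeros(3, 10))
    mismatch_detected = False
except RuntimeError as err:
    mismatch_detected = True
    print("shape mismatch:", err)
```

Building the mismatched container succeeds; the error only appears once data flows through it, which is why a quick dummy forward pass is a useful sanity check.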

nn.ModuleList: when we need to iterate

CLASS torch.nn.ModuleList(modules=None). In the implementation, class ModuleList(Module) is a subclass of Module.

Holds submodules in a list. ModuleList can be indexed like a regular Python list, but modules it contains are properly registered, and will be visible by all Module methods.

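ModuleList lets you store Modules in a list. It is useful when you need to iterate over layers and store or use some intermediate information, as in U-Net. The main difference from Sequential is that ModuleList has no forward method, so the layers inside it are not connected to each other.

You can add any subclass of nn.Module (such as nn.Conv2d or nn.Linear) to an nn.ModuleList in the same way as with a built-in Python list, using operations such as extend and append. Unlike a plain list, however, modules added to an nn.ModuleList are automatically registered with the whole network, and their parameters are automatically added to the network as well (that is, net.parameters() will return them). On the other hand, nn.ModuleList does not define a network. It only stores the modules together, with no inherent execution order among them; the order of execution is decided by the forward function.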
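
The registration difference can be checked by counting parameters. This is a minimal sketch; the two class names are hypothetical and exist only for the comparison.

```python
import torch.nn as nn

class WithPlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # A plain Python list: the layers are NOT registered as submodules
        self.layers = [nn.Linear(10, 10) for _ in range(3)]

class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # ModuleList: every layer is registered, parameters included
        self.layers = nn.ModuleList([nn.Linear(10, 10) for _ in range(3)])

plain_count = len(list(WithPlainList().parameters()))
model = WithModuleList()
registered_count = len(list(model.parameters()))
print(plain_count, registered_count)  # 0 6

# append works like on a list, and the new layer is registered too
model.layers.append(nn.Linear(10, 10))
after_append = len(list(model.parameters()))
print(after_append)  # 8
```

With the plain list, an optimizer built from parameters() would receive nothing to update, and the layers would also be skipped by methods such as to() and state_dict().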

Example 1

class MyModule(nn.Module):
    def __init__(self):
        super().__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(10)])

    def forward(self, x):
        # ModuleList can act as an iterable, or be indexed using ints
        for i, l in enumerate(self.linears):
            x = self.linears[i // 2](x) + l(x)
        return x

Example 2

def __init__(self, config):
    super().__init__(config)

    self.layers = torch.nn.ModuleList()
    for i in range(1, len(self.hidden_dimensions)):
        self.layers.append(
            torch.nn.Sequential(
                torch.nn.Linear(self.hidden_dimensions[i-1], self.hidden_dimensions[i]),
                torch.nn.ReLU(),
                torch.nn.BatchNorm1d(self.hidden_dimensions[i]),
                torch.nn.Dropout(p=0.5)
            ))

    self.layers.apply(self._init_weight)
    
def _init_weight(self, m):
    if isinstance(m, torch.nn.Linear):
        torch.nn.init.normal_(m.weight, std=0.1)

Example 3

import torch
from torch import nn
class MLP(nn.Module):
    def __init__(self, m, n, ln=2):
        super(MLP, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(m, n) for _ in range(ln)])

    def forward(self, x):
        for module in self.linears:
            x = module(x)
        return x
x = torch.tensor([[1, 2], [3, 4], [5, 6]]).float()
mlp1 = MLP(2, 2)
y = mlp1(x)
print(y)

Example combining ModuleList and Sequential

import torch
class Classifier(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = torch.nn.ModuleList()
        for i in range(3):
            self.layers.append(
                torch.nn.Sequential(
                    torch.nn.Linear(1024, 128),
                    torch.nn.ReLU(),
                    torch.nn.BatchNorm1d(128),
                    torch.nn.Dropout(p=0.2)
                ))

    def forward(self, batch):
        for layer in self.layers:
            layer_activation = layer(batch)
        # ...

nn.ModuleDict: when we need to choose

ModuleDict creates a dictionary of Modules, so you can switch between them dynamically when needed.

Example 1

activations = nn.ModuleDict([
    ['relu', nn.ReLU()],
    ['hardtanh', nn.Hardtanh()],
    ['relu6', nn.ReLU6()],
    ['sigmoid', nn.Sigmoid()],
    ['tanh', nn.Tanh()],
    ['softmax', nn.Softmax()],
    ['softmax2d', nn.Softmax2d()],
    ['logsoftmax', nn.LogSoftmax()],
    ['elu', nn.ELU()],
    ['selu', nn.SELU()],
    ['celu', nn.CELU()],
    ['hardshrink', nn.Hardshrink()],
    ['leakyrelu', nn.LeakyReLU()],
    ['logsigmoid', nn.LogSigmoid()],
    ['softplus', nn.Softplus()],
    ['softshrink', nn.Softshrink()],
    ['prelu', nn.PReLU()],
    ['softsign', nn.Softsign()],
    ['softmin', nn.Softmin()],
    ['tanhshrink', nn.Tanhshrink()],
    ['rrelu', nn.RReLU()],
    ['glu', nn.GLU()],
])

activation1 = activations['relu']
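
One way to use such a dictionary is to pick the activation by name when building a block. The sketch below assumes a hypothetical ConvBlock class and a much shorter dictionary than the one above.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    # ConvBlock and its arguments are illustrative, not from the original article
    def __init__(self, in_channels, out_channels, activation="relu"):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1)
        self.activations = nn.ModuleDict({
            "relu": nn.ReLU(),
            "leakyrelu": nn.LeakyReLU(),
            "tanh": nn.Tanh(),
        })
        self.activation = activation  # key used to look up the module in forward

    def forward(self, x):
        return self.activations[self.activation](self.conv(x))

x = torch.randn(1, 1, 8, 8)
block = ConvBlock(1, 4, activation="tanh")
tanh_out = block(x)
print(tanh_out.shape)

block.activation = "relu"  # switch at run time by changing the key
relu_out = block(x)
```

Because the dictionary is a registered submodule, activations with learnable parameters (such as nn.PReLU) would also show up in parameters().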

Example 2

    embedding_dict = nn.ModuleDict(
        {key: nn.Embedding(vocab_size, ope_config.embed_size if not linear else 1, sparse=sparse)
         for key, vocab_size, w2i, i2w in lang}
    )

Example 3

loss = nn.ModuleDict([
    ['l1', nn.L1Loss()],
    ['nll', nn.NLLLoss()],
    ['kldiv', nn.KLDivLoss()],
    ['mse', nn.MSELoss()],
    ['bce', nn.BCELoss()],
    ['bce_with_logits', nn.BCEWithLogitsLoss()],
    ['cosine_embedding', nn.CosineEmbeddingLoss()],
    ['ctc', nn.CTCLoss()],
    ['hinge_embedding', nn.HingeEmbeddingLoss()],
    ['margin_ranking', nn.MarginRankingLoss()],
    ['multi_label_margin', nn.MultiLabelMarginLoss()],
    ['multi_label_soft_margin', nn.MultiLabelSoftMarginLoss()],
    ['multi_margin', nn.MultiMarginLoss()],
    ['smooth_l1', nn.SmoothL1Loss()],
    ['soft_margin', nn.SoftMarginLoss()],
    ['cross_entropy', nn.CrossEntropyLoss()],
    ['triplet_margin', nn.TripletMarginLoss()],
    ['poisson_nll', nn.PoissonNLLLoss()]
])