Computation in PyTorch

License: CC 4.0 BY-SA

Original source: https://d2l.ai/

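This post covers computation in PyTorch: parameter access, parameter initialization, parameter sharing, and custom networks. Networks are built with nn.Module and Sequential, and the post shows how to iterate over and initialize all parameters. It also explains how gradients are computed when parameters are shared, and gives an example of a custom layer, highlighting the role of the forward function.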

Some notes on using Tensor and Parameter in PyTorch.

Parameter

Parameter Access

Each layer stores its parameters as attributes, e.g. weight and bias, and each of these attributes is an instance of the Parameter class.

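import torch
from torch import nn

# Example network (an assumption; the original notes do not define net)
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

# Print all parameters together with their attribute names
print(net.state_dict())

net[2].bias  # an instance of the Parameter class
net[2].bias.data = torch.randn(1)  # overwrite the values; the shape must match the bias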

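Note: to change a parameter's values directly, assign through .data. More generally, whenever a tensor tracked by autograd has to be overwritten without triggering an in-place error, assign through .data (or do the assignment inside torch.no_grad()).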
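A minimal sketch of the note above, assuming a small hypothetical `nn.Linear(2, 2)` layer named `layer` that is not part of the original notes. It illustrates that an in-place operation applied directly to a parameter that requires gradients raises an error, while going through `.data` or `torch.no_grad()` does not.

```python
import torch
from torch import nn

layer = nn.Linear(2, 2)  # hypothetical layer, for illustration only

# An in-place op directly on a leaf tensor that requires grad raises a RuntimeError
try:
    layer.weight.fill_(1.0)
    inplace_failed = False
except RuntimeError:
    inplace_failed = True

# Going through .data bypasses autograd tracking, so the write succeeds
layer.weight.data.fill_(1.0)

# torch.no_grad() is an alternative way to perform the same kind of update
with torch.no_grad():
    layer.bias.zero_()

print(inplace_failed)
print(layer.weight.data)
print(layer.bias.data)
```

Both routes leave `requires_grad` untouched, so the parameters can still be trained afterwards.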

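All Parameters at Once

nn.Module provides the .named_parameters() method, which returns an iterator over all parameters. It is useful for tasks such as initializing parameters or setting whether they require gradients.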

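for name, param in net.named_parameters():
    # name is the attribute path of the parameter, e.g. '2.bias'
    # operate on the parameter here, e.g. param.data = ... or param.requires_grad = False
    print(name, param.shape)

# A name can also be used to access a parameter directly
net.state_dict()[name].data  # e.g. name = '2.bias'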
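As one possible use of this iterator, the sketch below freezes every bias while leaving the weights trainable. The three-layer `net` is an assumed example network, not one defined in the original notes.

```python
import torch
from torch import nn

# Assumed example network, for illustration only
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

for name, param in net.named_parameters():
    # Freeze all biases; weights keep requires_grad=True
    if name.endswith("bias"):
        param.requires_grad = False

names = [name for name, _ in net.named_parameters()]
trainable = [name for name, p in net.named_parameters() if p.requires_grad]
print(names)      # ['0.weight', '0.bias', '2.weight', '2.bias']
print(trainable)  # ['0.weight', '2.weight']
```

The names are built from the position of each layer in the Sequential container plus the attribute name, which is why the ReLU at index 1 contributes no entries.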

Parameter Initialization

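def init_normal(m):
    """
    Check the type of the module m and initialize it accordingly.
    """
    if type(m) == nn.Linear:
        nn.init.normal_(m.weight, std=0.01)

def my_init(m):
    if isinstance(m, nn.Linear):
        nn.init.uniform_(m.weight, -10, 10)
        # Zero out entries with |w| < 5; done through .data to stay outside autograd
        m.weight.data *= m.weight.data.abs() >= 5

# apply() calls the function on every submodule of net and on net itself
net.apply(init_normal)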
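The sketch below illustrates how `apply` walks the module tree. It uses a hypothetical `init_constant` function and an assumed nested network, and uses constant initialization only so that the result is easy to check.

```python
import torch
from torch import nn

visited = []  # records the order in which apply() visits modules

def init_constant(m):
    # Hypothetical initializer: weights to 1, biases to 0, for Linear layers only
    visited.append(type(m).__name__)
    if isinstance(m, nn.Linear):
        nn.init.constant_(m.weight, 1.0)
        nn.init.zeros_(m.bias)

# Assumed example network with one nested Sequential block
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Sequential(nn.Linear(8, 1)))
net.apply(init_constant)

print(visited)  # ['Linear', 'ReLU', 'Linear', 'Sequential', 'Sequential']
print(net[0].weight.data[0], net[2][0].bias.data)
```

Children are visited before their parent, so the function sees every nested layer without any manual recursion. To initialize only part of a network, `apply` can also be called on a single submodule such as `net[0]`.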

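Parameter Sharing

Some modules in a network can share the same parameters. Because PyTorch builds a computation graph, sharing simply means that the same module object is used several times in that graph.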

shared = nn.Linear(8, 8)
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(),
                    shared, nn.ReLU(),
                    shared, nn.ReLU())

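When gradients are computed for the second and third linear layers, only one gradient tensor exists in memory, and it holds the sum of both contributions, because PyTorch accumulates gradients by addition.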
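A small sketch of both claims, under assumed toy values. The first part checks that the two positions in the network hold the very same Parameter object. The second part uses a hypothetical one-weight layer `lin` applied twice, so that the summed gradient can be verified by hand.

```python
import torch
from torch import nn

shared = nn.Linear(8, 8)
net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(),
                    shared, nn.ReLU(),
                    shared, nn.ReLU(),
                    nn.Linear(8, 1))

# Both positions refer to the same module, hence the same Parameter object
same_object = net[2].weight is net[4].weight
net[2].weight.data[0, 0] = 100.0
mirrored = net[4].weight.data[0, 0].item()

# parameters() yields the shared weight and bias only once
num_params = len(list(net.parameters()))
print(same_object, mirrored, num_params)  # True 100.0 6

# Gradient accumulation: y = w * (w * x), so dy/dw = w*x + w*x = 2*w*x
lin = nn.Linear(1, 1, bias=False)  # hypothetical one-weight layer
lin.weight.data.fill_(2.0)
x = torch.tensor([[3.0]])
y = lin(lin(x))
y.backward()
print(lin.weight.grad)  # tensor([[12.]])
```

With w = 2 and x = 3, each of the two uses contributes w * x = 6, and the stored gradient is their sum, 12.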

Module

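Custom Networks

A custom layer supports access, initialization, saving, loading, and sharing of its parameters. For how parameters are created in practice, see the initialization functions (__init__) of the individual built-in modules.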

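class MyLinear(nn.Module):
    def __init__(self, in_units, out_units):
        super().__init__()
        # Wrapping tensors in nn.Parameter registers them with the module
        self.weight = nn.Parameter(torch.randn(in_units, out_units))
        self.bias = nn.Parameter(torch.randn(out_units,))

    def forward(self, x):
        # Use the parameters themselves rather than .data, so autograd can
        # propagate gradients back to weight and bias
        linear = torch.matmul(x, self.weight) + self.bias
        return linear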
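A usage sketch for a layer of this kind. The class is restated under the hypothetical name `MyLinearSketch` so the block is self-contained, and it uses the parameters directly in `forward` so that autograd tracks them. The sizes 5 and 3 are arbitrary assumptions.

```python
import torch
from torch import nn

class MyLinearSketch(nn.Module):
    """Hypothetical restatement of a custom linear layer, for illustration."""
    def __init__(self, in_units, out_units):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(in_units, out_units))
        self.bias = nn.Parameter(torch.randn(out_units))

    def forward(self, x):
        return torch.matmul(x, self.weight) + self.bias

layer = MyLinearSketch(5, 3)
out = layer(torch.rand(2, 5))  # calling the module invokes forward()
out.sum().backward()           # gradients reach weight and bias

param_names = [name for name, _ in layer.named_parameters()]
print(out.shape, param_names, layer.weight.grad.shape)

# Saving and loading: copy the parameters into a second instance via state_dict
clone = MyLinearSketch(5, 3)
clone.load_state_dict(layer.state_dict())
print(torch.equal(clone.weight, layer.weight))
```

Because the tensors are wrapped in `nn.Parameter`, they appear in `named_parameters()` and `state_dict()` without any extra registration code.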

nn.Module

Sequential

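Sequential inherits from nn.Module.

The add_module method inherited from nn.Module therefore appends a module at the end of the container, registered as an attribute under the given name, and the modules are executed in the order in which they were added.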

from collections import OrderedDict
from torch import nn

# Example of using Sequential
model = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU()
)

# Example of using Sequential with OrderedDict, which gives each layer a name
model = nn.Sequential(OrderedDict([
    ('conv1', nn.Conv2d(1, 20, 5)),
    ('relu1', nn.ReLU()),
    ('conv2', nn.Conv2d(20, 64, 5)),
    ('relu2', nn.ReLU())
]))
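The behaviour of add_module on a Sequential container can be sketched as follows. The 28x28 single-channel input is an assumption chosen only to make the output shape easy to follow.

```python
import torch
from torch import nn

model = nn.Sequential()
model.add_module("conv1", nn.Conv2d(1, 20, 5))
model.add_module("relu1", nn.ReLU())
model.add_module("conv2", nn.Conv2d(20, 64, 5))

# A module is reachable by its name (as an attribute) or by its position
same = model.conv1 is model[0]
child_names = [name for name, _ in model.named_children()]

# forward() runs the modules in insertion order
x = torch.rand(1, 1, 28, 28)  # assumed input: one 28x28 grayscale image
out = model(x)
print(same, child_names, out.shape)
```

Each 5x5 convolution without padding shrinks the spatial size by 4, so the output has shape (1, 64, 20, 20).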