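Today I was working through the multilayer perceptron section of Mu Li's Dive into Deep Learning (动手学深度学习) when my code failed with the following error: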
Traceback (most recent call last):
  File "D:\zmm\pycharm project\pythonProject\study1\gzj1.py", line 28, in <module>
    d2l.train_ch3(net,train_iter,test_iter,loss,num_epochs,updater)
  File "D:\zmm\pycharm project\pythonProject\study1\d2l\torch.py", line 335, in train_ch3
    train_metrics = train_epoch_ch3(net, train_iter, loss, updater)
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\zmm\pycharm project\pythonProject\study1\d2l\torch.py", line 273, in train_epoch_ch3
    l = loss(y_hat, y)
        ^^^^^^^^^^^^^^
  File "D:\Environment\python\Lib\site-packages\torch\nn\modules\loss.py", line 1183, in __init__
    super().__init__(weight, size_average, reduce, reduction)
  File "D:\Environment\python\Lib\site-packages\torch\nn\modules\loss.py", line 30, in __init__
    super().__init__(size_average, reduce, reduction)
  File "D:\Environment\python\Lib\site-packages\torch\nn\modules\loss.py", line 23, in __init__
    self.reduction: str = _Reduction.legacy_get_string(size_average, reduce)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\Environment\python\Lib\site-packages\torch\nn\_reduction.py", line 35, in legacy_get_string
    if size_average and reduce:
       ^^^^^^^^^^^^
RuntimeError: Boolean value of Tensor with more than one value is ambiguous
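The last frames of this traceback can be reproduced outside of d2l. The sketch below assumes (the failing script is not shown here) that loss was bound to the class nn.CrossEntropyLoss rather than to an instance of it; y_hat and y are small made-up tensors used only for illustration.

```python
import torch
from torch import nn

# Made-up stand-ins for one mini-batch: 4 samples, 10 classes
y_hat = torch.randn(4, 10)
y = torch.tensor([0, 3, 5, 9])

# Assumed mistake: the class itself, without parentheses
loss_cls = nn.CrossEntropyLoss

try:
    # This *constructs* a loss object: y_hat is taken as `weight`
    # and y as `size_average`, so no loss is computed at all
    loss_cls(y_hat, y)
    error = None
except RuntimeError as e:
    error = e
print(type(error).__name__, error)

# With an instance, the same call computes the loss
loss_fn = nn.CrossEntropyLoss()
l = loss_fn(y_hat, y)
print(l.shape)  # scalar tensor, because reduction defaults to "mean"
```

If the assumption holds, the fix for the exception itself is simply to create the loss object before passing it to d2l.train_ch3.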
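The traceback itself points to the direct cause: the call loss(y_hat, y) runs CrossEntropyLoss.__init__ (loss.py, line 1183), so loss was bound to the class nn.CrossEntropyLoss rather than to an instance, most likely because the parentheses were left out (loss=nn.CrossEntropyLoss). y_hat and y are then taken as the constructor arguments weight and size_average, and the check "if size_average and reduce" cannot turn a multi-element tensor into a single boolean. Two further points matter for the corrected code:
First, the initial loss was too large. w should not be initialized from a standard normal distribution (its variance is too large); use a normal distribution with mean 0 and standard deviation 0.01 instead, which brings the loss down.
Second, nn.CrossEntropyLoss() has to be instantiated, and its reduction parameter matters. The default, "mean", averages the loss over all samples and returns a single value; "none" keeps one loss value per sample. d2l.train_ch3 plots the training curve from the accumulated per-sample losses, so reduction should be set to "none".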
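The difference between the two reduction settings is easy to check on a small batch. This is a minimal sketch with made-up logits and labels (not Fashion-MNIST data); it only illustrates the shapes that come back.

```python
import torch
from torch import nn

torch.manual_seed(0)
logits = torch.randn(4, 10)          # made-up scores: 4 samples, 10 classes
labels = torch.tensor([1, 0, 7, 3])  # made-up class indices

per_sample = nn.CrossEntropyLoss(reduction='none')(logits, labels)
averaged = nn.CrossEntropyLoss()(logits, labels)  # reduction='mean' by default

print(per_sample.shape)  # one loss value per sample
print(averaged.shape)    # a single scalar

# Averaging the per-sample losses gives the same number as reduction='mean'
print(torch.allclose(per_sample.mean(), averaged))
```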
The corrected code is below:
import torch
from torch import nn
from d2l import torch as d2l

batch_size = 256
train_iter, test_iter = d2l.load_data_fashion_mnist(batch_size)

# 784 input pixels, 10 classes, one hidden layer with 256 units
num_inputs, num_outputs, num_hiddens = 784, 10, 256

# Weights drawn from a normal distribution with mean 0 and std 0.01; biases start at zero
w1 = nn.Parameter(torch.normal(mean=0, std=0.01, size=(num_inputs, num_hiddens), requires_grad=True))
b1 = nn.Parameter(torch.zeros(num_hiddens, requires_grad=True))
w2 = nn.Parameter(torch.normal(mean=0, std=0.01, size=(num_hiddens, num_outputs), requires_grad=True))
b2 = nn.Parameter(torch.zeros(num_outputs, requires_grad=True))
params = [w1, b1, w2, b2]

def relu(x):
    a = torch.zeros_like(x)
    return torch.max(x, a)

def net(x):
    x = x.reshape((-1, num_inputs))  # flatten each image into a vector
    h = relu(x @ w1 + b1)            # hidden layer
    return h @ w2 + b2               # output logits

# Instantiate the loss (note the parentheses) and keep one loss value per sample
loss = nn.CrossEntropyLoss(reduction='none')

num_epochs, lr = 10, 0.1
updater = torch.optim.SGD(params, lr=lr)
d2l.train_ch3(net, train_iter, test_iter, loss, num_epochs, updater)
d2l.plt.show()
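The effect of the initialization scale on the starting loss can be checked without training anything. The sketch below is only an illustration under stated assumptions: it uses made-up random inputs and labels instead of Fashion-MNIST, and initial_loss is a hypothetical helper written for this comparison. With 10 classes, an untrained network whose logits are close to zero should give a loss near log(10).

```python
import math
import torch
from torch import nn

torch.manual_seed(0)
num_inputs, num_outputs, num_hiddens = 784, 10, 256

# Made-up data: pixel-like values in [0, 1) and random labels
X = torch.rand(256, num_inputs)
y = torch.randint(0, num_outputs, (256,))

def initial_loss(std):
    """Mean cross-entropy of an untrained MLP with weights drawn from N(0, std^2)."""
    w1 = torch.normal(mean=0.0, std=std, size=(num_inputs, num_hiddens))
    b1 = torch.zeros(num_hiddens)
    w2 = torch.normal(mean=0.0, std=std, size=(num_hiddens, num_outputs))
    b2 = torch.zeros(num_outputs)
    h = torch.relu(X @ w1 + b1)   # hidden layer
    logits = h @ w2 + b2          # output scores
    return nn.functional.cross_entropy(logits, y).item()

loss_small = initial_loss(0.01)
loss_large = initial_loss(1.0)
print(f"std=0.01: {loss_small:.3f}  (log(10) = {math.log(10):.3f})")
print(f"std=1.0 : {loss_large:.3f}")
```

The standard-normal weights produce very large logits, so the starting loss is far above log(10), which is the situation the smaller standard deviation avoids.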
Running the corrected script produces the following figure:
I'm a deep learning beginner, so if anything here is wrong, corrections are very welcome!