Torch nn Notes: Simple Layers

License: CC 4.0 BY-SA

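These notes go through the Simple Layers of Torch's nn library. They start with the Linear layer, which applies the linear transformation y = Ax + b, that is, a linear combination of the input features plus a bias.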

1. Linear: y = Ax + b

module = nn.Linear(inputDimension, outputDimension, [bias = true])

 module = nn.Linear(10, 5)  -- 10 inputs, 5 outputs

 print(module.weight)      -- A, a 5x10 matrix
 print(module.bias)        -- b, a vector of size 5

 print(module.gradWeight)  -- gradient with respect to A
 print(module.gradBias)    -- gradient with respect to b
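
The relation y = Ax + b can be checked directly against the module's parameters. The following is a minimal sketch, assuming Torch7 with the nn package installed; the variable names are illustrative only.

```lua
require 'nn'

local module = nn.Linear(10, 5)   -- A is 5x10, b has 5 entries
local x = torch.rand(10)          -- one input vector

local y = module:forward(x)
-- Recompute y = Ax + b by hand from the module's parameters.
local expected = torch.mv(module.weight, x) + module.bias

print((y - expected):abs():max()) -- close to 0, up to rounding
```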

2. SparseLinear: y = Ax + b

module = nn.SparseLinear(10000, 2)  -- 10000 inputs, 2 outputs

x = torch.Tensor({ {1, 0.1}, {2, 0.3}, {10, 0.3}, {31, 0.2} })  -- each row is {index, value}; indices must not exceed 10000

3. Bilinear: for all k, y_k = x_1^T A_k x_2 + b_k

module = nn.Bilinear(inputDimension1, inputDimension2, outputDimension, [bias = true])
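
As a hedged illustration of the formula, the sketch below feeds a small batch through nn.Bilinear and recomputes one output entry by hand. It assumes that the module stores its weight as an outputDimension x inputDimension1 x inputDimension2 tensor and that the inputs are given as a table of two 2D (batch) tensors; the names x1, x2, n and k are chosen for the example only.

```lua
require 'nn'

local module = nn.Bilinear(3, 4, 2)  -- x_1 has 3 features, x_2 has 4, y has 2
local x1 = torch.rand(5, 3)          -- batch of 5 samples
local x2 = torch.rand(5, 4)

local y = module:forward({x1, x2})   -- the input is a table {x_1, x_2}
print(y:size())                      -- 5x2

-- Check one entry by hand: y[n][k] = x1[n]^T * A_k * x2[n] + b_k
local n, k = 1, 2
local byHand = torch.dot(x1[n], torch.mv(module.weight[k], x2[n])) + module.bias[k]
print(math.abs(y[n][k] - byHand))    -- close to 0, up to rounding
```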

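4. Dropout: during training, the forward and backward passes mask the same positions, and the surviving elements are scaled by 1/(1 - p).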

module = nn.Dropout(p)

> module = nn.Dropout()  -- p defaults to 0.5
> x = torch.Tensor{{1, 2, 3, 4}, {5, 6, 7, 8}}
> module:forward(x)
  0   4   0   0
 10  12   0  16
[torch.DoubleTensor of dimension 2x4]

> module:backward(x, x:clone():fill(1))
 0  2  0  0
 2  2  0  2
[torch.DoubleTensor of dimension 2x4]
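
The transcript shows one random draw. The sketch below checks the same behaviour without depending on which elements happen to be dropped; it assumes the default (scaled) variant of nn.Dropout and uses illustrative variable names.

```lua
require 'nn'

local p = 0.5
local module = nn.Dropout(p)
local x = torch.Tensor{{1, 2, 3, 4}, {5, 6, 7, 8}}

-- Training mode (the default): each element is dropped with probability p,
-- and the survivors are multiplied by 1 / (1 - p).
local y = module:forward(x):clone()
local scaled = x / (1 - p)
-- y * (y - scaled) is zero exactly when every entry is 0 or the scaled input.
print(torch.cmul(y, y - scaled):abs():max())

-- backward reuses the mask drawn in forward.
local grad = module:backward(x, x:clone():fill(1)):clone()

-- Evaluation mode: the input passes through unchanged.
module:evaluate()
local z = module:forward(x)
print((z - x):abs():max())
```

Both printed values should be zero: the first because every output is either dropped or scaled, the second because evaluation mode disables the mask.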

5. SpatialDropout

module = nn.SpatialDropout(p)

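In the early convolutional layers, adjacent pixels within the same feature map are strongly correlated, so i.i.d. dropout does not regularize well. SpatialDropout instead drops or keeps an entire feature map at once. The module assumes that the two rightmost dimensions are spatial.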
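
A small sketch of this behaviour, assuming a 4D input laid out as batch x features x height x width: with an all-ones input, every feature map of the output should be constant, because its pixels are dropped or kept together.

```lua
require 'nn'

local module = nn.SpatialDropout(0.5)
-- One sample with 8 feature maps of size 3x3.
local x = torch.ones(1, 8, 3, 3)
local y = module:forward(x)

-- Within every feature map all pixels share the same fate,
-- so the minimum and maximum of each map coincide.
for c = 1, 8 do
   print(c, y[1][c]:min(), y[1][c]:max())
end
```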

6. Abs, Add, CAdd, Mul, CMul, Max, Min, Mean, Sum

module = nn.Mean(dimension, nInputDim)

module = nn.Sum(dimension, nInputDim, sizeAverage)
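
For instance, the reductions can be tried on a 2x3 tensor. This is a minimal sketch; the variable names are illustrative.

```lua
require 'nn'

local x = torch.Tensor{{1, 2, 3}, {4, 5, 6}}

local colMean = nn.Mean(1):forward(x)  -- average over dimension 1 (the rows)
local rowSum  = nn.Sum(2):forward(x)   -- sum over dimension 2 (the columns)

print(colMean)  -- 2.5 3.5 4.5
print(rowSum)   -- 6 15
```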

7. Euclidean, WeightedEuclidean, Cosine

module = nn.Euclidean(inputSize,outputSize)

module = nn.WeightedEuclidean(inputSize,outputSize)

module = nn.Cosine(inputSize,outputSize)
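
For reference, these three modules compare the input x with outputSize learned vectors. Writing w_j for the j-th learned vector and c_j for the per-feature scaling vector of WeightedEuclidean (the symbols are labels chosen here, following the formulas in the Torch nn documentation), the outputs are:

```latex
\begin{aligned}
\text{Euclidean:}         \quad & y_j = \lVert w_j - x \rVert_2 \\
\text{WeightedEuclidean:} \quad & y_j = \lVert c_j \odot (w_j - x) \rVert_2 \\
\text{Cosine:}            \quad & y_j = \frac{w_j \cdot x}{\lVert w_j \rVert_2 \, \lVert x \rVert_2}
\end{aligned}
\qquad j = 1, \dots, \text{outputSize}
```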

module = nn.Identity()

Identity is often combined with ParallelTable, for example to pass the target through unchanged next to the prediction:

pred_mlp = nn.Sequential()  -- A network that makes predictions given x.
pred_mlp:add(nn.Linear(5, 4))
pred_mlp:add(nn.Linear(4, 3))

xy_mlp = nn.ParallelTable() -- A network for predictions and for keeping the
xy_mlp:add(pred_mlp)        -- true label for comparison with a criterion
xy_mlp:add(nn.Identity())   -- by forwarding both x and y through the network
                            -- (Identity passes y through unchanged, like a shortcut).

mlp = nn.Sequential()       -- The main network that takes both x and y.
mlp:add(xy_mlp)             -- It feeds x and y to parallel networks;
cr = nn.MSECriterion()
cr_wrap = nn.CriterionTable(cr)
mlp:add(cr_wrap)            -- and then applies the criterion.

for i = 1, 100 do           -- Do a few training iterations
   x = torch.ones(5)        -- Make input features.
   y = torch.Tensor(3)
   y:copy(x:narrow(1,1,3))  -- Make output label.
   err = mlp:forward{x,y}   -- Forward both input and output.
   print(err)               -- Print error from criterion.

   mlp:zeroGradParameters() -- Do backprop...
   mlp:backward({x, y})
   mlp:updateParameters(0.05)
end

module = nn.Copy(inputType, outputType, [forceCopy, dontCast])

module = nn.Narrow(dimension, offset, length)

> x = torch.rand(4, 5)

> x
 0.3695  0.2017  0.4485  0.4638  0.0513
 0.9222  0.1877  0.3388  0.6265  0.5659
 0.8785  0.7394  0.8265  0.9212  0.0129
 0.2290  0.7971  0.2113  0.1097  0.3166
[torch.DoubleTensor of size 4x5]

> nn.Narrow(1, 2, 3):forward(x)
 0.9222  0.1877  0.3388  0.6265  0.5659
 0.8785  0.7394  0.8265  0.9212  0.0129
 0.2290  0.7971  0.2113  0.1097  0.3166
[torch.DoubleTensor of size 3x5]

10. Replicate

> x = torch.linspace(1, 5, 5)
 1
 2
 3
 4
 5
[torch.DoubleTensor of dimension 5]

> m = nn.Replicate(3)
> o = m:forward(x)
 1  2  3  4  5
 1  2  3  4  5
 1  2  3  4  5
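
Replicate also accepts an optional second argument that selects the dimension along which the copies are stacked. A minimal sketch, assuming the nn.Replicate(nFeature [, dim]) form of the constructor:

```lua
require 'nn'

local x = torch.linspace(1, 5, 5)           -- 1 2 3 4 5

local rows = nn.Replicate(3):forward(x)     -- copies stacked along a new dimension 1
local cols = nn.Replicate(3, 2):forward(x)  -- copies stacked along a new dimension 2

print(rows:size())  -- 3x5
print(cols:size())  -- 5x3
```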

11. Reshape

> print(x)

  1   2   3   4
  5   6   7   8
  9  10  11  12
 13  14  15  16
[torch.Tensor of dimension 4x4]

> print(nn.Reshape(2,8):forward(x))

  1   2   3   4   5   6   7   8
  9  10  11  12  13  14  15  16
[torch.Tensor of dimension 2x8]

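12. View: a size of -1 is inferred from the input; combined with setNumInputDims, the same module handles both single samples and mini-batches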

module = nn.View(sizes)

> input = torch.Tensor(2, 3)
> minibatch = torch.Tensor(5, 2, 3)
> m = nn.View(-1):setNumInputDims(2)
> print(#m:forward(input))

 6
[torch.LongStorage of size 1]

> print(#m:forward(minibatch))

 5
 6
[torch.LongStorage of size 2]
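
A typical use is flattening feature maps before a Linear layer. The sketch below assumes 2x3x4 feature maps and an arbitrary output size of 5; both numbers are made up for the example.

```lua
require 'nn'

-- Flatten 2x3x4 feature maps, then apply a linear layer.
local net = nn.Sequential()
net:add(nn.View(-1):setNumInputDims(3))
net:add(nn.Linear(2 * 3 * 4, 5))

local single = net:forward(torch.rand(2, 3, 4)):clone()     -- one sample
local batch  = net:forward(torch.rand(7, 2, 3, 4)):clone()  -- mini-batch of 7

print(single:size())  -- 5
print(batch:size())   -- 7x5
```

The outputs are cloned because a module reuses its output tensor on the next call to forward.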

13. Select: selects the slice at the given index along the given dimension

module = nn.Select(dim, index)
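
For example, on a 2x3 tensor (a minimal sketch with illustrative names):

```lua
require 'nn'

local x = torch.Tensor{{1, 2, 3}, {4, 5, 6}}

local row = nn.Select(1, 2):forward(x)  -- index 2 along dimension 1: 4 5 6
local col = nn.Select(2, 3):forward(x)  -- index 3 along dimension 2: 3 6

print(row)
print(col)
```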

14. Squeeze: if dim is given, only that dimension is removed (provided its size is 1); otherwise every dimension of size 1 is removed.

module = nn.Squeeze([dim, numInputDims])

> x = torch.rand(2, 1, 2, 1, 2)
> x
(1,1,1,.,.) =
  0.6020  0.8897

(2,1,1,.,.) =
  0.4713  0.2645

(1,1,2,.,.) =
  0.4441  0.9792

(2,1,2,.,.) =
  0.5467  0.8648

> torch.squeeze(x,2)
(1,1,.,.) =
  0.6020  0.8897

(2,1,.,.) =
  0.4713  0.2645

(1,2,.,.) =
  0.4441  0.9792

(2,2,.,.) =
  0.5467  0.8648
[torch.DoubleTensor of dimension 2x2x1x2]

There is also:

module = nn.Unsqueeze(pos [, numInputDims])
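
The two modules are inverses of each other in terms of shape. A short sketch that only inspects the resulting sizes (variable names are illustrative):

```lua
require 'nn'

local x = torch.rand(2, 1, 2, 1, 2)

local squeezedAll = nn.Squeeze():forward(x)   -- drops every size-1 dimension
local squeezedOne = nn.Squeeze(2):forward(x)  -- drops only dimension 2
print(squeezedAll:size())  -- 2x2x2
print(squeezedOne:size())  -- 2x2x1x2

local y = torch.rand(2, 3)
local front = nn.Unsqueeze(1):forward(y)  -- new size-1 dimension in front
local back  = nn.Unsqueeze(3):forward(y)  -- new size-1 dimension at the end
print(front:size())  -- 1x2x3
print(back:size())   -- 2x3x1
```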

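15. Transpose, Clamp, Normalize, MM, BatchNormalization, PixelShuffle, Padding, L1Penalty, GradientReversal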

module = nn.Transpose({dim1, dim2} [, {dim3, dim4}, ...])

module = nn.Clamp(min_value, max_value)

module = nn.Normalize(p, [eps])

module = nn.MM(transA, transB)

module = nn.BatchNormalization(N [, eps] [, momentum] [,affine])

module = nn.PixelShuffle(r)  -- used for super-resolution

module = nn.Padding(dim, pad [, nInputDim, value, index])

penalty = nn.L1Penalty(L1weight, sizeAverage)

L1Penalty is typically used in autoencoders, for example:

encoder = nn.Sequential()
encoder:add(nn.Linear(3, 128))
encoder:add(nn.Threshold())
decoder = nn.Linear(128, 3)

autoencoder = nn.Sequential()
autoencoder:add(encoder)
autoencoder:add(nn.L1Penalty(l1weight))
autoencoder:add(decoder)

criterion = nn.MSECriterion()  -- To measure reconstruction error

module = nn.GradientReversal([lambda = 1])
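
Several of the modules in this section are easy to try on small tensors. The sketch below covers Transpose, Clamp, Normalize and MM only; the inputs are made-up values and the variable names are illustrative.

```lua
require 'nn'

-- Transpose swaps the listed pairs of dimensions.
local t = nn.Transpose({1, 2}):forward(torch.rand(2, 3))
print(t:size())                       -- 3x2

-- Clamp limits every element to [min_value, max_value].
local c = nn.Clamp(-1, 1):forward(torch.Tensor{-3, 0.5, 2})
print(c)                              -- -1 0.5 1

-- Normalize rescales the input to unit L_p norm (here p = 2).
local u = nn.Normalize(2):forward(torch.Tensor{3, 4})
print(u:norm())                       -- close to 1

-- MM multiplies two matrices passed as a table; the flags transpose A or B.
local A, B = torch.rand(2, 3), torch.rand(4, 3)
local m = nn.MM(false, true):forward({A, B})   -- A * B^T
print(m:size())                       -- 2x4
```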

16. GPU

gpu = nn.GPU(module, device, [outdevice])
require 'cunn'
gpu:cuda()

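For example, when a model does not fit in the memory of a single GPU, its layers can be spread across several devices: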

mlp = nn.Sequential()
   :add(nn.GPU(nn.Linear(10000,10000), 1))
   :add(nn.GPU(nn.Linear(10000,10000), 2))
   :add(nn.GPU(nn.Linear(10000,10000), 3))
   :add(nn.GPU(nn.Linear(10000,10000), 4, cutorch.getDevice()))