1 Tensors
A tensor (Tensor) is the basic computational unit in PyTorch. Like NumPy's ndarray, it represents a multi-dimensional array. The biggest difference from ndarray is that a PyTorch Tensor can run on the GPU, while a NumPy ndarray can only run on the CPU; running on the GPU greatly speeds up computation.
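For example, a minimal sketch of moving a tensor onto the GPU (this assumes a CUDA-capable GPU and a CUDA build of PyTorch; otherwise it falls back to the CPU):
import torch

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
x = torch.ones(2, 3)   # created on the CPU by default
y = x.to(device)       # copied to the GPU when one is available
print(y.device)        # prints cuda:0 on a GPU machine, cpu otherwise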
1.1 Data types
Torch defines eight CPU tensor types and eight GPU tensor types:
Data type | dtype | CPU tensor | GPU tensor |
---|---|---|---|
32-bit floating point | torch.float32/float | torch.FloatTensor | torch.cuda.FloatTensor |
64-bit floating point | torch.float64/double | torch.DoubleTensor | torch.cuda.DoubleTensor |
16-bit floating point | torch.float16/half | torch.HalfTensor | torch.cuda.HalfTensor |
8-bit integer (unsigned) | torch.uint8 | torch.ByteTensor | torch.cuda.ByteTensor |
8-bit integer (signed) | torch.int8 | torch.CharTensor | torch.cuda.CharTensor |
16-bit integer (signed) | torch.int16 | torch.ShortTensor | torch.cuda.ShortTensor |
32-bit integer (signed) | torch.int32 | torch.IntTensor | torch.cuda.IntTensor |
64-bit integer (signed) | torch.int64 | torch.LongTensor | torch.cuda.LongTensor |
torch.Tensor is an alias for the default tensor type (torch.FloatTensor).
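A quick check of the default type:
>>> torch.Tensor([1, 2, 3]).dtype
torch.float32
>>> torch.tensor([1, 2, 3], dtype = torch.float64).dtype
torch.float64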
1.2 Creating tensors
1. torch.tensor(data, dtype = None, device = None, requires_grad = False)
This function is the constructor for tensors; it can build a tensor from a Python list or an array:
import torch
import numpy as np

a = torch.tensor([[1., -1.], [1., -1.]]) # from list
b = torch.tensor(np.array([[1, 2, 3], [4, 5, 6]])) # from np.array
print(a, b, sep = '\n')
tensor([[ 1., -1.],
[ 1., -1.]])
tensor([[1, 2, 3],
[4, 5, 6]], dtype=torch.int32)
2. torch.as_tensor(data, dtype = None, device = None)
Converts data into a torch.Tensor. If data is already a Tensor with the requested dtype and device, no copy is made and the result shares memory with data (like a shallow copy); otherwise a copy of the data is returned:
>>> a = np.array([1, 2, 3])
>>> t = torch.as_tensor(a)
>>> t
tensor([1, 2, 3], dtype=torch.int32)
>>> t[0] = -1
>>> a
array([-1, 2, 3])
>>> a = np.array([1, 2, 3])
>>> t = torch.as_tensor(a, device = torch.device('cuda'))
>>> t[0] = -1
>>> a
array([1, 2, 3])
3. torch.from_numpy(ndarray)
Creates a Tensor from a numpy.ndarray. The two share memory, so modifications to the Tensor are reflected in the ndarray, and the returned Tensor cannot be resized:
>>> a = np.array([1, 2, 3])
>>> t = torch.from_numpy(a)
>>> t
tensor([1, 2, 3], dtype=torch.int32)
>>> t[0] = -1
>>> a
array([-1, 2, 3])
4. torch.zeros(*size, out = None, dtype = None, layout = torch.strided, device = None, requires_grad = False)
Returns a tensor filled with the scalar 0, with the shape given by size. Unlike numpy.zeros, size does not have to be a tuple:
>>> a = torch.zeros(2, 3)
>>> a
tensor([[0., 0., 0.],
[0., 0., 0.]])
>>> b = np.zeros((2, 3))
>>> b
array([[0., 0., 0.],
[0., 0., 0.]])
Note: torch.ones(), torch.empty() and torch.full() are used in the same way.
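For example (the exact output formatting may differ slightly between versions):
>>> torch.ones(2, 3)
tensor([[1., 1., 1.],
        [1., 1., 1.]])
>>> torch.full((2, 3), 7.)
tensor([[7., 7., 7.],
        [7., 7., 7.]])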
5. torch.zeros_like(input, dtype = None, layout = None, device = None, requires_grad = False)
Returns a tensor filled with the scalar 0, with the same shape as input:
>>> input = torch.empty(2, 3)
>>> torch.zeros_like(input)
tensor([[0., 0., 0.],
[0., 0., 0.]])
Note: torch.ones_like(), torch.empty_like() and torch.full_like() are used in the same way.
6. torch.arange(start = 0, end, step = 1, out = None, dtype = None, layout = torch.strided, device = None, requires_grad = False)
Returns a 1-D tensor of length $\lceil \frac{end - start}{step} \rceil$, whose values are taken from the interval $[start, end)$ with common difference step:
>>> torch.arange(5)
tensor([0, 1, 2, 3, 4])
>>> torch.arange(1, 5)
tensor([1, 2, 3, 4])
>>> torch.arange(1, 2.5, 0.5)
tensor([1.0000, 1.5000, 2.0000])
7. torch.range(start = 0, end, step = 1, out = None, dtype = None, layout = torch.strided, device = None, requires_grad = False)
Returns a 1-D tensor of length $\lfloor \frac{end - start}{step} \rfloor + 1$, with values from start to end inclusive:
>>> torch.range(1, 5)
__main__:1: UserWarning: torch.range is deprecated in favor of torch.arange and will be removed in 0.5. Note that arange generates values in [start; end), not [start; end].
tensor([1., 2., 3., 4., 5.])
Note: torch.arange() is recommended over torch.range().
8. torch.linspace(start, end, steps = 100, out = None, dtype = None, layout = torch.strided, device = None, requires_grad = False)
Returns a 1-D tensor of steps points evenly spaced from start to end (the endpoint is included):
>>> a = torch.linspace(1, 100, 10)
>>> a
tensor([ 1., 12., 23., 34., 45., 56., 67., 78., 89., 100.])
9. torch.logspace(start, end, steps = 100, base = 10.0, out = None, dtype = None, layout = torch.strided, device = None, requires_grad = False)
Returns a 1-D tensor of points evenly spaced on a logarithmic scale with the given base:
>>> torch.logspace(start = -10, end = 10, steps = 5)
tensor([1.0000e-10, 1.0000e-05, 1.0000e+00, 1.0000e+05, 1.0000e+10])
10. torch.eye(n, m = None, out = None, dtype = None, layout = torch.strided, device = None, requires_grad = False)
Returns an identity matrix (ones on the diagonal, zeros elsewhere):
>>> torch.eye(2, 3)
tensor([[1., 0., 0.],
[0., 1., 0.]])
>>> torch.eye(3)
tensor([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])
1.3 Tensor operations
1. torch.cat(tensors, dim = 0, out = None)
Concatenates tensors along dimension dim; all other dimensions must have the same size:
>>> x = torch.randn(2, 3)
>>> torch.cat((x, x, x), 0)
tensor([[ 1.0102, -0.4265, 0.1250],
[-0.4309, -0.2024, -2.1630],
[ 1.0102, -0.4265, 0.1250],
[-0.4309, -0.2024, -2.1630],
[ 1.0102, -0.4265, 0.1250],
[-0.4309, -0.2024, -2.1630]])
>>> torch.cat((x, x, x), 1)
tensor([[ 1.0102, -0.4265, 0.1250, 1.0102, -0.4265, 0.1250, 1.0102, -0.4265,
0.1250],
[-0.4309, -0.2024, -2.1630, -0.4309, -0.2024, -2.1630, -0.4309, -0.2024,
-2.1630]])
2. torch.masked_select(input, mask, out = None)
Selects elements of input according to the boolean mask:
>>> x = torch.randn(3, 4)
>>> x
tensor([[-0.3417, -0.1746, 1.6381, -0.5856],
[ 0.0387, 0.6801, -0.7249, -1.8611],
[-0.4696, 0.1779, 1.7757, 0.3188]])
>>> mask = x.ge(0.5)
>>> mask
tensor([[0, 0, 1, 0],
[0, 1, 0, 0],
[0, 0, 1, 0]], dtype=torch.uint8)
>>> torch.masked_select(x, mask)
tensor([1.6381, 0.6801, 1.7757])
3. torch.narrow(input, dimension, start, length)
Returns a narrowed slice of input along the given dimension, starting at start with the given length; the returned tensor shares memory with the source tensor:
>>> x = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> x
tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
>>> torch.narrow(x, 0, 0, 2)
tensor([[1, 2, 3],
[4, 5, 6]])
>>> torch.narrow(x, 1, 1, 2)
tensor([[2, 3],
[5, 6],
[8, 9]])
4. torch.nonzero(input, out = None)
Returns a tensor containing the indices of all non-zero elements of input:
>>> torch.nonzero(torch.tensor([1, 1, 1, 0, 1]))
tensor([[0],
[1],
[2],
[4]])
>>> torch.nonzero(torch.tensor([[0.6, 0.0, 0.0, 0.0],
... [0.0, 0.4, 0.0, 0.0],
... [0.0, 0.0, 1.2, 0.0],
... [0.0, 0.0, 0.0, -0.4]]))
tensor([[0, 0],
[1, 1],
[2, 2],
[3, 3]])
5. torch.reshape(input, shape)
Returns a tensor with the same data and number of elements as input but with the given shape:
>>> a = torch.arange(4)
>>> torch.reshape(a, (2, 2))
tensor([[0, 1],
[2, 3]])
>>> b = torch.tensor([[0, 1], [2, 3]])
>>> torch.reshape(b, (-1,))
tensor([0, 1, 2, 3])
6. torch.split(tensor, split_size_or_sections, dim = 0)
If split_size_or_sections is an integer, the tensor is split into chunks of that size (the last chunk may be smaller); if it is a list, the tensor is split into chunks with the sizes given by the list:
>>> a = torch.arange(10)
>>> torch.split(a, 3)
(tensor([0, 1, 2]), tensor([3, 4, 5]), tensor([6, 7, 8]), tensor([9])) # the last chunk may be shorter
>>> torch.split(a, [1, 2, 3, 4])
(tensor([0]), tensor([1, 2]), tensor([3, 4, 5]), tensor([6, 7, 8, 9]))
7. torch.squeeze(input, dim = None, out = None)
Removes dimensions of size 1. If dim is not given, all size-1 dimensions are removed; otherwise only the specified dimension is removed (and only if its size is 1).
>>> x = torch.zeros(2, 1, 2, 1, 2)
>>> x.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x)
>>> y.size()
torch.Size([2, 2, 2])
>>> y = torch.squeeze(x, 0) # dimension 0 has size 2, so nothing is removed
>>> y.size()
torch.Size([2, 1, 2, 1, 2])
>>> y = torch.squeeze(x, 1) # dimension 1 has size 1, so it is removed
>>> y.size()
torch.Size([2, 2, 1, 2])
Note: torch.unsqueeze() does the opposite of torch.squeeze(): it inserts a dimension of size 1 at the given position.
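For example:
>>> x = torch.tensor([1, 2, 3])
>>> torch.unsqueeze(x, 0).size()
torch.Size([1, 3])
>>> torch.unsqueeze(x, 1).size()
torch.Size([3, 1])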
8. torch.stack(seq, dim = 0, out = None)
Stacks a sequence of tensors of the same size along a new dimension, increasing the number of dimensions by one.
>>> x = torch.randn(3, 2)
>>> y = torch.randn(3, 2)
>>> z = torch.randn(3, 2)
>>> torch.stack((x, y, z))
tensor([[[-1.9550, -0.0732],
[-0.7633, 0.9965],
[ 1.2510, -0.0338]],
[[-1.0724, 0.4955],
[ 0.4289, 0.9300],
[-1.0098, 0.0298]],
[[ 2.3160, -0.8827],
[-1.8079, 0.9960],
[ 0.8421, 2.0886]]])
9. torch.t(input)
Transposes a tensor with at most 2 dimensions; the returned tensor shares memory with the source tensor:
>>> x = torch.randn(3, 2)
>>> y = torch.t(x)
>>> y.size()
torch.Size([2, 3])
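A quick check that the two share memory: modifying the transposed tensor y also changes x:
>>> x = torch.zeros(2, 3)
>>> y = torch.t(x)
>>> y[0, 0] = 1.
>>> x
tensor([[1., 0., 0.],
        [0., 0., 0.]])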
10. torch.transpose(input, dim0, dim1)
Swaps dimensions dim0 and dim1:
>>> x = torch.empty(3, 2, 1, 4)
>>> x.size()
torch.Size([3, 2, 1, 4])
>>> y = torch.transpose(x, 1, 2)
>>> y.size()
torch.Size([3, 1, 2, 4])
11. torch.take(input, indices)
Selects values from input (treated as if it were flattened into a 1-D tensor) at the given indices; the result has the same shape as indices:
>>> x = torch.randn(3, 2)
>>> x
tensor([[ 1.1186, -0.5884],
[ 0.0556, -0.2917],
[-0.7200, 0.1229]])
>>> torch.take(x, torch.tensor([[0, 1], [4, 5]]))
tensor([[ 1.1186, -0.5884],
[-0.7200, 0.1229]])
12. torch.unbind(tensor, dim = 0)
Removes the given dimension and returns a tuple of all slices along it:
>>> torch.unbind(torch.tensor([[1, 2, 3],
... [2, 3, 4],
... [3, 4, 5]]))
(tensor([1, 2, 3]), tensor([2, 3, 4]), tensor([3, 4, 5]))
13. torch.where(condition, x, y)
Selects elements from x or y depending on condition:
>>> x = torch.randn(3, 2)
>>> y = torch.ones(3, 2)
>>> x
tensor([[-0.7097, -0.0090],
[-0.4766, -0.1324],
[ 1.1632, -1.7332]])
>>> torch.where(x > 0, x, y)
tensor([[1.0000, 1.0000],
[1.0000, 1.0000],
[1.1632, 1.0000]])
1.4 Random numbers
1. torch.manual_seed(seed)
Sets the seed for generating random numbers.
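Re-setting the same seed reproduces the same random sequence, for example:
>>> _ = torch.manual_seed(0)
>>> a = torch.rand(3)
>>> _ = torch.manual_seed(0)
>>> b = torch.rand(3)
>>> torch.equal(a, b)
True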
2. torch.bernoulli(input, *, generator = None, out = None)
Draws binary (0/1) random numbers from Bernoulli distributions; input holds the probabilities:
>>> a = torch.empty(3, 3).uniform_(0, 1)
>>> a
tensor([[0.6334, 0.8580, 0.1572],
[0.7853, 0.1458, 0.4177],
[0.3047, 0.0382, 0.5805]])
>>> torch.bernoulli(a)
tensor([[1., 1., 0.],
[1., 0., 0.],
[0., 0., 0.]])
>>> a = torch.ones(3, 3)
>>> torch.bernoulli(a)
tensor([[1., 1., 1.],
[1., 1., 1.],
[1., 1., 1.]])
3. torch.multinomial(input, num_samples, replacement = False, out = None)
Multinomial sampling: returns a tensor in which each row contains num_samples indices sampled from the multinomial distribution defined by the weights in the corresponding row of input:
>>> weights = torch.tensor([0, 10, 3, 0], dtype = torch.float)
>>> torch.multinomial(weights, 2)
tensor([2, 1])
>>> torch.multinomial(weights, 4)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
RuntimeError: invalid argument 2: invalid multinomial distribution (with replacement=False, not enough non-negative category to sample) at ..\aten\src\TH/generic/THTensorRandom.cpp:347
>>> torch.multinomial(weights, 4, replacement = True)
tensor([1, 1, 1, 1])
4. torch.normal(mean, std, out = None)
Draws random numbers from normal distributions with the given means and standard deviations:
>>> torch.normal(mean = torch.arange(1., 11.), std = torch.arange(1, 0, -0.1))
tensor([-0.2438, 0.2975, 5.2781, 3.6607, 4.8195, 5.7053, 6.9796, 7.8922,
8.7863, 10.1156])
5. torch.rand(*sizes, out = None, dtype = None, layout = torch.strided, device = None, requires_grad = False)
Uniform distribution on [0, 1):
>>> torch.rand(4)
tensor([0.2407, 0.6577, 0.7282, 0.9323])
>>> torch.rand(2, 3)
tensor([[0.9123, 0.3024, 0.3010],
[0.0817, 0.8514, 0.4296]])
Note: torch.rand_like() works similarly, returning a uniform random tensor with the same shape as its input.
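For example:
>>> x = torch.empty(2, 3)
>>> torch.rand_like(x).size()
torch.Size([2, 3])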
6. torch.randint(low = 0, high, size, out = None, dtype = None, layout = torch.strided, device = None, requires_grad = False)
Generates random integers in the range [low, high):
>>> torch.randint(3, 5, (3, ))
tensor([3, 4, 4])
>>> torch.randint(10, (2, 2))
tensor([[3, 6],
[2, 4]])
>>> torch.randint(3, 10, (2, 2))
tensor([[3, 8],
[4, 9]])
Note: torch.randint_like() works similarly.
7. torch.randn(*sizes, out = None, dtype = None, layout = torch.strided, device = None, requires_grad = False)
Standard normal distribution N(0, 1):
>>> torch.randn(4)
tensor([-1.4879, -0.5092, -2.2113, 0.4307])
>>> torch.randn(2, 3)
tensor([[-0.8514, -0.4009, 0.1524],
[ 0.9868, -0.1758, 1.6272]])
Note: torch.randn_like() works similarly.
8. torch.randperm(n, out = None, dtype = torch.int64, layout = torch.strided, device = None, requires_grad = False)
Generates a random permutation of the integers 0 to n-1:
>>> torch.randperm(10)
tensor([3, 7, 5, 8, 6, 4, 9, 0, 2, 1])
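A common use is shuffling the rows of a tensor by indexing with a random permutation (the permutation below is just one possible outcome):
>>> x = torch.arange(12).reshape(3, 4)
>>> x[torch.randperm(3)]
tensor([[ 8,  9, 10, 11],
        [ 0,  1,  2,  3],
        [ 4,  5,  6,  7]])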
Other
For other functions of the Tensor class, see tensor-creation-ops.
2 Automatic differentiation
PyTorch's automatic differentiation package, autograd, provides automatic differentiation for all operations on tensors. This means that when backpropagating through a neural network we do not have to derive the gradients by hand; the whole process is carried out automatically by the .backward() method.
Below is the simplest possible example, which computes the slope of the line y = 3x + 4:
>>> x = torch.tensor(1., requires_grad = True)
>>> y = 3 * x + 4
>>> y.backward()
>>> x.grad
tensor(3.)
The example above first creates a scalar x, sets y = 3x + 4, and then uses y.backward() to compute the derivative of y at x = 1 automatically; x.grad then holds $\frac{\partial y}{\partial x}\big|_{x=1} = 3$.
Here is a slightly more involved example:
>>> a = torch.tensor([3., 4.], requires_grad = True)
>>> d = torch.norm(a)
>>> d
tensor(5., grad_fn=<NormBackward0>)
>>> d.backward()
>>> a.grad
tensor([0.6000, 0.8000])
This computes the norm of a point $a = (x, y)$, i.e. $d = |a| = \sqrt{x^2 + y^2}$, whose partial derivatives at $(x, y) = (3, 4)$ are
$$\frac{\partial d}{\partial x} = \frac{2x}{2\sqrt{x^2 + y^2}} = 0.6, \qquad \frac{\partial d}{\partial y} = \frac{2y}{2\sqrt{x^2 + y^2}} = 0.8$$
Note that in the examples above the output is a scalar. The official description of autograd reads: torch.autograd provides classes and functions implementing automatic differentiation of arbitrary scalar-valued functions. In fact, torch.autograd also offers a scheme for vector-valued outputs:
When the input is a vector and the output is a scalar, i.e. $y = f(\mathbf{x}) = f(x_1, x_2, \ldots, x_n)$, the partial derivatives form the gradient
$$\nabla = \left(\frac{\partial y}{\partial x_1}, \frac{\partial y}{\partial x_2}, \ldots, \frac{\partial y}{\partial x_n}\right)$$
When both the input and the output are vectors, i.e. $\mathbf{y} = (y_1, y_2, \ldots, y_m) = f(\mathbf{x}) = f(x_1, x_2, \ldots, x_n)$, the derivative of $\mathbf{y}$ with respect to $\mathbf{x}$ is the Jacobian matrix
$$J = \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \ldots & \frac{\partial y_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_m}{\partial x_1} & \ldots & \frac{\partial y_m}{\partial x_n} \end{pmatrix}$$
We supply a vector $v = (v_1, v_2, \ldots, v_m)^T$ and pass it to the .backward() function; torch.autograd then computes
$$J^T \cdot v = \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \ldots & \frac{\partial y_m}{\partial x_1} \\ \vdots & \ddots & \vdots \\ \frac{\partial y_1}{\partial x_n} & \ldots & \frac{\partial y_m}{\partial x_n} \end{pmatrix} \begin{pmatrix} v_1 \\ \vdots \\ v_m \end{pmatrix} = \begin{pmatrix} \frac{\partial y_1}{\partial x_1}v_1 + \ldots + \frac{\partial y_m}{\partial x_1}v_m \\ \vdots \\ \frac{\partial y_1}{\partial x_n}v_1 + \ldots + \frac{\partial y_m}{\partial x_n}v_m \end{pmatrix}$$
Note: the vector $v$ must have the same dimension as the output $\mathbf{y}$.
Here is a simple example:
>>> x = torch.randn(3, requires_grad = True)
>>> x
tensor([-0.2983, 0.3566, -0.2247], requires_grad=True)
>>> y = x * 2
>>> y
tensor([-0.5967, 0.7132, -0.4495], grad_fn=<MulBackward0>)
>>> y.backward(torch.tensor([1.0, 1.0, 1.0]))
>>> x.grad
tensor([2., 2., 2.])
Here $\mathbf{y} = (y_1, y_2, y_3)$ and $\mathbf{x} = (x_1, x_2, x_3)$, so the Jacobian matrix is
$$J = \begin{pmatrix} \frac{\partial y_1}{\partial x_1} & \frac{\partial y_1}{\partial x_2} & \frac{\partial y_1}{\partial x_3} \\ \frac{\partial y_2}{\partial x_1} & \frac{\partial y_2}{\partial x_2} & \frac{\partial y_2}{\partial x_3} \\ \frac{\partial y_3}{\partial x_1} & \frac{\partial y_3}{\partial x_2} & \frac{\partial y_3}{\partial x_3} \end{pmatrix} = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix}$$
Since $v = (1.0, 1.0, 1.0)^T$, we have
$$J^T v = \begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{pmatrix} \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 2 \\ 2 \\ 2 \end{pmatrix}$$
which is exactly the result the program prints.
Finally, if you temporarily do not need gradients for a tensor, you can wrap the computation in a torch.no_grad() block:
>>> print(x.requires_grad)
True
>>> print((x**2).requires_grad)
True
>>> with torch.no_grad():
... print((x**2).requires_grad)
...
False
Note: torch.no_grad() is commonly used in evaluation and test code.