PyTorch Summary

CharlesDavid_coder

Category: pytorch. Tags: pytorch, python, deep learning, artificial intelligence

Article link: https://blog.youkuaiyun.com/qq_27396609/article/details/112248986

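Contents

PyTorch Summary
Installing PyTorch
Getting Started with PyTorch
Gradient Descent and Backpropagation
Linear Regression with PyTorch
Building Basic Models with PyTorch
Data Loading in PyTorch
Handwritten Digit Recognition with PyTorch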

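Installing PyTorch

Objectives

Know how to install PyTorch

1. Introduction to PyTorch

PyTorch is a deep learning framework released by Facebook. Thanks to its ease of use and friendly design, it is very popular with users.

2. PyTorch versions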

3. Installing PyTorch

Installation guide: https://pytorch.org/get-started/locally/

Installation with GPU support:

conda install pytorch torchvision cudatoolkit=9.0 -c pytorch

Installation without GPU support:

conda install pytorch-cpu torchvision-cpu -c pytorch

pip install torch==1.7.1+cpu torchvision==0.8.2+cpu torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

After installation, open ipython

and enter:

In [1]: import torch
In [2]: torch.__version__
Out[2]: '1.7.1+cpu'

Note: the package you install is named pytorch, but in code it is always imported as torch
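The same check can be run as a script rather than in ipython. The sketch below is only a sanity check under the assumption that some build of torch is already installed; the variable names `version` and `cuda_ok` are illustrative.

```python
import torch

# Version string of the installed build; CPU-only wheels typically end in "+cpu".
version = torch.__version__

# True only when a CUDA-enabled build can actually see a usable GPU.
cuda_ok = torch.cuda.is_available()

print(f"torch {version}, CUDA available: {cuda_ok}")
```

If `cuda_ok` is False on a machine with a GPU, the installed build is most likely a CPU-only one.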

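Getting Started with PyTorch

Objectives

Understand tensors in general and tensors in PyTorch
Know how to create tensors in PyTorch
Know the common tensor methods in PyTorch
Know the tensor data types in PyTorch
Know how to move a tensor between the CPU and CUDA in PyTorch

1. Tensors

"Tensor" is a general term that covers several kinds of objects:

Rank-0 tensor: a scalar (constant), 0-D Tensor
Rank-1 tensor: a vector, 1-D Tensor
Rank-2 tensor: a matrix, 2-D Tensor
Rank-3 tensor
…
Rank-N tensor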
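To make the ranks concrete, here is a small sketch that builds one tensor of each rank and reads the rank back with `dim()`. The variable names and values are arbitrary choices for illustration.

```python
import torch

scalar = torch.tensor(3.0)                        # rank 0: a single number
vector = torch.tensor([1.0, 2.0, 3.0])            # rank 1: a vector
matrix = torch.tensor([[1.0, 2.0], [3.0, 4.0]])   # rank 2: a matrix
cube = torch.zeros(2, 3, 4)                       # rank 3

# dim() returns the number of axes, i.e. the rank.
ranks = [t.dim() for t in (scalar, vector, matrix, cube)]
print(ranks)  # [0, 1, 2, 3]
```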

2. Creating tensors in PyTorch

Create a tensor from a Python list or sequence

torch.tensor([[1., -1.], [1., -1.]])
tensor([[ 1.0000, -1.0000],
        [ 1.0000, -1.0000]])

Create a tensor from a NumPy array

torch.tensor(np.array([[1, 2, 3], [4, 5, 6]]))
tensor([[ 1,  2,  3],
        [ 4,  5,  6]])

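Create a tensor with the torch API
1. torch.empty(3,4) creates an uninitialized tensor with 3 rows and 4 columns; it is filled with whatever values happen to be in memory
2. torch.ones([3,4]) creates a tensor with 3 rows and 4 columns filled with ones
3. torch.zeros([3,4]) creates a tensor with 3 rows and 4 columns filled with zeros
4. torch.rand([3,4]) creates a tensor with 3 rows and 4 columns of random values drawn from the interval [0, 1)
```
>>> torch.rand(2, 3)
tensor([[ 0.8237,  0.5781,  0.6879],
        [ 0.3816,  0.7249,  0.0998]])
```
5. torch.randint(low=0,high=10,size=[3,4]) creates a tensor with 3 rows and 4 columns of random integers drawn from the interval [low, high)
```
>>> torch.randint(3, 10, (2, 2))
tensor([[4, 5],
        [6, 7]])
```
6. torch.randn([3,4]) creates a tensor with 3 rows and 4 columns of random values drawn from a normal distribution with mean 0 and variance 1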
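The creation functions above can be tried side by side. This is a minimal sketch; the seed value is an arbitrary assumption used only to make the random draws repeatable, and `torch.empty` is left out because its contents are undefined.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, only for repeatable random values

ones = torch.ones([3, 4])                            # all ones
zeros = torch.zeros([3, 4])                          # all zeros
uniform = torch.rand([3, 4])                         # uniform on [0, 1)
ints = torch.randint(low=0, high=10, size=[3, 4])    # integers in [0, 10)
normal = torch.randn([3, 4])                         # standard normal

for name, t in [("ones", ones), ("zeros", zeros), ("uniform", uniform),
                ("ints", ints), ("normal", normal)]:
    print(name, tuple(t.shape), t.dtype)
```

Note that `torch.randint` returns an integer tensor, while the other functions return floating-point tensors by default.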

3. Common tensor methods in PyTorch

Get the value held by a tensor (only works when the tensor has exactly one element): tensor.item()

In [10]: a = torch.tensor(np.arange(1))

In [11]: a
Out[11]: tensor([0])

In [12]: a.item()
Out[12]: 0

Convert to a NumPy array

In [55]: z.numpy()
Out[55]:
array([[-2.5871205],
       [ 7.3690367],
       [-2.4918075]], dtype=float32)

Get the shape: tensor.size()

In [72]: x
Out[72]:
tensor([[    1,     2],
        [    3,     4],
        [    5,    10]], dtype=torch.int32)

In [73]: x.size()
Out[73]: torch.Size([3, 2])

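Change the shape: tensor.view((3,4)). This is similar to reshape in NumPy: the result is a shallow copy that shares the original data, and only the shape changes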

In [76]: x.view(2,3)
Out[76]:
tensor([[    1,     2,     3],
        [    4,     5,    10]], dtype=torch.int32)
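The "shallow copy" behaviour can be checked directly: writing through the view also changes the original tensor. The sketch below reuses the values of `x` shown above; the name `v` is just illustrative.

```python
import torch

x = torch.tensor([[1, 2], [3, 4], [5, 10]], dtype=torch.int32)
v = x.view(2, 3)      # same data, new shape

v[0, 0] = 100         # write through the view

print(x[0, 0].item())                 # 100: the original tensor changed too
print(v.data_ptr() == x.data_ptr())   # True: both point at the same memory
```

If an independent copy is needed, calling `.clone()` before reshaping avoids this sharing.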

Get the number of dimensions (rank): tensor.dim()
```
In [77]: x.dim()
Out[77]: 2
```

Get the maximum value: tensor.max()

In [78]: x.max()
Out[78]: tensor(10, dtype=torch.int32)

Transpose and axis swapping: tensor.t(), tensor.transpose(), tensor.permute()

In [79]: x.t()
Out[79]:
tensor([[    1,     3,     5],
        [    2,     4,    10]], dtype=torch.int32)
 t3 = torch.Tensor(np.arange(24).reshape(2, 3, 4))
 t3
 # tensor([[[ 0.,  1.,  2.,  3.],
 #          [ 4.,  5.,  6.,  7.],
 #          [ 8.,  9., 10., 11.]],
 #
 #         [[12., 13., 14., 15.],
 #          [16., 17., 18., 19.],
 #          [20., 21., 22., 23.]]])
 t3.size()
 # torch.Size([2, 3, 4])
 t3.transpose(0, 1)
 # tensor([[[ 0.,  1.,  2.,  3.],
 #          [12., 13., 14., 15.]],
 #
 #         [[ 4.,  5.,  6.,  7.],
 #          [16., 17., 18., 19.]],
 #
 #         [[ 8.,  9., 10., 11.],
 #          [20., 21., 22., 23.]]])
 t3.permute(0, 1, 2)
 # tensor([[[ 0.,  1.,  2.,  3.],
 #          [ 4.,  5.,  6.,  7.],
 #          [ 8.,  9., 10., 11.]],
 #
 #         [[12., 13., 14., 15.],
 #          [16., 17., 18., 19.],
 #          [20., 21., 22., 23.]]])
 t3.permute(1, 0, 2)
 # tensor([[[ 0.,  1.,  2.,  3.],
 #          [12., 13., 14., 15.]],
 #
 #         [[ 4.,  5.,  6.,  7.],
 #          [16., 17., 18., 19.]],
 #
 #         [[ 8.,  9., 10., 11.],
 #          [20., 21., 22., 23.]]])
 t3.permute(1, 2, 0)
 # tensor([[[ 0., 12.],
 #          [ 1., 13.],
 #          [ 2., 14.],
 #          [ 3., 15.]],
 #
 #         [[ 4., 16.],
 #          [ 5., 17.],
 #          [ 6., 18.],
 #          [ 7., 19.]],
 #
 #         [[ 8., 20.],
 #          [ 9., 21.],
 #          [10., 22.],
 #          [11., 23.]]])
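A compact way to read the outputs above is to compare shapes: `transpose` swaps exactly two axes, while `permute` takes a full new axis order. The following sketch rebuilds the same 2x3x4 tensor with `torch.arange` (an equivalent construction chosen here for brevity) and checks the relationships.

```python
import torch

t3 = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)

swapped = t3.transpose(0, 1)    # swap axes 0 and 1 -> shape (3, 2, 4)
permuted = t3.permute(1, 0, 2)  # the same reordering written as a full axis order
moved = t3.permute(1, 2, 0)     # old axis 0 becomes the last axis -> shape (3, 4, 2)

print(tuple(swapped.shape), tuple(moved.shape))
print(torch.equal(swapped, permuted))   # True
# moved[i, j, k] is t3[k, i, j]
print(moved[0, 1, 1].item() == t3[1, 0, 1].item())  # True
```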

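tensor[1,3] gets the value at row index 1, column index 3 (indices start at 0, so this is the second row and fourth column)
tensor[1,3]=100 assigns 100 to that same position
Slicing a tensor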

In [101]: x
Out[101]:
tensor([[1.6437, 1.9439, 1.5393],
        [1.3491, 1.9575, 1.0552],
        [1.5106, 1.0123, 1.0961],
        [1.4382, 1.5939, 1.5012],
        [1.5267, 1.4858, 1.4007]])

In [102]: x[:,1]
Out[102]: tensor([1.9439, 1.9575, 1.0123, 1.5939, 1.4858])
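Indexing, assignment, and slicing can be seen together on a small tensor with known contents. This is an illustrative sketch; the 3x4 tensor built with `torch.arange` is an assumption chosen so that every value is easy to predict.

```python
import torch

x = torch.arange(12).reshape(3, 4)
# x is:
# [[ 0,  1,  2,  3],
#  [ 4,  5,  6,  7],
#  [ 8,  9, 10, 11]]

value = x[1, 3].item()   # row index 1, column index 3 -> 7
column = x[:, 1]         # every row, column index 1 -> [1, 5, 9]
x[1, 3] = 100            # assignment by index

print(value, column.tolist(), x[1, 3].item())
```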

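4. Tensor data types

Tensors come in many data types; the common ones are:

[Figure: table of common tensor data types]

In the figure above, the "Tensor types" column gives the tensor class that a tensor of each data type is an instance of

Get a tensor's data type: tensor.dtype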
```
In [80]: x.dtype
Out[80]: torch.int32
```

Specify the type when creating the data

In [88]: torch.ones([2,3],dtype=torch.float32)
Out[88]:
tensor([[1., 1., 1.],
        [1., 1., 1.]])

Changing the type

In [17]: a
Out[17]: tensor([1, 2], dtype=torch.int32)

In [18]: a.type(torch.float)
Out[18]: tensor([1., 2.])

In [19]: a.double()
Out[19]: tensor([1., 2.], dtype=torch.float64)
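The conversions above all return a new tensor and leave the original untouched. The sketch below repeats them in script form and adds `.to(dtype)`, which is another commonly used way to convert (included here as an extra, not something the session above used).

```python
import torch

a = torch.tensor([1, 2], dtype=torch.int32)

as_float = a.type(torch.float)   # torch.float is an alias of torch.float32
as_double = a.double()           # shorthand for converting to float64
via_to = a.to(torch.float64)     # .to() accepts a dtype as well

print(a.dtype, as_float.dtype, as_double.dtype, via_to.dtype)
```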

5. Other tensor operations

Adding a tensor to a tensor

In [94]: x = x.new_ones(5, 3, dtype=torch.float)

In [95]: y = torch.rand(5, 3)

In [96]: x+y
Out[96]:
tensor([[1.6437, 1.9439, 1.5393],
        [1.3491, 1.9575, 1.0552],
        [1.5106, 1.0123, 1.0961],
        [1.4382, 1.5939, 1.5012],
        [1.5267, 1.4858, 1.4007]])
In [98]: torch.add(x,y)
Out[98]:
tensor([[1.6437, 1.9439, 1.5393],
        [1.3491, 1.9575, 1.0552],
        [1.5106, 1.0123, 1.0961],
        [1.4382, 1.5939, 1.5012],
        [1.5267, 1.4858, 1.4007]])
In [99]: x.add(y)
Out[99]:
tensor([[1.6437, 1.9439, 1.5393],
        [1.3491, 1.9575, 1.0552],
        [1.5106, 1.0123, 1.0961],
        [1.4382, 1.5939, 1.5012],
        [1.5267, 1.4858, 1.4007]])
In [100]: x.add_(y)  # methods ending in an underscore modify x in place
Out[100]:
tensor([[1.6437, 1.9439, 1.5393],
        [1.3491, 1.9575, 1.0552],
        [1.5106, 1.0123, 1.0961],
        [1.4382, 1.5939, 1.5012],
        [1.5267, 1.4858, 1.4007]])

In [101]: x  # x has changed
Out[101]:
tensor([[1.6437, 1.9439, 1.5393],
        [1.3491, 1.9575, 1.0552],
        [1.5106, 1.0123, 1.0961],
        [1.4382, 1.5939, 1.5012],
        [1.5267, 1.4858, 1.4007]])

Note: methods whose names end in an underscore (for example add_) modify the tensor in place
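The difference between `add` and `add_` is easiest to see with fixed values instead of random ones. In this sketch the tensors are filled with 1.0 and 0.5 (an assumption made so that the results are predictable).

```python
import torch

x = torch.ones(5, 3)
y = torch.full((5, 3), 0.5)

out = x.add(y)                                 # returns a new tensor
unchanged = torch.equal(x, torch.ones(5, 3))   # x still holds ones

x.add_(y)                                      # modifies x itself
changed = torch.equal(x, out)                  # x now equals x + y

print(unchanged, changed)  # True True
```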

Operations between a tensor and a number

In [97]: x +10
Out[97]:
tensor([[11., 11., 11.],
        [11., 11., 11.],
        [11., 11., 11.],
        [11., 11., 11.],
        [11., 11., 11.]])

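Tensors on CUDA

CUDA (Compute Unified Device Architecture) is a computing platform from NVIDIA. CUDA™ is a general-purpose parallel computing architecture introduced by NVIDIA that enables GPUs to solve complex computational problems.

The torch.cuda module adds support for CUDA tensors, so that tensors can be operated on with the same methods on both the CPU and the GPU

The .to method moves a tensor to another device (for example from the CPU to the GPU)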

# device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # create a tensor directly on the GPU
    x = x.to(device)                       # move x to the GPU with .to
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # .to can also change the dtype at the same time
    
>>tensor([1.9806], device='cuda:0')
>>tensor([1.9806], dtype=torch.float64)
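The snippet above only runs its body when a GPU is present. A common alternative, sketched below, is to pick the device once and write the rest of the code against it, so the same script also runs on a CPU-only machine. The one-element random `x` is an assumption mirroring the shape of the output shown above.

```python
import torch

# Fall back to the CPU when no CUDA device is available.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(1)                        # created on the CPU
y = torch.ones_like(x, device=device)    # created directly on the chosen device
x = x.to(device)                         # no-op if x is already on that device

z = x + y                                # both operands live on the same device
z_cpu = z.to("cpu", torch.double)        # move back and convert dtype in one call

print(z.device, z_cpu.dtype)
```

Operands of an operation must be on the same device, which is why `x` is moved before the addition.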

As the sections above show, most torch operations work almost exactly like their NumPy counterparts

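Gradient Descent and Backpropagation

Objectives

Know what gradient descent is
Know what backpropagation is

1. What is a gradient?

Gradient: a vector, that is, the derivative combined with the direction of fastest change (the direction in which learning moves)

Machine learning recap

Collect data $x$, build a machine learning model $f$, and obtain $f(x,w) = Y_{predict}$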

How to judge whether the model is good:
$\begin{aligned} loss & = (Y_{predict}-Y_{true})^2 & \text{(regression loss)} \\ loss & = -Y_{true} \cdot \log(Y_{predict}) & \text{(classification loss)} \end{aligned}$

Goal: reduce $loss$ as much as possible by adjusting (learning) the parameter $w$. So how should we adjust $w$?

Pick a random starting point $w_0$, then keep adjusting it so that the loss function reaches its minimum

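How $w$ is updated:

Compute the gradient (derivative) of $w$

$\begin{aligned} \nabla w = \frac{f(w+0.000001)-f(w-0.000001)}{2 \times 0.000001} \end{aligned}$

Update $w$
$w = w - \alpha \nabla w$

where $\alpha$ is the learning rate, and:

$\nabla w < 0$ means $w$ will increase
$\nabla w > 0$ means $w$ will decrease

Summary: the gradient describes how a multivariate function changes with respect to its parameters (the direction in which the parameters learn); when there is only one independent variable it is called the derivative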
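The update procedure can be sketched in a few lines of plain Python: estimate the gradient with the central difference and step $w$ against it, scaled by the learning rate. Everything concrete here is an assumption for illustration: the helper names `numerical_grad` and `gradient_descent`, the toy loss $(w-3)^2$, the learning rate 0.1, and the 100 steps.

```python
def numerical_grad(f, w, eps=1e-6):
    """Central-difference estimate of df/dw at w."""
    return (f(w + eps) - f(w - eps)) / (2 * eps)


def gradient_descent(f, w0, lr=0.1, steps=100):
    """Repeatedly apply w = w - lr * grad, starting from w0."""
    w = w0
    for _ in range(steps):
        w = w - lr * numerical_grad(f, w)
    return w


# Toy loss with its minimum at w = 3 (hypothetical, for illustration only).
def loss(w):
    return (w - 3.0) ** 2


w_final = gradient_descent(loss, w0=0.0)
print(round(w_final, 4))  # close to 3.0
```

Starting from $w_0 = 0$ the gradient is negative, so each update increases $w$, which matches the sign rule stated above.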

2. Computing partial derivatives

2.1 Common derivative rules

Polynomial: $f(x) = x^5$, $f^{'}(x) = 5x^{(5-1)}$
Basic operations: $f(x) = xy$, $f^{'}(x) = y$
Exponential: $f(x) = 5e^x$, $f^{'}(x) = 5e^x$
Logarithm: $f(x) = 5\ln x$, $f^{'}(x) = \frac{5}{x}$, where ln denotes the logarithm to base e
Differential form of the derivative:
$\begin{aligned} & f^{'}(x) = & \frac{d f(x)}{dx} \\ & \text{Lagrange} & \text{Leibniz} \end{aligned}$
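These rules can be checked numerically with PyTorch's automatic differentiation. The sketch below is only a verification aid, not part of the derivation: it evaluates each function at $x = 2$ (an arbitrary point) and compares the gradient from `torch.autograd.grad` with the closed form. The helper name `grad_of` is hypothetical.

```python
import math
import torch

x = torch.tensor(2.0, requires_grad=True)


def grad_of(fn):
    """Return d fn(x) / dx at the current value of x as a Python float."""
    (g,) = torch.autograd.grad(fn(x), x)
    return g.item()


g_poly = grad_of(lambda t: t ** 5)            # expected 5 * x^4
g_exp = grad_of(lambda t: 5 * torch.exp(t))   # expected 5 * e^x
g_log = grad_of(lambda t: 5 * torch.log(t))   # expected 5 / x

print(g_poly, 5 * 2.0 ** 4)
print(g_exp, 5 * math.exp(2.0))
print(g_log, 5 / 2.0)
```

The printed pairs should agree up to float32 rounding.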