The Ultimate Guide to PyTorch nn.Linear: Understanding the Linear Layer from Scratch (with Visualization and Complete Code)

PyTorch Linear Layer: Explanation and Implementation

Originally published 2025-10-12 23:37:50 · 635 views

License: CC 4.0 BY-SA

Tags: artificial intelligence

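### PyTorch nn.Linear Basics

The linear layer (`nn.Linear`) is one of the most fundamental modules in a neural network. It applies a linear (strictly speaking, affine) transformation to its input: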

$$ y = xW^T + b $$

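where:
- $x$ is the input tensor, with shape `(batch_size, in_features)`
- $W$ is the weight matrix, with shape `(out_features, in_features)`
- $b$ is the bias vector, with shape `(out_features,)`
- $y$ is the output tensor, with shape `(batch_size, out_features)`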
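
As a quick check of the formula, the sketch below computes $xW^T + b$ by hand and compares it with the layer's own output. The variable names (`layer`, `manual`, `builtin`) and the fixed seed are illustrative choices, not part of the original article.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the run is repeatable

layer = nn.Linear(in_features=5, out_features=3)
x = torch.randn(2, 5)  # (batch_size, in_features)

# Apply the formula y = x W^T + b by hand
manual = x @ layer.weight.T + layer.bias

# Let the layer do the same computation
builtin = layer(x)

print(manual.shape)  # torch.Size([2, 3])
print(torch.allclose(manual, builtin, atol=1e-6))  # True
```

The transpose is needed because the weight is stored as `(out_features, in_features)`, so `x @ W.T` maps 5 input features to 3 output features.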

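### Parameter Initialization and Configuration

Constructing an `nn.Linear` requires two arguments:
- `in_features`: the size of each input sample
- `out_features`: the size of each output sample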

```python
import torch.nn as nn

# Create a linear layer with 5 input features and 3 output features
linear_layer = nn.Linear(in_features=5, out_features=3)

# Inspect the shapes of the weight and the bias
print(linear_layer.weight.shape)  # torch.Size([3, 5])
print(linear_layer.bias.shape)    # torch.Size([3])
```
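
The shapes above also determine how many learnable values the layer holds. The following sketch counts them and shows the optional `bias=False` constructor argument, which the article does not cover; the names `layer`, `no_bias`, and `n_params` are only illustrative.

```python
import torch.nn as nn

layer = nn.Linear(in_features=5, out_features=3)

# weight has 3 * 5 = 15 entries and bias has 3, so 18 learnable values
n_params = sum(p.numel() for p in layer.parameters())
print(n_params)  # 18

# With bias=False the bias term is dropped and the attribute is None
no_bias = nn.Linear(5, 3, bias=False)
n_params_no_bias = sum(p.numel() for p in no_bias.parameters())
print(no_bias.bias)       # None
print(n_params_no_bias)   # 15
```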

By default, PyTorch initializes both parameters from a uniform distribution:

- Weight $W$: $\mathcal{U}(-\sqrt{k}, \sqrt{k})$, where $k = 1/\text{in\_features}$
- Bias $b$: $\mathcal{U}(-\sqrt{k}, \sqrt{k})$, with the same $k$
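
One way to sanity-check these bounds is to look at the largest absolute value in a freshly created layer. This is a minimal sketch; `bound`, `w_max`, and `b_max` are illustrative names, and the small tolerance is an assumption added to absorb float32 rounding.

```python
import math
import torch
import torch.nn as nn

in_features = 5
layer = nn.Linear(in_features, 3)

# sqrt(k) with k = 1 / in_features
bound = math.sqrt(1.0 / in_features)

with torch.no_grad():
    w_max = layer.weight.abs().max().item()
    b_max = layer.bias.abs().max().item()

tol = 1e-6  # slack for float32 rounding
print(f"bound = {bound:.4f}")
print("weight within bound:", w_max <= bound + tol)
print("bias within bound:  ", b_max <= bound + tol)
```

With `in_features=5` the bound is about 0.447, so every initial weight and bias entry should lie within roughly ±0.447.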

### Forward Pass

A concrete example shows how data flows through the layer:

```python
import torch

# Input data: batch_size=2, in_features=5
x = torch.randn(2, 5)

# Forward pass (linear_layer is the nn.Linear(5, 3) created above)
output = linear_layer(x)
print(output.shape)  # torch.Size([2, 3])
```
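
The example uses a 2-D input, but `nn.Linear` only constrains the last dimension. The sketch below, with an assumed `(batch, sequence, features)` layout, shows that extra leading dimensions pass through unchanged and that a wrong feature size is rejected.

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=5, out_features=3)

# 3-D input: only the last dimension has to equal in_features
x3d = torch.randn(4, 7, 5)
out3d = layer(x3d)
print(out3d.shape)  # torch.Size([4, 7, 3])

# A last dimension of 4 does not match in_features=5
mismatch_raised = False
try:
    layer(torch.randn(2, 4))
except RuntimeError as err:
    mismatch_raised = True
    print("shape mismatch:", type(err).__name__)
```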

Visualizing the computation by shape: input `x` is 2×5, weight `W` is 3×5 (used transposed, as 5×3), bias `b` has 3 entries and is added to every row, and the output `y` is 2×3.
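
To make the per-element view concrete, the following sketch fills the output one entry at a time: each `y[i, j]` is the dot product of row `i` of `x` with row `j` of `W`, plus `b[j]`. The explicit loops are for illustration only and are not how PyTorch computes the result internally.

```python
import torch
import torch.nn as nn

layer = nn.Linear(in_features=5, out_features=3)
x = torch.randn(2, 5)

# Detach so the loop works on plain values without tracking gradients
W = layer.weight.detach()  # (3, 5)
b = layer.bias.detach()    # (3,)

y = torch.empty(2, 3)
for i in range(2):        # one row of x per sample
    for j in range(3):    # one row of W per output feature
        y[i, j] = torch.dot(x[i], W[j]) + b[j]

matches = torch.allclose(y, layer(x), atol=1e-6)
print(matches)
```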