1. Task Background
This experiment implements a neural-network model for a five-class classification task. The core logic: given an input vector of length 5, predict the index of its largest element. For example, for the input [0.2, 0.9, 0.3, 0.1, 0.4] the model should predict class 1.
2. Model Architecture
class Model(nn.Module):
    def __init__(self, input_size, output_size):
        super(Model, self).__init__()
        self.linear1 = nn.Linear(input_size, output_size, bias=True)
        self.loss = nn.CrossEntropyLoss()

    def forward(self, x, y=None):
        x = self.linear1(x)          # raw logits, shape (batch, output_size)
        if y is not None:
            return self.loss(x, y)   # training: return the cross-entropy loss
        else:
            return torch.softmax(x, dim=-1)  # inference: return class probabilities
- Architecture: a single linear layer maps the input directly to the output; the output dimension equals the number of classes.
- Forward pass: when a label y is supplied, the model returns the cross-entropy loss; without a label it returns the softmax probability distribution.
- Math note: nn.CrossEntropyLoss applies log_softmax internally, so the raw linear-layer logits are fed to it directly (see the check after this list).
- Dimensions: input_size=5 input features, output_size=5 output classes.
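To make the log_softmax point concrete, here is a minimal check (the names logits and labels are illustrative, not from the original code) that nn.CrossEntropyLoss on raw logits matches nn.NLLLoss applied to log_softmax output:

import torch
import torch.nn as nn

logits = torch.randn(4, 5)            # raw linear-layer outputs for 4 samples
labels = torch.tensor([1, 0, 3, 2])   # true class indices

ce = nn.CrossEntropyLoss()(logits, labels)
nll = nn.NLLLoss()(torch.log_softmax(logits, dim=-1), labels)
print(torch.allclose(ce, nll))        # True: CrossEntropyLoss = log_softmax + NLLLoss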
3. Data Generation
def create_sample():
    x = np.random.rand(5)   # a random 5-dimensional vector, U(0, 1)
    y = np.argmax(x)        # the index of the maximum is the label
    return x, y
- Data distribution: each sample's feature vector is drawn from the uniform distribution $U(0,1)$.
- Label rule: the true class is obtained directly via argmax, so the dataset is perfectly linearly separable by construction.
- Dataset construction: the generated NumPy arrays are converted to PyTorch tensors (a vectorized variant is sketched below).
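As an aside, building a tensor from a Python list of NumPy arrays is comparatively slow (recent PyTorch versions emit a warning about it). A minimal sketch of a vectorized alternative; create_dataset_fast is a hypothetical name, not part of the original code:

import numpy as np
import torch

def create_dataset_fast(sample_num):
    # Hypothetical vectorized variant: generate all samples in one array
    X = np.random.rand(sample_num, 5)   # each row ~ U(0, 1)
    Y = np.argmax(X, axis=1)            # row-wise argmax as the label
    return torch.from_numpy(X).float(), torch.from_numpy(Y).long()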
4. Training Loop
def main():
    # Hyperparameters
    batch_size = 20
    sample_num = 5000
    epoch = 20
    lr = 0.0025
    # ... model, optimizer, and dataset setup (see the complete code) ...
    # Training loop
    for _ in range(epoch):
        for i in range(sample_num // batch_size):
            # Slice out the current batch
            x = X[i * batch_size:(i + 1) * batch_size]
            y = Y[i * batch_size:(i + 1) * batch_size]
            # Forward pass and loss computation
            loss = model(x, y)
            # The three backprop steps
            loss.backward()      # compute gradients
            optim.step()         # update parameters
            optim.zero_grad()    # clear gradients
- Optimization: the Adam optimizer with a learning rate of 0.0025.
- Batching: the 5000 samples are split into 250 batches of 20 samples each.
- Gradient management: gradients must be cleared after every update to prevent accumulation across batches (a shuffled variant of this loop is sketched after this list).
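One refinement worth noting: the loop above visits batches in the same fixed order every epoch. A common variant shuffles the sample indices at the start of each epoch; a sketch using the X, Y, model, optim, and hyperparameter names from the code above:

for _ in range(epoch):
    perm = torch.randperm(sample_num)             # fresh random order each epoch
    for i in range(sample_num // batch_size):
        idx = perm[i * batch_size:(i + 1) * batch_size]
        x, y = X[idx], Y[idx]                     # gather the shuffled batch
        loss = model(x, y)
        loss.backward()
        optim.step()
        optim.zero_grad()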
5. Evaluation and Prediction
@torch.no_grad()
def evaluate(model):
    model.eval()   # switch to evaluation mode
    # ...compute accuracy and loss (see the complete code)...

@torch.no_grad()
def predict(model):
    x = torch.FloatTensor(np.random.rand(5))
    y_pred = model(x)
    print(f"Predicted class: {torch.argmax(y_pred)}")
- Evaluation mode: model.eval() disables training-only layers such as Dropout and BatchNorm (neither is used here, but the call is good practice).
- No-grad inference: the @torch.no_grad() decorator disables gradient tracking, reducing memory use and speeding up inference.
- Reading the result: the final predicted class is obtained via argmax (a batched variant is sketched after this list).
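Since nn.Linear also accepts a leading batch dimension, prediction works on many vectors at once. A sketch of a batched variant; predict_batch is a hypothetical helper, not part of the original code:

@torch.no_grad()
def predict_batch(model, n=3):
    # Hypothetical helper: classify n random vectors in one forward pass
    model.eval()
    x = torch.rand(n, 5)                  # n samples, shape (n, 5)
    probs = model(x)                      # softmax probabilities, shape (n, 5)
    pred = torch.argmax(probs, dim=-1)    # per-row predicted class
    print(f"predicted: {pred.tolist()}, true: {torch.argmax(x, dim=-1).tolist()}")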
6. Results
- On this linearly separable dataset, the simple linear model quickly reaches 100% accuracy (a check of this claim follows the list).
- The loss curve decreases steadily.
- Performance on freshly generated evaluation data matches the training set; there is no overfitting.
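The 100% figure is expected by construction: with weights $W = I$ and bias $b = 0$, the logits reproduce the input, so argmax(logits) = argmax(x) for every sample. A quick hedged check using the Model and create_dataset definitions from the complete code below:

probe = Model(5, 5)
with torch.no_grad():
    probe.linear1.weight.copy_(torch.eye(5))   # W = I: logits equal the input
    probe.linear1.bias.zero_()                 # b = 0
x, y = create_dataset(1000)
acc = (torch.argmax(probe(x), dim=-1) == y).float().mean().item()
print(acc)  # 1.0 by construction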
Key takeaways: this implementation walks through the standard PyTorch workflow, covering data generation, model definition, the training loop, and evaluation/testing. Although it uses only a single linear layer, it illustrates the core logic of a classification task.
7. Complete Code
import numpy as np
import torch
import torch.nn as nn


# Five-class task: the index of the largest element is the class
class Model(nn.Module):
    def __init__(self, input_size, output_size):
        super(Model, self).__init__()
        self.linear1 = nn.Linear(input_size, output_size, bias=True)
        self.loss = nn.CrossEntropyLoss()

    def forward(self, x, y=None):
        x = self.linear1(x)
        if y is not None:
            return self.loss(x, y)           # training: return the loss
        else:
            return torch.softmax(x, dim=-1)  # inference: return probabilities


def create_sample():
    x = np.random.rand(5)
    y = np.argmax(x)
    return x, y


def create_dataset(sample_num):
    X, Y = [], []
    for _ in range(sample_num):
        x, y = create_sample()
        X.append(x)
        Y.append(y)
    # Stack into one array first to avoid the slow list-of-ndarrays conversion
    return torch.FloatTensor(np.array(X)), torch.LongTensor(Y)


@torch.no_grad()
def evaluate(model):
    model.eval()
    sample_num = 200
    correct, wrong = 0, 0
    x, y = create_dataset(sample_num)
    y_pred = model(x)   # forward pass: probabilities
    loss = model(x, y)  # forward pass with labels: loss
    for i, j in zip(y_pred, y):
        if torch.argmax(i) == int(j):
            correct += 1
        else:
            wrong += 1
    acc = correct / (correct + wrong)
    print(f"acc: {acc}, loss: {loss.item()}")
    return acc, loss


def main():
    batch_size = 20
    sample_num = 5000
    input_size = 5
    output_size = 5
    epoch = 20
    lr = 0.0025
    model = Model(input_size, output_size)
    optim = torch.optim.Adam(model.parameters(), lr)
    model.train()
    X, Y = create_dataset(sample_num)
    for e in range(epoch):
        epoch_loss = []
        for i in range(sample_num // batch_size):
            x = X[i * batch_size:(i + 1) * batch_size]
            y = Y[i * batch_size:(i + 1) * batch_size]
            loss = model(x, y)   # forward pass and loss
            loss.backward()      # backpropagation
            optim.step()         # parameter update
            optim.zero_grad()    # clear gradients
            epoch_loss.append(loss.item())
        print(f"epoch {e}: mean loss {np.mean(epoch_loss):.4f}")
    evaluate(model)
    torch.save(model, "model.pth")


@torch.no_grad()
def predict(model):
    model.eval()
    x = torch.FloatTensor(np.random.rand(5))
    y_pred = model(x)  # forward pass
    print(f"input: {x}, output: {y_pred}, class: {torch.argmax(y_pred)}")
    return


if __name__ == '__main__':
    main()
    model = torch.load("model.pth", weights_only=False)
    print(model.state_dict())
    predict(model)