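I didn't expect this post to get so much attention, so I have reorganized it and added more detail in the hope that it is easier to follow. If you run into any problems, leave a comment and I will do my best to answer!
This code is a Python script for training a deep learning model. It currently handles object detection with the YOLOv11 (You Only Look Once) algorithm. The script is split into several steps, each one a stage of the training pipeline. A detailed walkthrough follows:
The full code is on GitHub: yolov11_prune_distillation_v2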
1. Environment and required libraries:
ray==2.44.1
torch==2.6.0
torchaudio==2.6.0
torchvision==0.21.0
Python: 3.10.16
Ultralytics YOLO: 8.3.28
OS: Linux
CUDA Version: 12.4
GPU: NVIDIA GeForce RTX 4090
A detailed requirements.txt is in the GitHub repository; not every package in it needs to be installed.
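If you want to confirm that your environment matches the versions listed above, a small check along these lines may help. It is only a sketch: the package names are assumed to be the PyPI names, and because this project edits files under a local ./ultralytics directory, `ultralytics` may legitimately show up as not installed.

```python
from importlib import metadata

# Versions pinned in the list above; package names are assumed to be the PyPI names.
EXPECTED = {
    "ray": "2.44.1",
    "torch": "2.6.0",
    "torchaudio": "2.6.0",
    "torchvision": "0.21.0",
    "ultralytics": "8.3.28",
}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to a (wanted, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        # Ignore local build suffixes such as "+cu124" when comparing.
        if found is None or found.split("+")[0] != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for package, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{package}: expected {wanted}, found {found}")
```

A different version is not necessarily a problem; the list only records the setup the code was run on.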
2. Training steps:
The steps are listed below. To run a step, uncomment its call and comment out the others.
if __name__ == '__main__':
    # step1_train()
    step2_constraint_train()
    # step3_pruning()
    # step4_finetune()
    # step5_distillation()
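If editing comments every time feels error-prone, one alternative is to pick the step from the command line. The sketch below is not part of the repository: `build_dispatcher` and the short step names are hypothetical, and the lambdas stand in for the real `step1_train` ... `step5_distillation` functions.

```python
import argparse


def build_dispatcher(steps):
    """Return a function that runs the step named by a --step argument."""
    parser = argparse.ArgumentParser(description="Run one stage of the pipeline")
    parser.add_argument("--step", required=True, choices=sorted(steps))

    def run(argv=None):
        args = parser.parse_args(argv)
        return steps[args.step]()

    return run


# Placeholder callables; in the real script these would be the step functions.
STEPS = {
    "train": lambda: "step1_train",
    "constraint": lambda: "step2_constraint_train",
    "prune": lambda: "step3_pruning",
    "finetune": lambda: "step4_finetune",
    "distill": lambda: "step5_distillation",
}

run = build_dispatcher(STEPS)
print(run(["--step", "prune"]))
```

In a real script you would call `run()` with no arguments so that argparse reads `sys.argv`, for example `python main.py --step prune` (the file name here is an assumption).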
2.1 step1_train()
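This step loads the pretrained model and starts training. Two parameters need to be set before launching it: pretrained_model_path (path to the pretrained weights) and yaml_path (path to the dataset configuration file, data.yaml).
## Configure step1 paths
pretrained_model_path = os.path.join(root, "yolo11n.pt")
yaml_path = os.path.join(root, "data.yaml")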
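For readers who have not written a data.yaml before, here is a minimal sketch that generates one in the layout Ultralytics datasets usually use (`path`, `train`, `val`, `names`). The dataset directory and class names are made up for illustration, and `dataset_yaml` is a hypothetical helper, not something from the repository.

```python
def dataset_yaml(dataset_root, class_names, train_dir="images/train", val_dir="images/val"):
    """Build the text of an Ultralytics-style dataset YAML file."""
    lines = [
        f"path: {dataset_root}",  # dataset root directory
        f"train: {train_dir}",    # training images, relative to path
        f"val: {val_dir}",        # validation images, relative to path
        "names:",
    ]
    # Class index to class name mapping.
    lines += [f"  {index}: {name}" for index, name in enumerate(class_names)]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical dataset location and classes, for illustration only.
    print(dataset_yaml("datasets/my_dataset", ["cat", "dog"]))
```

Writing the returned text to data.yaml in the project root would make it match the `yaml_path` configured above.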
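2.2 step2_constraint_train()
This step needs changes in two places. The goal is to add L1 regularization to the BN (batch normalization) layers so that training pushes their parameters toward sparsity, which prepares the model for pruning. First, set step1_train_model_path to the path of the weights produced by step1:
## Configure step2 paths
step1_train_model_path = os.path.join(root, 'runs/detect/train/weights/best.pt')
step2_constraint_train_model_path = os.path.join(root, "runs/detect/Constraint")
Next, uncomment the following block in ./ultralytics/engine/trainer.py, then start step2 training: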
# add start=============================
# add l1 regularization for step2_constraint_train
l1_lambda = 1e-2 * (1 - 0.9 * epoch / self.epochs)
for k, m in self.model.named_modules():
    if isinstance(m, nn.BatchNorm2d):
        m.weight.grad.data.add_(l1_lambda * torch.sign(m.weight.data))
        m.bias.grad.data.add_(1e-2 * torch.sign(m.bias.data))
# add end ==============================
After training finishes, comment this block out again in ./ultralytics/engine/trainer.py so that the later steps train without the L1 penalty.
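To make the trainer.py snippet easier to reason about, here is a plain-Python illustration of its two ingredients: the penalty strength that decays linearly over the epochs, and the sign-based term added to each gradient (the subgradient of the L1 norm). It uses lists of floats instead of tensors, and the helper names are mine, so treat it as a sketch of the arithmetic rather than the training code itself.

```python
def l1_lambda(epoch, total_epochs, base=1e-2, decay=0.9):
    """Penalty strength that shrinks linearly as training progresses."""
    return base * (1 - decay * epoch / total_epochs)


def sign(value):
    """Subgradient of |value|: -1, 0, or 1."""
    return (value > 0) - (value < 0)


def add_l1_grad(grads, weights, strength):
    """Add strength * sign(w) to each gradient, mirroring grad.add_(...)."""
    return [g + strength * sign(w) for g, w in zip(grads, weights)]


if __name__ == "__main__":
    bn_weights = [0.5, -0.2, 0.0]
    grads = [0.0, 0.0, 0.0]
    for epoch in (0, 25, 49):
        strength = l1_lambda(epoch, total_epochs=50)
        print(epoch, strength, add_l1_grad(grads, bn_weights, strength))
```

The constant-magnitude push toward zero is what lets unimportant BN scale factors collapse, while the decaying strength eases the pressure late in training.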
2.3 step3_pruning()
This step prunes the model to reduce its complexity. The parameters to set are shown below, where pruning_rate is the pruning rate. Then start step3.
## Configure step3 paths
pruning_rate = 0.8
step3_prune_before_model_path = os.path.join(step2_constraint_train_model_path, "weights/last.pt")
step3_prune_after_model_path = os.path.join(step2_constraint_train_model_path, "weights/prune.pt")
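The internals of do_pruning are not shown in this post, so the following is only a sketch of one common way a pruning rate is turned into per-channel decisions after sparse training: collect the absolute BN scale factors across layers, pick a global threshold, and keep the channels above it. I am assuming here that pruning_rate is the fraction of channels to remove; the function names and the toy values are hypothetical.

```python
def global_threshold(gammas_per_layer, pruning_rate):
    """Pick the |gamma| value below which channels are pruned."""
    magnitudes = sorted(abs(g) for layer in gammas_per_layer.values() for g in layer)
    index = min(int(len(magnitudes) * pruning_rate), len(magnitudes) - 1)
    return magnitudes[index]


def channel_masks(gammas_per_layer, threshold):
    """Mark channels to keep (True) per layer, never emptying a layer."""
    masks = {}
    for name, layer in gammas_per_layer.items():
        mask = [abs(g) >= threshold for g in layer]
        if not any(mask):
            # Keep the strongest channel so the layer still has an output.
            strongest = max(range(len(layer)), key=lambda i: abs(layer[i]))
            mask[strongest] = True
        masks[name] = mask
    return masks


if __name__ == "__main__":
    # Toy BN scale factors for two layers, for illustration only.
    gammas = {"bn1": [0.9, 0.01, 0.5, 0.02], "bn2": [0.03, 0.04, 0.001, 0.7]}
    threshold = global_threshold(gammas, pruning_rate=0.5)
    print(threshold, channel_masks(gammas, threshold))
```

A global threshold can wipe out an entire layer at high pruning rates, which is why the sketch keeps at least one channel per layer.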
2.4 step4_finetune()
This step fine-tunes the pruned model. Set the output directory below, then start step4.
## Configure step4 paths
step4_finetune_model_path = os.path.join(root, "runs/detect/finetune")
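Before launching a long fine-tuning run, it can be worth checking that pruning actually produced a smaller checkpoint. The helper below is a rough sketch of that check (`compare_checkpoints` is a hypothetical name). File size is only a proxy, since a checkpoint may also contain optimizer state or use a different precision.

```python
import os


def size_mb(path):
    """File size in mebibytes."""
    return os.path.getsize(path) / (1024 * 1024)


def compare_checkpoints(before_path, after_path):
    """Return (before_mb, after_mb, after/before ratio) for two checkpoint files."""
    before, after = size_mb(before_path), size_mb(after_path)
    return before, after, after / before
```

With the paths from this post, the call would be `compare_checkpoints(step3_prune_before_model_path, step3_prune_after_model_path)`; a ratio close to 1 suggests that little was pruned.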
2.5 step5_distillation()
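This step uses knowledge distillation to transfer what a trained larger model (the teacher) has learned to a smaller model (the student). If you followed the steps above, the teacher is the model trained in step1 and the student is the model fine-tuned in step4.
## Configure step5 paths
step5_teacher_model_path = step1_train_model_path
step5_student_model_path = os.path.join(step4_finetune_model_path, 'weights/best.pt')
step5_output_model_path = os.path.join(root, "runs/detect/student")
This step is fairly flexible: layers lists the layers whose features the student learns from the teacher, and you can also add attention modules to the student model here.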
def step5_distillation():
    layers = ["6", "8", "13", "16", "19", "22"]
    model_t = YOLO(step5_teacher_model_path)  # the teacher model
    model_s = YOLO(step5_student_model_path)  # the student model
    model_s = add_attention(model_s)
    model_s.train(data=yaml_path, Distillation=model_t.model, loss_type='mgd', layers=layers, amp=False, imgsz=1280,
                  epochs=300,
                  batch=2, device=0, workers=0, lr0=0.001, name=step5_output_model_path)
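To give a feel for what distilling on a list of layers means, here is a deliberately simplified sketch of a feature-based distillation loss: the usual task loss plus a weighted sum of mean squared errors between student and teacher features at each listed layer. This is not the 'mgd' loss used above, which as I understand it additionally masks the student features and passes them through a small generation block; the flat lists, the function names, and the `weight` parameter are all assumptions for illustration.

```python
def mse(student, teacher):
    """Mean squared error between two equally sized flat feature lists."""
    if len(student) != len(teacher):
        raise ValueError("feature sizes differ; an alignment layer would be needed")
    return sum((s - t) ** 2 for s, t in zip(student, teacher)) / len(student)


def distillation_loss(task_loss, student_feats, teacher_feats, layers, weight=1.0):
    """Task loss plus a weighted sum of per-layer feature MSE terms."""
    feature_term = sum(mse(student_feats[name], teacher_feats[name]) for name in layers)
    return task_loss + weight * feature_term


if __name__ == "__main__":
    layers = ["6", "8"]
    student = {"6": [1.0, 2.0], "8": [0.0, 0.0]}
    teacher = {"6": [1.0, 4.0], "8": [1.0, 1.0]}
    print(distillation_loss(0.5, student, teacher, layers, weight=0.5))
```

With a pruned student, the feature shapes at a given layer will generally differ from the teacher's, so real implementations typically insert an alignment layer before comparing them.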
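3. Training function parameters:
- `data`: path to the dataset configuration file.
- `device`: device used for training, e.g. a GPU index.
- `imgsz`: input image size.
- `epochs`: number of training epochs.
- `batch`: number of samples per batch.
- `workers`: number of workers used for data loading.
- `save_period`: how often, in epochs, a checkpoint is saved.
- `name`: where the run's outputs are saved (the script passes a full path here).
- `amp`: whether to use automatic mixed precision (AMP) training.
- `Distillation`: the teacher model used for knowledge distillation.
- `loss_type`: the type of distillation loss ('mgd' in this script).
- `layers`: the layers on which distillation is performed.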
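Most of these arguments repeat across the five steps, so one option is to keep the shared values in a single dictionary and layer the step-specific ones on top. This is just a sketch of that idea: `COMMON_ARGS` and `train_args` are hypothetical names, and the values are copied from the calls in this post.

```python
# Values shared by the train calls in this post.
COMMON_ARGS = {"data": "data.yaml", "device": "0", "imgsz": 640, "batch": 2, "workers": 0}


def train_args(**overrides):
    """Return the shared arguments with step-specific values layered on top."""
    return {**COMMON_ARGS, **overrides}


step1_args = train_args(epochs=50, save_period=1)
step5_args = train_args(epochs=300, imgsz=1280, amp=False, lr0=0.001, loss_type="mgd")
print(step1_args)
print(step5_args)
```

Each step would then call `model.train(**step1_args)` and so on, which keeps settings such as `batch` and `workers` consistent across stages.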
The complete script, for reference:
from ultralytics import YOLO
import os
from utils.yolo.attention import add_attention

root = os.getcwd()
## Configure step1 paths
pretrained_model_path = os.path.join(root, "yolo11n.pt")
yaml_path = os.path.join(root, "data.yaml")
## Configure step2 paths
step1_train_model_path = os.path.join(root, 'runs/detect/train3/weights/best.pt')
step2_constraint_train_model_path = os.path.join(root, "runs/detect/Constraint")
## Configure step3 paths
pruning_rate = 0.8
step3_prune_before_model_path = os.path.join(step2_constraint_train_model_path, "weights/last.pt")
step3_prune_after_model_path = os.path.join(step2_constraint_train_model_path, "weights/prune.pt")
## Configure step4 paths
step4_finetune_model_path = os.path.join(root, "runs/detect/finetune")
## Configure step5 paths
step5_teacher_model_path = step1_train_model_path
step5_student_model_path = os.path.join(step4_finetune_model_path, 'weights/best.pt')
step5_output_model_path = os.path.join(root, "runs/detect/student")


def step1_train():
    model = YOLO(pretrained_model_path)
    model.train(data=yaml_path, device="0", imgsz=640, epochs=50, batch=2, workers=0, save_period=1)  # train the model


def step2_constraint_train():
    model = YOLO(step1_train_model_path)
    model.train(data=yaml_path, device="0", imgsz=640, epochs=50, batch=2, amp=False, workers=0, save_period=1,
                name=step2_constraint_train_model_path)  # train the model


def step3_pruning():
    # from utils.yolo.seg_pruning import do_pruning  # use for seg
    from utils.yolo.det_pruning import do_pruning  # use for det
    do_pruning(step3_prune_before_model_path, step3_prune_after_model_path, pruning_rate)


def step4_finetune():
    model = YOLO(step3_prune_after_model_path)  # load the pruned model
    for param in model.parameters():
        param.requires_grad = True  # make sure every parameter is trainable
    model.train(data=yaml_path, device="0", imgsz=640, epochs=200, batch=2, workers=0,
                name=step4_finetune_model_path)  # train the model


def step5_distillation():
    layers = ["6", "8", "13", "16", "19", "22"]
    model_t = YOLO(step5_teacher_model_path)  # the teacher model
    model_s = YOLO(step5_student_model_path)  # the student model
    model_s = add_attention(model_s)
    model_s.train(data=yaml_path, Distillation=model_t.model, loss_type='mgd', layers=layers, amp=False, imgsz=1280,
                  epochs=300,
                  batch=2, device=0, workers=0, lr0=0.001, name=step5_output_model_path)


if __name__ == '__main__':
    # step1_train()
    # step2_constraint_train()
    # step3_pruning()
    # step4_finetune()
    step5_distillation()
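Since each step depends on files written by an earlier one, a quick pre-flight check can save a failed launch. The sketch below lists, for each step, the inputs implied by the path configuration above and reports which ones are missing. `PREREQUISITES` and `missing_inputs` are hypothetical names, and the relative paths simply mirror the configuration in the script.

```python
import os

# Files each step expects to find, relative to the project root (taken from the config above).
PREREQUISITES = {
    "step1_train": ["yolo11n.pt", "data.yaml"],
    "step2_constraint_train": ["runs/detect/train3/weights/best.pt", "data.yaml"],
    "step3_pruning": ["runs/detect/Constraint/weights/last.pt"],
    "step4_finetune": ["runs/detect/Constraint/weights/prune.pt", "data.yaml"],
    "step5_distillation": [
        "runs/detect/train3/weights/best.pt",
        "runs/detect/finetune/weights/best.pt",
        "data.yaml",
    ],
}


def missing_inputs(step, root):
    """List the prerequisite files of `step` that do not exist under `root`."""
    return [rel for rel in PREREQUISITES[step] if not os.path.isfile(os.path.join(root, rel))]


if __name__ == "__main__":
    for step in PREREQUISITES:
        print(step, missing_inputs(step, os.getcwd()))
```

An empty list for a step means its inputs are in place; anything else points to the earlier step that still needs to run or to a path that needs adjusting.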