A Detailed Guide to the model.train() Parameters in YOLO

Latest recommended article published on 2025-06-13 07:57:19

License: CC 4.0 BY-SA

Article tags:

#YOLO #ArtificialIntelligence #MachineLearning

Note: parameters marked in red are the ones where tuning may help your own training. For the rest, the default values are usually fine.

Train Settings

arg： $\textcolor{red}{model}$
type：str
default：None
Description：

Specifies the model file for training. Accepts a path to either a .pt
pretrained model or a .yaml configuration file. Essential for defining
the model structure or initializing weights.

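explain：Specifies the model to train. The value can be the path to a .pt or .yaml file. It can also be left unset, in which case a recent default YOLO model seems to be loaded (YOLO11 at the time of writing).
sample：model.train(model='./weights/yolov8n_pretrained.pt')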

arg： $\textcolor{red}{data}$
type：str
default：None
Description：

Path to the dataset configuration file (e.g., coco8.yaml). This file
contains dataset-specific parameters, including paths to training and
validation data, class names, and number of classes.

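explain：Specifies the dataset configuration file. The file should contain the paths to the training and validation data, along with the name and index of each class.
sample：model.train(data='./data/datasets/cowboy_outfits/cowboy_outfits.yaml')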
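As a rough sketch of what such a configuration file holds, the dictionary below mirrors the usual layout of an Ultralytics dataset YAML (keys path, train, val, names). The sub-directory layout, the class names, and the helper check_data_config are made up for illustration; only the dataset folder comes from the sample above.

```python
# Illustrative only: the same structure would normally live in a .yaml file.
data_config = {
    "path": "./data/datasets/cowboy_outfits",  # dataset root (from the sample above)
    "train": "images/train",  # training images, relative to path (assumed layout)
    "val": "images/val",      # validation images, relative to path (assumed layout)
    "names": {0: "belt", 1: "boot", 2: "cowboy_hat"},  # hypothetical index -> name map
}


def check_data_config(cfg):
    """Hypothetical helper: return the number of classes after basic sanity checks."""
    for key in ("train", "val", "names"):
        if key not in cfg:
            raise KeyError(f"dataset config is missing '{key}'")
    indices = sorted(cfg["names"])
    if indices != list(range(len(indices))):
        raise ValueError("class indices should run from 0 to N-1 without gaps")
    return len(indices)


num_classes = check_data_config(data_config)
```

Here num_classes is derived from the names mapping, which is why the class names and their indices need to be complete.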

arg：epochs
type：int
default：100
explain：Self-explanatory: the total number of training epochs.

arg：time
type：float
default：None
Description：

Maximum training time in hours. If set, this overrides the epochs argument, allowing training to automatically stop after the specified duration. Useful for time-constrained training scenarios.

explain：Maximum training time, in hours. If training is still running once this much time has elapsed, it is stopped. Suitable when GPU time is limited.

arg：patience
type：int
default：100
Description：

Number of epochs to wait without improvement in validation metrics before early stopping the training. Helps prevent overfitting by stopping training when performance plateaus.

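explain：If the validation metrics show no improvement for patience consecutive epochs, the model is taken to have plateaued and training stops early, which avoids the overfitting that further training could cause.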
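The idea behind patience can be sketched in a few lines. This is a minimal illustration rather than the framework's own implementation; the class EarlyStopper, the function run_until_stop, and the use of a single "fitness" number per epoch are assumptions made for the example.

```python
class EarlyStopper:
    """Minimal patience-based early stopping (a sketch, not the library's class)."""

    def __init__(self, patience=100):
        self.patience = patience
        self.best_fitness = float("-inf")
        self.best_epoch = 0

    def should_stop(self, epoch, fitness):
        # Record a new best whenever the validation metric improves.
        if fitness > self.best_fitness:
            self.best_fitness = fitness
            self.best_epoch = epoch
        # Stop once `patience` epochs have passed since the best epoch.
        return (epoch - self.best_epoch) >= self.patience


def run_until_stop(metrics, patience):
    """Return the epoch at which training would stop, or None if it never does."""
    stopper = EarlyStopper(patience)
    for epoch, fitness in enumerate(metrics):
        if stopper.should_stop(epoch, fitness):
            return epoch
    return None


stopped_at = run_until_stop([0.30, 0.42, 0.45, 0.44, 0.45, 0.43, 0.44], patience=3)
```

With patience=3, the best value is reached at epoch 2 and nothing beats it afterwards, so the loop stops at epoch 5.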

arg：batch
type：int
default：16
Description：Self-explanatory: the number of samples per batch.

arg：imgsz
type：int or list
default：640
explain：The image size that every sample is resized to for training.

arg：save
type：bool
default：True
Description：

Enables saving of training checkpoints and final model weights. Useful for resuming training or model deployment.

explain：Saves training checkpoints and the final model weights, which is useful for resuming training or deploying the model.

arg： $\textcolor{red}{save\_period}$
type：int
default：-1
Description：

Frequency of saving model checkpoints, specified in epochs. A value of -1 disables this feature. Useful for saving interim models during long training sessions.

explain：How often checkpoints are saved, in epochs. For example, save_period=10 keeps a checkpoint every 10 epochs. A value of -1 turns this feature off.
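To make the save_period=10 example concrete, here is a small sketch that lists the epochs at which an interim checkpoint would be kept. The function name checkpoint_epochs is hypothetical, and counting epochs from 1 is an assumption; the framework's exact off-by-one behaviour may differ.

```python
def checkpoint_epochs(epochs, save_period):
    """Return the 1-based epochs at which an interim checkpoint would be written."""
    if save_period < 1:  # -1 (or any non-positive value) disables periodic saving
        return []
    return [e for e in range(1, epochs + 1) if e % save_period == 0]


saved = checkpoint_epochs(epochs=35, save_period=10)
disabled = checkpoint_epochs(epochs=35, save_period=-1)
```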

arg：cache
type：bool
default：False
Description：

Enables caching of dataset images in memory (True/ram), on disk (disk), or disables it (False). Improves training speed by reducing disk I/O at the cost of increased memory usage.

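explain：Whether to cache the dataset images. Caching speeds up data loading, but it costs extra memory: system RAM (or disk space with cache=disk) rather than GPU memory.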

arg： $\textcolor{red}{device}$
type：int or list or str
default：None
Description：

Specifies the computational device(s) for training: a single GPU (device=0), multiple GPUs (device=0,1), CPU (device=cpu), or MPS for Apple silicon (device=mps).

explain：Selects the device used for training: device=0 uses a single GPU, device=0,1 uses multiple GPUs, device=cpu uses the CPU, and device=mps uses Apple silicon.

arg：workers
type：int
default：8
Description：

Number of worker threads for data loading (per RANK if Multi-GPU training). Influences the speed of data preprocessing and feeding into the model, especially useful in multi-GPU setups.

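explain：Number of worker processes used to load data. More workers generally load data faster, as long as the CPU and disk can keep up; this matters most in multi-GPU training.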

arg：project
type：str
default：None
Description：

Name of the project directory where training outputs are saved. Allows for organized storage of different experiments.

explain：Directory where training outputs are saved. The default seems to be the 'runs' folder.

arg： $\textcolor{red}{name}$
type：str
default：None
Description：

Name of the training run. Used for creating a subdirectory within the project folder, where training logs and outputs are stored.

explain：Names this training run so that its logs and outputs are easy to find. The default appears to be 'train'.

arg： $\textcolor{red}{exist\_ok}$
type：bool
default：False
Description：

If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs.

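explain：If outputs from a run with the same name already exist, exist_ok=True overwrites them; otherwise a new directory is created for the new run. Set it to True if only the most recent run's outputs need to be kept.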
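How project, name, and exist_ok interact can be sketched as follows. This is an approximation of the directory-increment behaviour (train, train2, train3, ...), not the library's code; the helper resolve_run_dir is hypothetical and a temporary directory stands in for the project folder.

```python
from pathlib import Path
import tempfile


def resolve_run_dir(project, name="train", exist_ok=False):
    """Pick the output directory for a run (sketch of the increment behaviour)."""
    run_dir = Path(project) / name
    if run_dir.exists() and not exist_ok:
        # Append 2, 3, ... until an unused directory name is found.
        n = 2
        while Path(f"{run_dir}{n}").exists():
            n += 1
        run_dir = Path(f"{run_dir}{n}")
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


with tempfile.TemporaryDirectory() as tmp:
    first = resolve_run_dir(tmp).name
    second = resolve_run_dir(tmp).name
    third = resolve_run_dir(tmp).name
    reused = resolve_run_dir(tmp, exist_ok=True).name
```

With exist_ok=False each call gets a fresh directory, while exist_ok=True goes back to the original one and would write over its contents.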

arg：pretrained
type：bool or str
default：True
Description：

Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance.

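explain：Determines whether training starts from pretrained weights. This seems to overlap with the first parameter, model, since model can already supply pretrained weights; it is unclear to me whether pretrained still matters in that case.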

arg：optimizer
type：str
default：'auto'
Description：

Choice of optimizer for training. Options include SGD, Adam, AdamW, NAdam, RAdam, RMSProp etc., or auto for automatic selection based on model configuration. Affects convergence speed and stability.

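explain：The optimizer. Choices include Adam, SGD, and others, or 'auto', in which case the framework picks a suitable optimizer based on the model configuration. With 'auto', the learning rate (lr0) and momentum are also set automatically, so setting those two parameters by hand (see below) has no effect.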

arg：seed
type：int
default：0
Description：

Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations.

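explain：Sets the random seed (which presumably affects data augmentation, among other things), so that runs with the same configuration reproduce the same results.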

arg：deterministic
type：bool
default：True
Description：

Forces deterministic algorithm use, ensuring reproducibility but may affect performance and speed due to the restriction on non-deterministic algorithms.

explain：Forces the use of deterministic algorithms. This ensures reproducibility but may reduce performance and speed.

arg：single_cls
type：bool
default：False
Description：

Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification.

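explain：Treats all classes as one, which reduces the task to a binary (object versus background) problem. Use it when you only need to detect that an object is present, not what it is.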

arg：classes
type：list[int]
default：None
Description：

Specifies a list of class IDs to train on. Useful for filtering out and focusing only on certain classes during training.

explain：Trains only on the listed class IDs and ignores all other classes.

arg：rect
type：bool
default：False
Description：

Enables rectangular training, optimizing batch composition for minimal padding. Can improve efficiency and speed but may affect model accuracy.

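explain：Enables rectangular training: images of similar shape are grouped into each batch so that resizing them to the batch shape needs the least padding. This can improve efficiency and speed, but may affect accuracy.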

arg： $\textcolor{red}{multi\_scale}$
type：bool
default：False
Description：

Enables multi-scale training by increasing/decreasing imgsz by up to a factor of 0.5 during training. Trains the model to be more accurate with multiple imgsz during inference.

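explain：Enables multi-scale training by enlarging or shrinking imgsz by up to a factor of 0.5 during training, so that the model is more accurate across different image sizes at inference time.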
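A minimal sketch of the size sampling that multi-scale training implies: pick a random size between roughly 0.5x and 1.5x imgsz and snap it to a multiple of the model stride. The function sample_train_size is hypothetical, and the stride of 32 is an assumption typical of YOLO detectors.

```python
import random


def sample_train_size(imgsz=640, stride=32, rng=random):
    """Draw a training size in roughly [0.5, 1.5] x imgsz, snapped to the stride."""
    low = int(imgsz * 0.5)
    high = int(imgsz * 1.5) + stride  # randrange excludes the upper bound
    return rng.randrange(low, high) // stride * stride


rng = random.Random(0)  # seeded generator so the draw is repeatable
sizes = [sample_train_size(640, 32, rng) for _ in range(1000)]
```

For imgsz=640 every sampled size is a multiple of 32 between 320 and 960, so each batch is resized to one of those sizes before the forward pass.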

arg： $\textcolor{red}{cos\_lr}$
type：bool
default：False
Description：

Utilizes a cosine learning rate scheduler, adjusting the learning rate following a cosine curve over epochs. Helps in managing learning rate for better convergence.

explain：Whether to use a cosine learning rate scheduler. Enabling it may help convergence.

arg：close_mosaic
type：int
default：10
Description：

Disables mosaic data augmentation in the last N epochs to stabilize training before completion. Setting to 0 disables this feature.

explain：Disables mosaic data augmentation for the last N epochs so that training stabilizes before it finishes. Setting it to 0 disables this feature.
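The effect of close_mosaic on the schedule can be sketched as a simple predicate over the epoch index. The function mosaic_enabled is hypothetical and assumes 0-based epoch counting.

```python
def mosaic_enabled(epoch, epochs=100, close_mosaic=10):
    """True if mosaic is still applied in the given 0-based epoch."""
    if close_mosaic <= 0:  # 0 disables the feature: mosaic stays on throughout
        return True
    return epoch < epochs - close_mosaic


schedule = [mosaic_enabled(e, epochs=100, close_mosaic=10) for e in range(100)]
```

With the defaults, epochs 0 to 89 use mosaic and the final ten epochs (90 to 99) train on un-mosaicked images.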

arg： $\textcolor{red}{resume}$
type：bool
default：False
Description：

Resumes training from the last saved checkpoint. Automatically loads model weights, optimizer state, and epoch count, continuing training seamlessly.

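explain：Resumes training from the last saved checkpoint. Model weights, optimizer state, and the epoch count are loaded automatically, so training continues seamlessly.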

arg：amp
type：bool
default：True
Description：

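Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy.

explain：Whether to use mixed precision. It lowers memory usage and can speed up training, with a small chance of a slight effect on accuracy.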

arg：fraction
type：float
default：1.0
Description：

Specifies the fraction of the dataset to use for training. Allows for training on a subset of the full dataset, useful for experiments or when resources are limited.

explain：The fraction of the dataset used for training. Training on a subset of the full dataset is handy for quick experiments or when resources are limited.

arg：profile
type：bool
default：False
Description：

Enables profiling of ONNX and TensorRT speeds during training, useful for optimizing model deployment.

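explain：Profiles ONNX (Open Neural Network Exchange) and TensorRT (NVIDIA's deep learning inference optimizer) speeds during training, which helps when optimizing model deployment.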

arg：freeze
type：int or list
default：None
Description：

Freezes the first N layers of the model or specified layers by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning.

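explain：Freezes the first N layers of the model, or the layers given by index, which reduces the number of trainable parameters. Useful for fine-tuning and transfer learning.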
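The two accepted forms of freeze (an integer N or a list of indices) can be normalized as in the sketch below. The helpers frozen_layer_indices and is_frozen are hypothetical, and the parameter naming scheme "model.<layer index>.<rest>" is an assumption about how layers are named.

```python
def frozen_layer_indices(freeze):
    """Normalise the freeze argument into a list of layer indices."""
    if freeze is None:
        return []
    if isinstance(freeze, int):
        return list(range(freeze))  # first N layers: 0 .. N-1
    return list(freeze)  # explicit indices


def is_frozen(param_name, freeze):
    """Check whether a parameter belongs to a frozen layer."""
    # Assumes parameter names look like "model.<layer index>.<rest>".
    prefixes = [f"model.{i}." for i in frozen_layer_indices(freeze)]
    return any(param_name.startswith(p) for p in prefixes)
```

In a PyTorch training loop, a parameter for which is_frozen returns True would have requires_grad set to False so that the optimizer leaves it unchanged.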

arg：lr0
type：float
default：0.01
Description：

Initial learning rate (i.e. SGD=1E-2, Adam=1E-3). Adjusting this value is crucial for the optimization process, influencing how rapidly model weights are updated.

explain：The initial learning rate.

arg：lrf
type：float
default：0.01
Description：

Final learning rate as a fraction of the initial rate = (lr0 * lrf), used in conjunction with schedulers to adjust the learning rate over time.

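explain：The final learning rate expressed as a fraction of the initial one, that is, final rate = lr0 * lrf. It works together with the learning rate scheduler to adjust the learning rate over time.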
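To see how lr0, lrf, and cos_lr fit together, the sketch below computes the learning rate per epoch for a linear and a cosine decay, both running from lr0 down to lr0 * lrf. The function lr_at_epoch is hypothetical and ignores warmup; the exact curves used by the framework may differ in detail.

```python
import math


def lr_at_epoch(epoch, epochs=100, lr0=0.01, lrf=0.01, cos_lr=False):
    """Learning rate at a given epoch: decays from lr0 to lr0 * lrf."""
    if cos_lr:
        # Cosine curve from 1.0 down to lrf.
        factor = (1 - math.cos(epoch * math.pi / epochs)) / 2 * (lrf - 1.0) + 1.0
    else:
        # Straight line from 1.0 down to lrf.
        factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor
```

Both schedules start at lr0 and end at lr0 * lrf. The cosine variant keeps the rate higher early on, then drops faster in the middle of training.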

arg：momentum
type：float
default：0.937
Description：

Momentum factor for SGD or beta1 for Adam optimizers, influencing the incorporation of past gradients in the current update.

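explain：For SGD this is the momentum factor; for Adam it is the beta1 parameter. It controls how much past gradients contribute to the current update. Because momentum combines gradient information from many previous steps instead of relying on the current gradient alone, it smooths out noise and makes updates more stable.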
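The smoothing effect of momentum can be illustrated with the classic SGD recurrence v = momentum * v + g applied to a deliberately noisy gradient. This is a toy example; the function momentum_velocity and the alternating gradient are made up for illustration.

```python
def momentum_velocity(grads, momentum=0.937):
    """Run the SGD momentum recurrence v = momentum * v + g over a gradient list."""
    v = 0.0
    history = []
    for g in grads:
        v = momentum * v + g
        history.append(v)
    return history


# A noisy gradient that alternates between 1.5 and 0.5 (mean 1.0).
noisy = [1.0 + (0.5 if i % 2 == 0 else -0.5) for i in range(400)]
velocity = momentum_velocity(noisy)
```

The raw gradient swings by 50% around its mean, while the accumulated velocity settles near 1 / (1 - 0.937) and fluctuates by only a few percent from step to step.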

arg：weight_decay
type：float
default：0.0005
Description：

L2 regularization term, penalizing large weights to prevent overfitting.

explain：Adds an L2 regularization term that penalizes large weights, which helps avoid overfitting.

arg：warmup_epochs
type：float
default：3.0
Description：

Number of epochs for learning rate warmup, gradually increasing the learning rate from a low value to the initial learning rate to stabilize training early on.

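explain：Number of epochs of learning rate warmup, during which the learning rate rises gradually from a low value to the initial learning rate in order to stabilize early training.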

arg：warmup_momentum
type：float
default：0.8
Description：

Initial momentum for warmup phase, gradually adjusting to the set momentum over the warmup period.

explain：Initial momentum for the warmup phase; it is adjusted gradually to the configured momentum over the warmup period.

arg：warmup_bias_lr
type：float
default：0.1
Description：

Learning rate for bias parameters during the warmup phase, helping stabilize model training in the initial epochs.

explain：Learning rate for bias parameters during the warmup phase, which helps stabilize training in the first epochs.
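The three warmup parameters can be pictured as linear interpolations over the warmup steps. The sketch below is a simplification under stated assumptions: weights start from a learning rate of zero, biases start from warmup_bias_lr, both end at lr0, and the decay of the main scheduler during warmup is ignored. The function warmup_values is hypothetical, and warmup_steps stands for warmup_epochs times the number of batches per epoch.

```python
def warmup_values(step, warmup_steps, lr0=0.01, momentum=0.937,
                  warmup_momentum=0.8, warmup_bias_lr=0.1):
    """Linearly interpolate learning rates and momentum during warmup (sketch)."""
    t = min(step / warmup_steps, 1.0)  # progress through warmup, 0.0 -> 1.0
    return {
        "weight_lr": t * lr0,  # weights ramp up from zero (assumption)
        "bias_lr": warmup_bias_lr + t * (lr0 - warmup_bias_lr),  # biases move to lr0
        "momentum": warmup_momentum + t * (momentum - warmup_momentum),
    }


start = warmup_values(0, 300)
end = warmup_values(300, 300)
```

At step 0 the momentum is 0.8 and the bias learning rate is 0.1; by the end of warmup both learning rates sit at lr0 and the momentum has reached 0.937.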

arg：box
type：float
default：7.5
Description：

Weight of the box loss component in the loss function, influencing how much emphasis is placed on accurately predicting bounding box coordinates.

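explain：Weight of the bounding box loss term in the loss function. It controls how much emphasis the model places on predicting box coordinates accurately.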

arg：cls
type：float
default：0.5
Description：

Weight of the classification loss in the total loss function, affecting the importance of correct class prediction relative to other components.

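explain：Weight of the classification loss in the total loss, which sets the importance of correct class prediction relative to the other loss components.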

arg：dfl
type：float
default：1.5
Description：

Weight of the distribution focal loss, used in certain YOLO versions for fine-grained classification.

explain：Weight of the distribution focal loss (DFL), used in certain YOLO versions for fine-grained classification.
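For a detection model, box, cls, and dfl act as multipliers on the individual loss terms before they are summed. The sketch below shows that weighted sum with made-up component values; the function total_detection_loss is hypothetical and leaves out any batch-size scaling the framework may apply.

```python
def total_detection_loss(box_loss, cls_loss, dfl_loss, box=7.5, cls=0.5, dfl=1.5):
    """Weighted sum of the three detection loss components."""
    return box * box_loss + cls * cls_loss + dfl * dfl_loss


# Example component values (invented for illustration).
loss = total_detection_loss(box_loss=0.2, cls_loss=1.0, dfl_loss=0.4)
loss_heavier_box = total_detection_loss(0.2, 1.0, 0.4, box=15.0)
```

Doubling box from 7.5 to 15.0 adds another 7.5 * 0.2 = 1.5 to the total, so gradients from localization errors weigh more heavily relative to classification.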

arg：pose
type：float
default：12.0
Description：

Weight of the pose loss in models trained for pose estimation, influencing the emphasis on accurately predicting pose keypoints.

explain：Used for pose estimation. Weight of the pose loss, which controls how much emphasis the model places on predicting pose keypoints accurately.

arg：kobj
type：float
default：2.0
Description：

Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy.

explain：Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence against pose accuracy.

arg：nbs
type：int
default：64
Description：

Nominal batch size for normalization of loss.

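explain：Nominal batch size used to normalize the loss. As far as I can tell, when batch is smaller than nbs, gradients are accumulated over several batches (roughly 64 samples by default) before each weight update, which makes updates smoother.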
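One way to read nbs is as a target for gradient accumulation. The sketch below follows my understanding of how a trainer could derive the number of accumulation steps and rescale weight_decay from batch and nbs; the function accumulation_plan is hypothetical and the formulas should be treated as an assumption, not a specification.

```python
def accumulation_plan(batch=16, nbs=64, weight_decay=0.0005):
    """Return (accumulation steps, scaled weight decay) for a given batch size."""
    accumulate = max(round(nbs / batch), 1)  # batches per optimizer step
    scaled_wd = weight_decay * batch * accumulate / nbs
    return accumulate, scaled_wd


steps_small, wd_small = accumulation_plan(batch=16)
steps_large, wd_large = accumulation_plan(batch=128)
```

With batch=16, four batches are accumulated per optimizer step so that each update sees about 64 samples; with batch=128 there is no accumulation and the weight decay is scaled up instead.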

arg：overlap_mask
type：bool
default：True
Description：

Determines whether object masks should be merged into a single mask for training, or kept separate for each object. In case of overlap, the smaller mask is overlaid on top of the larger mask during merge.

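explain：Used for instance segmentation. Determines whether the object masks are merged into a single mask for training or kept separate for each object. Where masks overlap, the smaller mask is laid on top of the larger one during the merge.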

arg：mask_ratio
type：int
default：4
Description：

Downsample ratio for segmentation masks, affecting the resolution of masks used during training.

explain：Downsampling ratio for segmentation masks, which affects the resolution of the masks used during training.

arg：dropout
type：float
default：0.0
Description：

Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training.

explain：Dropout rate used for regularization in classification tasks. Randomly dropping units during training helps prevent overfitting.

arg： $\textcolor{red}{val}$
type：bool
default：True
Description：

Enables validation during training, allowing for periodic evaluation of model performance on a separate dataset.

explain：Enables validation during training, so that model performance is evaluated periodically on a separate dataset.

arg： $\textcolor{red}{plots}$
type：bool
default：False
Description：

Generates and saves plots of training and validation metrics, as well as prediction examples, providing visual insights into model performance and learning progression.

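explain：Generates and saves plots of training and validation metrics, along with prediction examples, giving a visual view of model performance and learning progress.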
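Putting several of the settings above together, a training call might look like the sketch below. The paths are the ones used in the samples earlier; the run name cowboy_exp and the chosen values are examples only, and the helper run_training is hypothetical. Actually running it requires the ultralytics package plus the weight and dataset files.

```python
# Example values only; adjust them to your own dataset and hardware.
train_kwargs = {
    "data": "./data/datasets/cowboy_outfits/cowboy_outfits.yaml",
    "epochs": 100,
    "device": 0,            # single GPU
    "name": "cowboy_exp",   # hypothetical run name
    "exist_ok": False,
    "save_period": 10,
    "cos_lr": True,
    "multi_scale": False,
    "val": True,
    "plots": True,
}


def run_training(weights="./weights/yolov8n_pretrained.pt", **kwargs):
    """Start training; needs the ultralytics package and the files named above."""
    from ultralytics import YOLO  # imported lazily so the dict can be inspected without it

    model = YOLO(weights)
    return model.train(**kwargs)


# To launch: run_training(**train_kwargs)
```

Keeping the settings in a dictionary makes it easy to log them or to change one value between experiments.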

Augmentation Settings and Hyperparameters

Note: these generally do not need changing; it is enough to understand what each parameter means.

Argument	Type	Default	Range	Description	Explain
hsv_h	float	0.015	0.0 - 1.0	Adjusts the hue of the image by a fraction of the color wheel, introducing color variability. Helps the model generalize across different lighting conditions.	Shifts the hue by a fraction of the color wheel to add color variety, which helps generalization under different lighting.
hsv_s	float	0.7	0.0 - 1.0	Alters the saturation of the image by a fraction, affecting the intensity of colors. Useful for simulating different environmental conditions.	Changes the saturation by a fraction, making colors more or less vivid; useful for simulating different environments.
hsv_v	float	0.4	0.0 - 1.0	Modifies the value (brightness) of the image by a fraction, helping the model to perform well under various lighting conditions.	Changes the value (brightness) by a fraction so that the model copes with varied lighting.
degrees	float	0.0	-180 - +180	Rotates the image randomly within the specified degree range, improving the model’s ability to recognize objects at various orientations.	Rotates the image by a random angle within the given range, so that objects are recognized in any orientation.
translate	float	0.1	0.0 - 1.0	Translates the image horizontally and vertically by a fraction of the image size, aiding in learning to detect partially visible objects.	Shifts the image horizontally and vertically by a fraction of its size, which helps with partially visible objects.
scale	float	0.5	>=0.0	Scales the image by a gain factor, simulating objects at different distances from the camera.	Rescales the image by a gain factor to simulate objects at different distances from the camera.
shear	float	0.0	-180 - +180	Shears the image by a specified degree, mimicking the effect of objects being viewed from different angles.	Applies a shear of the given angle, imitating objects seen from different viewpoints.
perspective	float	0.0	0.0 - 0.001	Applies a random perspective transformation to the image, enhancing the model’s ability to understand objects in 3D space.	Applies a random perspective transform, which helps the model reason about objects in 3D space.
flipud	float	0.0	0.0 - 1.0	Flips the image upside down with the specified probability, increasing the data variability without affecting the object’s characteristics.	Flips the image vertically with the given probability, adding variety without changing the objects themselves.
fliplr	float	0.5	0.0 - 1.0	Flips the image left to right with the specified probability, useful for learning symmetrical objects and increasing dataset diversity.	Flips the image horizontally with the given probability; useful for symmetric objects and for dataset diversity.
bgr	float	0.0	0.0 - 1.0	Flips the image channels from RGB to BGR with the specified probability, useful for increasing robustness to incorrect channel ordering.	Swaps the channels from RGB to BGR with the given probability, making the model more robust to wrong channel order.
mosaic	float	1.0	0.0 - 1.0	Combines four training images into one, simulating different scene compositions and object interactions. Highly effective for complex scene understanding.	Stitches four training images into one to simulate varied scene layouts and object interactions; very effective for complex scenes.
mixup	float	0.0	0.0 - 1.0	Blends two images and their labels, creating a composite image. Enhances the model’s ability to generalize by introducing label noise and visual variability.	Blends two images and their labels into a composite; the added label noise and visual variety improve generalization.
copy_paste	float	0.0	0.0 - 1.0	Copies and pastes objects across images, useful for increasing object instances and learning object occlusion. Requires segmentation labels.	Copies objects from one image and pastes them into another, adding instances and occlusion cases. Requires segmentation labels.
copy_paste_mode	str	'flip'	-	Copy-Paste augmentation method selection among the options of ("flip", "mixup").	Selects the copy-paste method: either "flip" or "mixup".
auto_augment	str	'randaugment'	-	Automatically applies a predefined augmentation policy (randaugment, autoaugment, augmix), optimizing for classification tasks by diversifying the visual features.	Applies a predefined augmentation policy (randaugment, autoaugment, or augmix) to diversify visual features for classification tasks.
erasing	float	0.4	0.0 - 0.9	Randomly erases a portion of the image during classification training, encouraging the model to focus on less obvious features for recognition.	Randomly erases part of the image during classification training, pushing the model to rely on less obvious features.
crop_fraction	float	1.0	0.1 - 1.0	Crops the classification image to a fraction of its size to emphasize central features and adapt to object scales, reducing background distractions.	Crops the classification image to a fraction of its original size to emphasize central features, adapt to object scale, and reduce background distraction.
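Most of the probabilities in the table work the same way: draw a random number and apply the transform if it falls below the setting. As a small sketch, here is what fliplr means for the labels, assuming boxes in normalized (x_center, y_center, width, height) format. The function maybe_flip_lr is hypothetical and only handles the boxes; a real pipeline would mirror the image pixels in the same step.

```python
import random


def maybe_flip_lr(boxes, fliplr=0.5, rng=random):
    """Flip normalised (x_center, y_center, w, h) boxes with probability fliplr."""
    if rng.random() >= fliplr:
        return boxes, False
    # Mirroring the image maps x_center to 1 - x_center; sizes are unchanged.
    return [(1.0 - x, y, w, h) for x, y, w, h in boxes], True


boxes = [(0.25, 0.5, 0.1, 0.2)]
flipped, did_flip = maybe_flip_lr(boxes, fliplr=1.0)   # always flips
kept, did_keep_flip = maybe_flip_lr(boxes, fliplr=0.0)  # never flips
```

Setting fliplr=0.0 is a sensible choice when left and right carry meaning in the data, for example with text or directional signs.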