! Arguments marked in red are the ones worth tuning for your own training; the defaults are fine for everything else.
Train Settings
- arg:\textcolor{red}{model}
type:str
default:None
Description:
Specifies the model file for training. Accepts a path to either a .pt
pretrained model or a .yaml configuration file. Essential for defining
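the model structure or initializing weights.
explain:Selects the model to train. The value can be a path to a .pt or .yaml file. It can also be left unset, in which case the framework seems to load the latest YOLO model by default (YOLO11 at the time of writing).
sample:model.train(model='./weights/yolov8n_pretrained.pt')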
- arg:\textcolor{red}{data}
type:str
default:None
Description:
Path to the dataset configuration file (e.g., coco8.yaml). This file
contains dataset-specific parameters, including paths to training and
validation data, class names, and number of classes.
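explain:Points to the dataset configuration file, which should list the paths to the training and validation data, plus each class name and its index.
sample:model.train(data='./data/datasets/cowboy_outfits/cowboy_outfits.yaml')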
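As a rough illustration of what such a file contains, the sketch below writes a minimal dataset configuration by hand. The keys path, train, val and names follow the layout used by files such as coco8.yaml; the helper write_dataset_yaml, the sub-directory names and the two class names are made up for this example.

```python
from pathlib import Path
import tempfile


def write_dataset_yaml(target: Path, root: str, names: list[str]) -> str:
    """Write a minimal dataset config and return its text."""
    lines = [
        f"path: {root}",        # dataset root directory
        "train: images/train",  # training images, relative to path
        "val: images/val",      # validation images, relative to path
        "names:",               # class index -> class name
    ]
    lines += [f"  {idx}: {name}" for idx, name in enumerate(names)]
    text = "\n".join(lines) + "\n"
    target.write_text(text, encoding="utf-8")
    return text


# Hypothetical class names, for illustration only.
yaml_path = Path(tempfile.mkdtemp()) / "cowboy_outfits.yaml"
yaml_text = write_dataset_yaml(yaml_path, "./data/datasets/cowboy_outfits", ["hat", "boot"])
print(yaml_text)
```

The resulting path is what would be passed as data=... to model.train.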
- arg:epochs
type:int
default:100
explain:Self-explanatory: the total number of training epochs.
- arg:time
type:float
default:None
Description:
Maximum training time in hours. If set, this overrides the epochs argument, allowing training to automatically stop after the specified duration. Useful for time-constrained training scenarios.
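explain:Upper limit on training time, in hours. Training stops once the limit is reached even if the epochs have not finished, which suits setups with a limited GPU time budget.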
- arg:patience
type:int
default:100
Description:
Number of epochs to wait without improvement in validation metrics before early stopping the training. Helps prevent overfitting by stopping training when performance plateaus.
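explain:If the validation metrics fail to improve for patience consecutive epochs, the model is considered to have plateaued and training stops early, which avoids overfitting from further training.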
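The stopping rule can be sketched in a few lines. This is a simplified illustration of the idea, not the framework's own implementation; the class name EarlyStopper and the toy scores are invented for the example.

```python
class EarlyStopper:
    """Stop when the metric has not improved for `patience` epochs."""

    def __init__(self, patience: int = 100) -> None:
        self.patience = patience
        self.best_metric = float("-inf")
        self.best_epoch = 0

    def step(self, epoch: int, metric: float) -> bool:
        """Record this epoch's metric; return True if training should stop."""
        if metric > self.best_metric:
            self.best_metric = metric
            self.best_epoch = epoch
        return (epoch - self.best_epoch) >= self.patience


# Toy validation scores: improvement stalls after epoch 2.
scores = [0.30, 0.42, 0.47, 0.46, 0.47, 0.45, 0.46]
stopper = EarlyStopper(patience=3)
stopped_at = None
for epoch, score in enumerate(scores):
    if stopper.step(epoch, score):
        stopped_at = epoch
        break
print(stopped_at, stopper.best_epoch)
```

With patience=3 the loop stops three epochs after the last improvement, so a larger patience tolerates longer plateaus.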
- arg:batch
type:int
default:16
explain:Self-explanatory: the batch size.
- arg:imgsz
type:int or list
default:640
explain:The image size that all samples are resized to.
- arg:save
type:bool
default:True
Description:
Enables saving of training checkpoints and final model weights. Useful for resuming training or model deployment.
explain:Saves training checkpoints and the final weights, which is what makes resuming training or deploying the model possible.
- arg:\textcolor{red}{save\_period}
type:int
default:-1
Description:
Frequency of saving model checkpoints, specified in epochs. A value of -1 disables this feature. Useful for saving interim models during long training sessions.
explain:How often checkpoints are saved, in epochs. For example, save_period=10 keeps a checkpoint every 10 epochs; -1 turns the feature off.
- arg:cache
type:bool
default:False
Description:
Enables caching of dataset images in memory (True/ram), on disk (disk), or disables it (False). Improves training speed by reducing disk I/O at the cost of increased memory usage.
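explain:Whether to cache the dataset images. Caching speeds up data loading, at the cost of higher system memory (RAM) usage, or disk space when cache=disk.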
- arg:\textcolor{red}{device}
type:int or list or str
default:None
Description:
Specifies the computational device(s) for training: a single GPU (device=0), multiple GPUs (device=0,1), CPU (device=cpu), or MPS for Apple silicon (device=mps).
explain:Selects the training device: device=0 for a single GPU, device=0,1 for multiple GPUs, device=cpu for the CPU, and device=mps for Apple silicon.
- arg:workers
type:int
default:8
Description:
Number of worker threads for data loading (per RANK if Multi-GPU training). Influences the speed of data preprocessing and feeding into the model, especially useful in multi-GPU setups.
explain:Number of worker processes used to load data. More workers generally load data faster, which matters most in multi-GPU training.
- arg:project
type:str
default:None
Description:
Name of the project directory where training outputs are saved. Allows for organized storage of different experiments.
explain:Directory that training outputs are saved to; the default appears to be the 'runs' folder.
- arg:\textcolor{red}{name}
type:str
default:None
Description:
Name of the training run. Used for creating a subdirectory within the project folder, where training logs and outputs are stored.
explain:Names this training run so its logs and outputs are easy to find; the default appears to be 'train'.
- arg:\textcolor{red}{exist\_ok}
type:bool
default:False
Description:
If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs.
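explain:If a run directory with the same name already exists, exist_ok=True overwrites it; otherwise a new directory is created for the new run. Set it to True when only the latest run's output needs to be kept.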
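How project, name and exist_ok interact can be sketched as follows. The function resolve_run_dir is hypothetical, and the numeric suffix scheme (train, train2, train3, ...) is an assumption based on the directory names the framework typically produces; treat it as an illustration of the behavior rather than the library's code.

```python
from pathlib import Path
import tempfile


def resolve_run_dir(project: str, name: str, exist_ok: bool = False) -> Path:
    """Return the output directory for a run, creating it on disk."""
    run_dir = Path(project) / name
    if run_dir.exists() and not exist_ok:
        # Append 2, 3, ... until an unused directory name is found.
        n = 2
        while (Path(project) / f"{name}{n}").exists():
            n += 1
        run_dir = Path(project) / f"{name}{n}"
    run_dir.mkdir(parents=True, exist_ok=True)
    return run_dir


project = tempfile.mkdtemp()  # stand-in for a 'runs' folder
first = resolve_run_dir(project, "train")
second = resolve_run_dir(project, "train")                 # name taken, new directory
reused = resolve_run_dir(project, "train", exist_ok=True)  # reuse the existing one
print(first.name, second.name, reused.name)
```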
- arg:pretrained
type:bool or str
default:True
Description:
Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance.
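explain:Decides whether training starts from pretrained weights. This seems to overlap with the first argument, model, which can already supply pretrained weights, so it is unclear to me what pretrained adds in that case.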
- arg:optimizer
type:str
default:'auto'
Description:
Choice of optimizer for training. Options include SGD, Adam, AdamW, NAdam, RAdam, RMSProp etc., or auto for automatic selection based on model configuration. Affects convergence speed and stability.
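explain:The optimizer. Choose one explicitly (Adam, SGD, and so on) or leave it at 'auto', in which case the framework picks a suitable optimizer for the model. With 'auto' the learning rate and momentum are also set automatically, so manual values for lr0 and momentum (both listed further down) have no effect.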
- arg:seed
type:int
default:0
Description:
Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations.
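explain:Sets the random seed (presumably used by data augmentation and similar steps) so that runs with the same configuration are reproducible.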
- arg:deterministic
type:bool
default:True
Description:
Forces deterministic algorithm use, ensuring reproducibility but may affect performance and speed due to the restriction on non-deterministic algorithms.
explain:Forces deterministic algorithms. This guarantees reproducibility but may cost some performance and speed.
- arg:single_cls
type:bool
default:False
Description:
Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification.
explain:Treats every class as one class, which reduces the task to a binary one. Use it when you only need to find objects and do not care what they are.
- arg:classes
type:list[int]
default:None
Description:
Specifies a list of class IDs to train on. Useful for filtering out and focusing only on certain classes during training.
explain:Trains only on the listed class IDs and ignores all other classes.
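The effect of classes and single_cls on the labels can be pictured with a small filter. The function filter_labels and the (class_id, x, y, w, h) tuple layout are assumptions made for this sketch; the framework applies the equivalent logic inside its dataset loader.

```python
def filter_labels(labels, classes=None, single_cls=False):
    """labels: list of (class_id, x, y, w, h) tuples."""
    kept = []
    for class_id, *box in labels:
        if classes is not None and class_id not in classes:
            continue  # drop objects of classes we are not training on
        if single_cls:
            class_id = 0  # collapse every remaining class into class 0
        kept.append((class_id, *box))
    return kept


labels = [(0, 0.5, 0.5, 0.2, 0.2), (2, 0.3, 0.3, 0.1, 0.1), (5, 0.7, 0.7, 0.4, 0.4)]
only_some = filter_labels(labels, classes=[0, 2])
merged = filter_labels(labels, single_cls=True)
print(only_some)
print(merged)
```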
- arg:rect
type:bool
default:False
Description:
Enables rectangular training, optimizing batch composition for minimal padding. Can improve efficiency and speed but may affect model accuracy.
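explain:Enables rectangular training: each batch is assigned the shape that best fits its images, so resizing to that shape needs the least padding. This can improve efficiency and speed but may affect accuracy.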
- arg:\textcolor{red}{multi\_scale}
type:bool
default:False
Description:
Enables multi-scale training by increasing/decreasing imgsz by up to a factor of 0.5 during training. Trains the model to be more accurate with multiple imgsz during inference.
explain:Varies imgsz during training, up or down by as much as a factor of 0.5, so the model stays accurate at several image sizes at inference time.
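A minimal sketch of the size sampling, assuming the training size is drawn uniformly between 0.5x and 1.5x imgsz and rounded down to a multiple of the model stride. The stride of 32 and the function name sample_train_size are assumptions for this example.

```python
import random


def sample_train_size(imgsz: int = 640, stride: int = 32, rng=random) -> int:
    """Pick a training size between 0.5x and 1.5x imgsz, aligned to stride."""
    low = int(imgsz * 0.5)
    high = int(imgsz * 1.5)
    size = rng.randint(low, high)
    return max(stride, size // stride * stride)  # round down to a stride multiple


rng = random.Random(0)  # fixed seed so the example is repeatable
sizes = [sample_train_size(640, 32, rng) for _ in range(5)]
print(sizes)
```

With imgsz=640 every sampled size falls between 320 and 960.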
- arg:\textcolor{red}{cos\_lr}
type:bool
default:False
Description:
Utilizes a cosine learning rate scheduler, adjusting the learning rate following a cosine curve over epochs. Helps in managing learning rate for better convergence.
explain:Whether to use a cosine learning rate scheduler; it may make convergence easier.
- arg:close_mosaic
type:int
default:10
Description:
Disables mosaic data augmentation in the last N epochs to stabilize training before completion. Setting to 0 disables this feature.
explain:Turns mosaic augmentation off for the last N epochs so training stabilizes before it ends; 0 disables this behavior.
- arg:\textcolor{red}{resume}
type:bool
default:False
Description:
Resumes training from the last saved checkpoint. Automatically loads model weights, optimizer state, and epoch count, continuing training seamlessly.
explain:Continues from the last saved checkpoint, restoring the model weights, optimizer state, and epoch count automatically.
- arg:amp
type:bool
default:True
Description:
Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy.
explain:Whether to use mixed precision. It can speed up training, with a small chance of affecting accuracy.
- arg:fraction
type:float
default:1.0
Description:
Specifies the fraction of the dataset to use for training. Allows for training on a subset of the full dataset, useful for experiments or when resources are limited.
explain:Share of the dataset used for training. Training on a subset is handy for quick experiments or when resources are limited.
- arg:profile
type:bool
default:False
Description:
Enables profiling of ONNX and TensorRT speeds during training, useful for optimizing model deployment.
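explain:Measures inference speed for ONNX (Open Neural Network Exchange format) and TensorRT (NVIDIA's deep learning inference optimizer) during training, which helps when tuning for deployment.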
- arg:freeze
type:int or list
default:None
Description:
Freezes the first N layers of the model or specified layers by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning.
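explain:Freezes the first N layers, or the layers at the given indices, which cuts the number of trainable parameters. Handy for fine-tuning and transfer learning.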
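The two accepted forms of freeze (an int or a list of indices) can be illustrated by working out which parameters would be frozen. The parameter names below, in a "model.<layer index>.<...>" layout, and the helper frozen_param_names are hypothetical; a real run would then disable gradients for the matching parameters.

```python
def frozen_param_names(param_names, freeze=None):
    """Return the parameter names that `freeze` would exclude from training."""
    if freeze is None:
        layer_ids = []
    elif isinstance(freeze, int):
        layer_ids = list(range(freeze))  # first N layers
    else:
        layer_ids = list(freeze)         # explicit layer indices
    prefixes = tuple(f"model.{i}." for i in layer_ids)
    return [name for name in param_names if name.startswith(prefixes)]


# Hypothetical parameter names, for illustration only.
names = ["model.0.conv.weight", "model.1.conv.weight", "model.2.cv1.weight", "model.10.cv2.weight"]
first_two = frozen_param_names(names, freeze=2)
by_index = frozen_param_names(names, freeze=[1, 10])
print(first_two)
print(by_index)
```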
- arg:lr0
type:float
default:0.01
Description:
Initial learning rate (e.g. SGD=1E-2, Adam=1E-3). Adjusting this value is crucial for the optimization process, influencing how rapidly model weights are updated.
explain:The initial learning rate.
- arg:lrf
type:float
default:0.01
Description:
Final learning rate as a fraction of the initial rate = (lr0 * lrf), used in conjunction with schedulers to adjust the learning rate over time.
explain:The final learning rate expressed as a fraction of the initial one, i.e. lr0 * lrf. It works together with the learning rate scheduler to adjust the learning rate over time.
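To make the relationship between lr0, lrf and cos_lr concrete, here is a minimal sketch of a schedule that starts at lr0 and ends at lr0 * lrf, either along a straight line or along half a cosine period. The function lr_at_epoch is an illustration under those assumptions, not the framework's scheduler code, and it ignores warmup.

```python
import math


def lr_at_epoch(epoch: int, epochs: int, lr0: float = 0.01, lrf: float = 0.01,
                cos_lr: bool = False) -> float:
    """Learning rate at `epoch`, decaying from lr0 to lr0 * lrf."""
    progress = epoch / epochs
    if cos_lr:
        # Cosine curve: factor goes from 1.0 to lrf along half a cosine period.
        factor = lrf + (1.0 - lrf) * (1 + math.cos(math.pi * progress)) / 2
    else:
        # Straight line: factor goes from 1.0 to lrf linearly.
        factor = lrf + (1.0 - lrf) * (1 - progress)
    return lr0 * factor


for epoch in (0, 25, 50, 100):
    print(epoch, lr_at_epoch(epoch, 100), lr_at_epoch(epoch, 100, cos_lr=True))
```

Both variants share the same start and end values; the cosine version keeps the learning rate higher early on and drops it faster later.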
- arg:momentum
type:float
default:0.937
Description:
Momentum factor for SGD or beta1 for Adam optimizers, influencing the incorporation of past gradients in the current update.
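explain:The momentum factor for SGD, or beta1 for Adam; it controls how much past gradients contribute to the current update. Because momentum combines gradients from several past steps instead of relying on the current one alone, it smooths out noise and makes updates more stable.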
- arg:weight_decay
type:float
default:0.0005
Description:
L2 regularization term, penalizing large weights to prevent overfitting.
explain:Adds a regularization term to prevent overfitting.
- arg:warmup_epochs
type:float
default:3.0
Description:
Number of epochs for learning rate warmup, gradually increasing the learning rate from a low value to the initial learning rate to stabilize training early on.
explain:Number of warmup epochs, during which the learning rate rises gradually from a low value to the initial learning rate to stabilize early training.
- arg:warmup_momentum
type:float
default:0.8
Description:
Initial momentum for warmup phase, gradually adjusting to the set momentum over the warmup period.
explain:Momentum at the start of warmup; it moves gradually to the configured momentum over the warmup period.
- arg:warmup_bias_lr
type:float
default:0.1
Description:
Learning rate for bias parameters during the warmup phase, helping stabilize model training in the initial epochs.
explain:Learning rate of the bias parameters during warmup; helps stabilize the first epochs.
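The three warmup arguments can be pictured as linear interpolation over the warmup iterations. The sketch below assumes that the weight learning rate starts at 0, that the bias learning rate starts at warmup_bias_lr, and that both end at lr0 (scheduler decay during warmup is ignored). The function name and the 100 iterations per epoch are invented for the example.

```python
def warmup_values(step, warmup_steps, lr0=0.01, momentum=0.937,
                  warmup_momentum=0.8, warmup_bias_lr=0.1):
    """Linearly interpolated (weight_lr, bias_lr, momentum) at a warmup step."""
    t = min(step / warmup_steps, 1.0)  # 0.0 at the start, 1.0 once warmup ends
    weight_lr = t * lr0                                    # rises from 0 to lr0
    bias_lr = warmup_bias_lr + t * (lr0 - warmup_bias_lr)  # moves from warmup_bias_lr to lr0
    mom = warmup_momentum + t * (momentum - warmup_momentum)
    return weight_lr, bias_lr, mom


# Assume 3 warmup epochs of 100 iterations each (hypothetical dataset size).
warmup_steps = 3 * 100
for step in (0, 150, 300):
    print(step, warmup_values(step, warmup_steps))
```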
- arg:box
type:float
default:7.5
Description:
Weight of the box loss component in the loss function, influencing how much emphasis is placed on accurately predicting bounding box coordinates.
explain:Weight of the box loss term in the loss function; it sets how much the model emphasizes accurate bounding box coordinates.
- arg:cls
type:float
default:0.5
Description:
Weight of the classification loss in the total loss function, affecting the importance of correct class prediction relative to other components.
explain:Weight of the classification loss in the total loss; it sets how much correct class prediction matters relative to the other terms.
- arg:dfl
type:float
default:1.5
Description:
Weight of the distribution focal loss, used in certain YOLO versions for fine-grained classification.
explain:Weight of the distribution focal loss, which some YOLO versions use for fine-grained classification.
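For a detection model, box, cls and dfl act as multipliers on the individual loss terms before they are summed. A minimal sketch, with invented per-term loss values and a hypothetical helper name:

```python
def total_detection_loss(box_loss, cls_loss, dfl_loss, box=7.5, cls=0.5, dfl=1.5):
    """Weighted sum of the three detection loss terms."""
    return box * box_loss + cls * cls_loss + dfl * dfl_loss


# Toy per-term loss values, invented for the example.
raw = {"box_loss": 1.2, "cls_loss": 2.0, "dfl_loss": 1.0}
default_total = total_detection_loss(**raw)
box_heavy_total = total_detection_loss(**raw, box=10.0)
print(default_total, box_heavy_total)
```

Raising one weight increases that term's share of the gradient, so the model trades accuracy on the other terms for it.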
- arg:pose
type:float
default:12.0
Description:
Weight of the pose loss in models trained for pose estimation, influencing the emphasis on accurately predicting pose keypoints.
explain:Pose estimation only. Weight of the pose loss; it sets how much the model emphasizes accurate keypoint predictions.
- arg:kobj
type:float
default:2.0
Description:
Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy.
explain:Weight of the keypoint objectness loss in pose estimation models; balances detection confidence against pose accuracy.
- arg:nbs
type:int
default:64
Description:
Nominal batch size for normalization of loss.
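explain:The nominal batch size used to normalize the loss. When batch is smaller than nbs, gradients are accumulated over about nbs / batch iterations before each optimizer step, so an update behaves as if the batch held 64 samples, which makes updates smoother.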
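The arithmetic behind nbs can be sketched as below. The accumulation count follows from nbs / batch; the weight decay scaling is included as an assumption about how the trainer keeps regularization comparable across batch sizes, and accumulation_plan is a hypothetical helper.

```python
def accumulation_plan(batch: int, nbs: int = 64, weight_decay: float = 0.0005):
    """Return (accumulate, effective_batch, scaled_weight_decay)."""
    accumulate = max(round(nbs / batch), 1)  # iterations per optimizer step
    effective_batch = batch * accumulate     # samples contributing to one step
    scaled_wd = weight_decay * effective_batch / nbs
    return accumulate, effective_batch, scaled_wd


for batch in (8, 16, 64, 128):
    print(batch, accumulation_plan(batch))
```

With the default batch=16 this gives 4 accumulation steps and an effective batch of 64; a batch larger than nbs is never split, it just steps every iteration.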
- arg:overlap_mask
type:bool
default:True
Description:
Determines whether object masks should be merged into a single mask for training, or kept separate for each object. In case of overlap, the smaller mask is overlaid on top of the larger mask during merge.
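explain:Instance segmentation only. Decides whether the masks of all objects are merged into a single mask for training or kept separate per object. Where masks overlap, the smaller mask is laid over the larger one during merging.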
- arg:mask_ratio
type:int
default:4
Description:
Downsample ratio for segmentation masks, affecting the resolution of masks used during training.
explain:Downsampling ratio of the segmentation masks; it determines the mask resolution used during training.
- arg:dropout
type:float
default:0.0
Description:
Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training.
explain:Dropout rate used for regularization in classification tasks; randomly dropping units during training helps prevent overfitting.
- arg:\textcolor{red}{val}
type:bool
default:True
Description:
Enables validation during training, allowing for periodic evaluation of model performance on a separate dataset.
explain:Runs validation during training so model performance is evaluated periodically on a separate dataset.
- arg:\textcolor{red}{plots}
type:bool
default:False
Description:
Generates and saves plots of training and validation metrics, as well as prediction examples, providing visual insights into model performance and learning progression.
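explain:Generates and saves plots of the training and validation metrics along with prediction examples, giving a visual view of model performance and learning progress.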
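Putting the red-marked arguments together, a training call might look like the sketch below. The values are placeholders rather than recommendations, the run name "baseline" is made up, and the helper functions are hypothetical. The ultralytics import sits inside run_training so the snippet can be read and loaded without the package installed, and the actual training call is left commented out.

```python
def build_train_kwargs(data: str, run_name: str) -> dict:
    """Collect the red-marked arguments into keyword arguments for train()."""
    return {
        "data": data,        # dataset configuration file
        "epochs": 100,
        "device": 0,         # first GPU; use "cpu" if no GPU is available
        "name": run_name,    # subdirectory for this run's outputs
        "exist_ok": True,    # overwrite the previous run of the same name
        "save_period": 10,   # extra checkpoint every 10 epochs
        "multi_scale": False,
        "cos_lr": True,      # cosine learning rate schedule
        "resume": False,
        "val": True,         # validate during training
        "plots": True,       # save metric plots and prediction examples
    }


def run_training(weights: str, kwargs: dict):
    """Start training; requires the ultralytics package to be installed."""
    from ultralytics import YOLO  # imported lazily so the sketch loads without it

    model = YOLO(weights)  # the model argument: a .pt or .yaml path
    return model.train(**kwargs)


kwargs = build_train_kwargs("coco8.yaml", "baseline")
print(kwargs)
# run_training("yolo11n.pt", kwargs)  # uncomment to actually train
```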
Augmentation Settings and Hyperparameters
! These rarely need changing; it is enough to know what each argument does.
Argument | Type | Default | Range | Description | Explain |
---|---|---|---|---|---|
hsv_h | float | 0.015 | 0.0 - 1.0 | Adjusts the hue of the image by a fraction of the color wheel, introducing color variability. Helps the model generalize across different lighting conditions. | Shifts the hue by a fraction of the color wheel to add color variety. |
hsv_s | float | 0.7 | 0.0 - 1.0 | Alters the saturation of the image by a fraction, affecting the intensity of colors. Useful for simulating different environmental conditions. | Changes the saturation by a fraction, i.e. how vivid the colors are. |
hsv_v | float | 0.4 | 0.0 - 1.0 | Modifies the value (brightness) of the image by a fraction, helping the model to perform well under various lighting conditions. | Changes the value (brightness) by a fraction. |
degrees | float | 0.0 | -180 - +180 | Rotates the image randomly within the specified degree range, improving the model’s ability to recognize objects at various orientations. | Random rotation within the given range of degrees. |
translate | float | 0.1 | 0.0 - 1.0 | Translates the image horizontally and vertically by a fraction of the image size, aiding in learning to detect partially visible objects. | Horizontal and vertical shift, as a fraction of the image size. |
scale | float | 0.5 | >=0.0 | Scales the image by a gain factor, simulating objects at different distances from the camera. | Zooms the image by a gain factor. |
shear | float | 0.0 | -180 - +180 | Shears the image by a specified degree, mimicking the effect of objects being viewed from different angles. | Shears the image by the given angle. |
perspective | float | 0.0 | 0.0 - 0.001 | Applies a random perspective transformation to the image, enhancing the model’s ability to understand objects in 3D space. | Random perspective warp. |
flipud | float | 0.0 | 0.0 - 1.0 | Flips the image upside down with the specified probability, increasing the data variability without affecting the object’s characteristics. | Probability of an upside-down flip. |
fliplr | float | 0.5 | 0.0 - 1.0 | Flips the image left to right with the specified probability, useful for learning symmetrical objects and increasing dataset diversity. | Probability of a left-to-right flip. |
bgr | float | 0.0 | 0.0 - 1.0 | Flips the image channels from RGB to BGR with the specified probability, useful for increasing robustness to incorrect channel ordering. | Probability of swapping the channels from RGB to BGR. |
mosaic | float | 1.0 | 0.0 - 1.0 | Combines four training images into one, simulating different scene compositions and object interactions. Highly effective for complex scene understanding. | Stitches four training images into one. |
mixup | float | 0.0 | 0.0 - 1.0 | Blends two images and their labels, creating a composite image. Enhances the model’s ability to generalize by introducing label noise and visual variability. | Blends two images together with their labels. |
copy_paste | float | 0.0 | 0.0 - 1.0 | Copies and pastes objects across images, useful for increasing object instances and learning object occlusion. Requires segmentation labels. | Copies objects from one image into another; needs segmentation labels. |
copy_paste_mode | str | 'flip' | - | Copy-Paste augmentation method selection among the options of ("flip", "mixup"). | Picks the copy-paste method: "flip" or "mixup". |
auto_augment | str | 'randaugment' | - | Automatically applies a predefined augmentation policy (randaugment, autoaugment, augmix), optimizing for classification tasks by diversifying the visual features. | Applies a predefined policy (randaugment, autoaugment, or augmix); for classification tasks. |
erasing | float | 0.4 | 0.0 - 0.9 | Randomly erases a portion of the image during classification training, encouraging the model to focus on less obvious features for recognition. | Randomly erases part of the image; for classification training. |
crop_fraction | float | 1.0 | 0.1 - 1.0 | Crops the classification image to a fraction of its size to emphasize central features and adapt to object scales, reducing background distractions. | Crops the classification image to a fraction of its size. |
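To give a feel for how the probability-style and gain-style arguments behave, the sketch below draws random augmentation decisions for one image at a time. It assumes the HSV arguments act as multiplicative gains in the range [1 - g, 1 + g] and that the flip arguments are plain probabilities; sample_augmentation is a hypothetical helper, not part of the framework, and no image is actually modified.

```python
import random


def sample_augmentation(rng, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, fliplr=0.5, flipud=0.0):
    """Draw one set of random augmentation decisions for a single image."""
    return {
        # Multiplicative gains in [1 - g, 1 + g] for hue, saturation, value.
        "hue_gain": 1 + rng.uniform(-1, 1) * hsv_h,
        "sat_gain": 1 + rng.uniform(-1, 1) * hsv_s,
        "val_gain": 1 + rng.uniform(-1, 1) * hsv_v,
        # Flips are applied with the given probability.
        "flip_lr": rng.random() < fliplr,
        "flip_ud": rng.random() < flipud,
    }


rng = random.Random(0)  # fixed seed so the example is repeatable
draws = [sample_augmentation(rng) for _ in range(1000)]
share_flipped = sum(d["flip_lr"] for d in draws) / len(draws)
print(draws[0])
print(share_flipped)
```

With the defaults, about half of the images are mirrored left to right and none are flipped upside down.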