Notes on Neural Network Hyperparameters.
What are Hyperparameters?
Most machine learning algorithms involve “hyperparameters”, which are variables set before actually optimizing the model’s parameters. Neural networks can have many hyperparameters, including those which specify the structure of the network itself and those which determine how the network is trained.
Setting the values of hyperparameters can be seen as model selection, i.e. choosing which model to use from the hypothesized set of possible models.
How to set Hyperparameters?
Hyperparameters are often:
- set by hand, based on experience;
- selected by a search algorithm, such as grid search or random search (see the sketch after this list);
- optimized by some “hyper-learner” (a hot research topic).
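As an illustration of the second option, here is a minimal random-search sketch. The `train_and_evaluate` function is a hypothetical stand-in for an actual training run, and the sampling ranges are illustrative assumptions, not recommendations.

```python
import random

def train_and_evaluate(lr, momentum):
    # Placeholder for a real training run that returns validation error.
    # A smooth fake response surface is used here so the sketch runs as-is.
    return (lr - 0.01) ** 2 + (momentum - 0.9) ** 2

best_error, best_config = float("inf"), None
for trial in range(20):
    # Sample the learning rate on a log scale, the momentum uniformly.
    lr = 10 ** random.uniform(-5, -1)
    momentum = random.uniform(0.5, 0.99)
    error = train_and_evaluate(lr, momentum)
    if error < best_error:
        best_error, best_config = error, (lr, momentum)

print("best config:", best_config, "error:", best_error)
```

Random search samples each trial independently, so it covers a continuous range of values rather than a fixed grid, which is often more sample-efficient when only a few hyperparameters really matter.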
Typical Hyperparameters
In particular, we will focus on feed-forward neural nets trained with mini-batch gradient descent.
Training Hyperparameters
- Learning rate: determines how quickly the parameter updates follow the gradient direction. If the learning rate is too small, the model converges too slowly; if it is too large, the model diverges.
- Momentum: a very common technique is to “smooth” the gradient updates using a leaky integrator filter with parameter $\beta$ (see the update sketch after this list).
- Loss function: compares the network’s output for a training example against the intended ground-truth output. A common general-purpose loss function is the squared Euclidean distance, given by $L = \frac{1}{2} \sum_i (y_i - z_i)^2$, where $y_i$ is the network’s output and $z_i$ is the target.
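To make these training hyperparameters concrete, here is a minimal numpy sketch of mini-batch gradient descent on a toy linear model, combining the squared Euclidean loss with a momentum (leaky integrator) buffer. The model, batch size, and hyperparameter values are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mini-batch and a linear "network" y = x @ w (illustrative only).
x = rng.normal(size=(32, 10))          # mini-batch of 32 inputs
z = rng.normal(size=(32, 1))           # ground-truth targets
w = np.zeros((10, 1))                  # model parameters

lr = 0.1         # learning rate: step size along the (smoothed) gradient
beta = 0.9       # momentum: leaky integrator coefficient
g_bar = np.zeros_like(w)               # smoothed gradient buffer

for step in range(100):
    y = x @ w                          # network output
    loss = 0.5 * np.sum((y - z) ** 2)  # L = 1/2 * sum_i (y_i - z_i)^2
    grad = x.T @ (y - z)               # dL/dw for the squared loss
    # Leaky integrator: blend the new gradient into the running average.
    g_bar = beta * g_bar + (1 - beta) * grad
    w -= lr * g_bar                    # follow the smoothed gradient

print("final loss:", loss)
```

Note that the gradient here is summed over the mini-batch; many implementations average it instead, which simply rescales the effective learning rate by the batch size.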