Optimization Algorithms: Code

This post covers the optimization algorithms commonly used to train neural networks. It starts with mini-batch gradient descent, where mini-batches are built in two steps, shuffling and partitioning, and the batch size is usually chosen as a power of two. It then covers the Momentum algorithm, which adds a momentum term to speed up convergence, along with the steps for initializing and updating the weights. Finally, it discusses the Adam optimizer, which combines the advantages of RMSProp and Momentum and updates the parameters using exponentially weighted averages of past gradients.
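
For reference, the Momentum and Adam updates mentioned above reduce to the following rules for a single parameter W with gradient dW. This is only a minimal sketch of the standard formulas; the function names, default hyperparameters, and time step t below are illustrative and not taken from the code later in this post:

import numpy as np

def momentum_update(W, dW, v, beta=0.9, learning_rate=0.01):
    # v is an exponentially weighted average of past gradients
    v = beta * v + (1 - beta) * dW
    W = W - learning_rate * v
    return W, v

def adam_update(W, dW, v, s, t, beta1=0.9, beta2=0.999, epsilon=1e-8, learning_rate=0.01):
    # First moment (Momentum part) and second moment (RMSProp part)
    v = beta1 * v + (1 - beta1) * dW
    s = beta2 * s + (1 - beta2) * (dW ** 2)
    # Bias correction for time step t (t starts at 1)
    v_corrected = v / (1 - beta1 ** t)
    s_corrected = s / (1 - beta2 ** t)
    W = W - learning_rate * v_corrected / (np.sqrt(s_corrected) + epsilon)
    return W, v, s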


1. Mini-batch:

A comparison on a simple neural network:

(Batch) Gradient Descent, where each update uses all m training examples at once:

X = data_input
Y = labels
parameters = initialize_parameters(layers_dims)
for i in range(0, num_iterations):
    # Forward propagation
    a, caches = forward_propagation(X, parameters)
    # Compute cost.
    cost = compute_cost(a, Y)
    # Backward propagation.
    grads = backward_propagation(a, caches, parameters)
    # Update parameters.
    parameters = update_parameters(parameters, grads)

Stochastic Gradient Descent (SGD), where each update uses only a single one of the m training examples:

X = data_input
Y = labels
parameters = initialize_parameters(layers_dims)
for i in range(0, num_iterations):
    for j in range(0, m):
        # Forward propagation on example j (slice with j:j+1 to keep the column shape)
        a, caches = forward_propagation(X[:, j:j+1], parameters)
        # Compute cost on that single example
        cost = compute_cost(a, Y[:, j:j+1])
        # Backward propagation
        grads = backward_propagation(a, caches, parameters)
        # Update parameters.
        parameters = update_parameters(parameters, grads)


Mini-batch Gradient Descent, which sits between the two extremes above: each update uses a small batch of examples.

Building mini-batches involves two steps:

- Shuffling and Partitioning are the two steps required to build mini-batches 
- Powers of two are often chosen to be the mini-batch size, e.g., 16, 32, 64, 128.

import math
import numpy as np

def random_mini_batches(X, Y, mini_batch_size=64, seed=0):
    """
    Creates a list of random minibatches from (X, Y)

    Arguments:
    X -- input data, of shape (input size, number of examples)
    Y -- true "label" vector (1 for blue dot / 0 for red dot), of shape (1, number of examples)
    mini_batch_size -- size of the mini-batches, integer
    seed -- random seed, so the shuffling is reproducible

    Returns:
    mini_batches -- list of synchronous (mini_batch_X, mini_batch_Y)
    """

    np.random.seed(seed)  # to make the "random" minibatches reproducible
    m = X.shape[1]  # number of training examples
    mini_batches = []

    # Step 1: Shuffle (X, Y) -- randomize the order of the m examples
    permutation = list(np.random.permutation(m))
    shuffled_X = X[:, permutation]
    shuffled_Y = Y[:, permutation].reshape((1, m))

    # Step 2: Partition (shuffled_X, shuffled_Y), minus the end case
    num_complete_minibatches = math.floor(m / mini_batch_size)  # number of full mini-batches
    for k in range(0, num_complete_minibatches):
        mini_batch_X = shuffled_X[:, k * mini_batch_size:(k + 1) * mini_batch_size]
        mini_batch_Y = shuffled_Y[:, k * mini_batch_size:(k + 1) * mini_batch_size]
        mini_batches.append((mini_batch_X, mini_batch_Y))

    # Handle the end case (last mini-batch smaller than mini_batch_size)
    if m % mini_batch_size != 0:
        mini_batch_X = shuffled_X[:, num_complete_minibatches * mini_batch_size:]
        mini_batch_Y = shuffled_Y[:, num_complete_minibatches * mini_batch_size:]
        mini_batches.append((mini_batch_X, mini_batch_Y))

    return mini_batches
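
With random_mini_batches in place, a mini-batch version of the training loops above looks roughly like the sketch below. It reuses the same forward_propagation / compute_cost / backward_propagation / update_parameters helpers from the earlier snippets; num_epochs and the per-epoch re-seeding are illustrative choices, not fixed by the function itself:

X = data_input
Y = labels
parameters = initialize_parameters(layers_dims)
seed = 0
for i in range(0, num_epochs):
    # Re-shuffle and re-partition the training set at every epoch
    seed += 1
    minibatches = random_mini_batches(X, Y, mini_batch_size=64, seed=seed)
    for minibatch_X, minibatch_Y in minibatches:
        # Forward propagation on the current mini-batch
        a, caches = forward_propagation(minibatch_X, parameters)
        # Compute cost
        cost = compute_cost(a, minibatch_Y)
        # Backward propagation
        grads = backward_propagation(a, caches, parameters)
        # Update parameters
        parameters = update_parameters(parameters, grads)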