2023.4.11 TensorFlow Study Notes (Basic Concepts and Functions)

Article link: https://blog.youkuaiyun.com/qq_43668547/article/details/130091202

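This article introduces the loss function, a measure of the gap between predicted and true values. It then covers gradient descent, an optimization method for finding the minimum of the loss function. The learning rate plays a key role in gradient descent because it controls how fast the parameters are updated. Backpropagation computes the partial derivatives of the loss function with respect to the network parameters so that the parameters can be updated. Finally, the article gives several examples of creating tensors in TensorFlow.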

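Loss function: the gap between the prediction (y) and the reference answer (y_). The loss gives a quantitative measure of how good the parameters W and b are; the values of W and b at which the loss reaches its minimum are the optimal parameters.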
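
As a minimal sketch of a loss function, the snippet below assumes mean squared error (the notes do not name a specific loss) and a hypothetical one-feature linear model y = w * x + b. The helper names `predict` and `mse_loss` and the sample data are made up for illustration; the point is only that parameters closer to the data give a smaller loss.

```python
def predict(w, b, xs):
    """Hypothetical linear model: y = w * x + b for each input x."""
    return [w * x + b for x in xs]


def mse_loss(y_pred, y_true):
    """Mean squared error between predictions y and reference answers y_."""
    return sum((p - t) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


# Made-up data generated by y_ = 2 * x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys_true = [1.0, 3.0, 5.0, 7.0]

loss_good = mse_loss(predict(2.0, 1.0, xs), ys_true)  # parameters that match the data
loss_bad = mse_loss(predict(1.0, 0.0, xs), ys_true)   # parameters that do not

print("loss with w=2, b=1:", loss_good)
print("loss with w=1, b=0:", loss_bad)
```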

Goal: find a set of parameters w and b that minimizes the loss function.

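Gradient: the vector of partial derivatives of a function with respect to each of its parameters. The gradient points in the direction of fastest increase, so moving against it (the gradient descent direction) is the direction in which the function decreases.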
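
To make "vector of partial derivatives" concrete, here is a small sketch that approximates a gradient with central differences and then takes one step against it. The function `f` and the helper `numerical_gradient` are hypothetical examples, not part of TensorFlow, which computes gradients by automatic differentiation instead.

```python
def numerical_gradient(f, params, eps=1e-6):
    """Approximate the gradient of f at params with central differences."""
    grad = []
    for i in range(len(params)):
        plus = list(params)
        minus = list(params)
        plus[i] += eps
        minus[i] -= eps
        grad.append((f(plus) - f(minus)) / (2 * eps))
    return grad


def f(p):
    """Hypothetical two-parameter function: f(w, b) = w**2 + 3 * b."""
    w, b = p
    return w ** 2 + 3 * b


start = [2.0, 1.0]
grad = numerical_gradient(f, start)
print("gradient at (w=2, b=1):", grad)  # analytically (2 * w, 3)

# Stepping against the gradient should lower the function value.
step = 0.1
moved = [p - step * g for p, g in zip(start, grad)]
print("f before:", f(start), "f after:", f(moved))
```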

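Gradient descent: a method that searches for the minimum of the loss function by moving along the descent direction of its gradient, and so obtains the optimal parameters.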

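Learning rate (lr): when the learning rate is set too small, convergence becomes very slow. When it is set too large, the parameters may oscillate back and forth around the minimum and may even fail to converge.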
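
The effect of the learning rate can be tried out in plain Python. This sketch uses the same loss as the notes' example, loss = (w + 1)^2 with w starting at 5 and 40 update steps; the helper `run_gradient_descent` and the specific learning rates compared are illustrative choices.

```python
def run_gradient_descent(lr, steps=40, w0=5.0):
    """Minimize loss(w) = (w + 1)**2 from w0; return the final w."""
    w = w0
    for _ in range(steps):
        grad = 2 * (w + 1)  # derivative of (w + 1)**2
        w -= lr * grad
    return w


for lr in (0.001, 0.2, 0.999, 1.5):
    print("lr = %-5s -> w after 40 steps = %.4f" % (lr, run_gradient_descent(lr)))
```

With lr = 0.001 the value of w has barely moved after 40 steps, with lr = 0.2 it settles at the minimum w = -1, with lr = 0.999 it jumps from one side of the minimum to the other while shrinking very slowly, and with lr = 1.5 it moves further away on every step.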

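Backpropagation: working from the output layer back to the input layer, compute layer by layer the partial derivatives of the loss function with respect to each layer's parameters, and use them to update all parameters iteratively.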
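
A hand-written sketch of backpropagation, assuming a hypothetical two-layer scalar network h = w1 * x, y = w2 * h with a squared-error loss. The network, the starting values, and the learning rate are invented for illustration; in TensorFlow the backward pass is produced automatically by `tf.GradientTape`.

```python
def forward(w1, w2, x):
    """Forward pass of a hypothetical two-layer scalar network."""
    h = w1 * x   # layer 1 output
    y = w2 * h   # layer 2 output (prediction)
    return h, y


def backward(w1, w2, x, target):
    """Backward pass: apply the chain rule from the loss back to each weight."""
    h, y = forward(w1, w2, x)
    dloss_dy = 2 * (y - target)   # loss = (y - target)**2
    dloss_dw2 = dloss_dy * h      # last layer first
    dloss_dh = dloss_dy * w2      # propagate to the previous layer
    dloss_dw1 = dloss_dh * x
    return dloss_dw1, dloss_dw2


first_grads = backward(0.5, -1.0, 2.0, 3.0)
print("gradients at the starting point:", first_grads)

# Iteratively update both weights with the backpropagated gradients.
w1, w2, x, target, lr = 0.5, -1.0, 2.0, 3.0, 0.05
for step in range(50):
    g1, g2 = backward(w1, w2, x, target)
    w1 -= lr * g1
    w2 -= lr * g2

_, y = forward(w1, w2, x)
print("prediction after training:", y, "target:", target)
```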

import tensorflow as tf

w = tf.Variable(tf.constant(5, dtype=tf.float32))
lr = 0.2
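epochs = 40  # number of update steps

for epoch in range(epochs):  # top-level loop; this example has no dataset, only the single variable w (initialized to 5), which is updated 40 times
    with tf.GradientTape() as tape:  # the with block records the operations needed to compute the gradient
        loss = tf.square(w + 1)
    grads = tape.gradient(loss, w)  # .gradient(target, source): differentiate loss with respect to w

    w.assign_sub(lr * grads)  # .assign_sub decrements the variable in place: w = w - lr * grads
    # w is the value after this update; loss was computed before the update
    print("After %s epoch, w is %f, loss is %f" % (epoch, w.numpy(), loss))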
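
The update in this loop can also be worked out by hand, which shows why w heads toward -1. A short derivation, using loss = (w + 1)^2, lr = 0.2, and the initial value w_0 = 5 from the code:

```latex
\begin{aligned}
\frac{\partial\,\mathrm{loss}}{\partial w} &= 2(w + 1) \\
w_{t+1} &= w_t - lr \cdot 2(w_t + 1) = w_t - 0.4\,(w_t + 1) \\
w_{t+1} + 1 &= 0.6\,(w_t + 1) \\
w_t + 1 &= 6 \cdot 0.6^{\,t} \qquad (w_0 = 5)
\end{aligned}
```

The distance from the minimum shrinks by a factor of 0.6 on every step, so the first update gives w = 2.6 and w approaches -1 as the loop continues.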

Creating a tensor

import tensorflow as tf

a = tf.constant([1, 5], dtype=tf.int64)
print("a:", a)
print("a.dtype:", a.dtype)
print("a.shape:", a.shape)

Converting NumPy data to a Tensor

import tensorflow as tf
import numpy as np

a = np.arange(0, 5)
b = tf.convert_to_tensor(a, dtype=tf.int64)
print("a:", a)
print("b:", b)

Creating tensors filled with a specified value

import tensorflow as tf

a = tf.zeros([2, 3])
b = tf.ones(4)
c = tf.fill([2, 2], 9)
print("a:", a)
print("b:", b)
print("c:", c)

Generate normally distributed random numbers (default mean 0, standard deviation 1) with tf.random.normal, whose first argument is the shape.

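Generate random numbers from a truncated normal distribution, in which values more than two standard deviations from the mean are discarded and drawn again: tf.random.truncated_normal(shape, mean=mean, stddev=stddev)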

import tensorflow as tf

d = tf.random.normal([2, 2], mean=0.5, stddev=1)
print("d:", d)
e = tf.random.truncated_normal([2, 2], mean=0.5, stddev=1)
print("e:", e)
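
What "truncated" means can be sketched in plain Python. The TensorFlow documentation describes tf.random.truncated_normal as re-drawing any sample that falls more than two standard deviations from the mean; the helper `truncated_normal_sample` below is a hypothetical illustration of that rule, not TensorFlow's actual implementation.

```python
import random


def truncated_normal_sample(mean=0.0, stddev=1.0, rng=random):
    """Draw one value, re-drawing until it lies within mean +/- 2 * stddev."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2 * stddev:
            return value


rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [truncated_normal_sample(mean=0.5, stddev=1.0, rng=rng) for _ in range(1000)]
print("min:", min(samples), "max:", max(samples))
```

With mean=0.5 and stddev=1, every sample stays inside the interval [-1.5, 2.5], whereas a plain normal draw can occasionally land outside it.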