Neural Network Parameters and TensorFlow Variables

License: CC 4.0 BY-SA

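This article walks through building a neural network with the TensorFlow framework: declaring variables, random-number and constant generation functions, implementing forward propagation, and optimizing the network parameters. A worked example shows how to train a model, covering data input, loss function definition, and the use of backpropagation.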


Study notes on the book TensorFlow实战Google深度学习框架 (TensorFlow: Google Deep Learning Framework in Practice)

import tensorflow as tf
# Declare a matrix variable in TensorFlow: a 2x3 matrix drawn from a normal distribution with standard deviation 2
weights = tf.Variable(tf.random_normal([2,3],stddev = 2))

TensorFlow random number generation functions

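Function	Distribution	Main parameters
tf.random_normal	Normal distribution	mean, standard deviation, dtype
tf.truncated_normal	Normal distribution, but a value more than two standard deviations from the mean is discarded and re-drawn	mean, standard deviation, dtype
tf.random_uniform	Uniform distribution	minimum value, maximum value, dtype
tf.random_gamma	Gamma distribution	shape parameter alpha, inverse scale (rate) parameter beta, dtype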
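
The re-draw rule of tf.truncated_normal can be illustrated without TensorFlow. The following is a minimal pure-Python sketch using rejection sampling; the helper name truncated_normal and the fixed seed are assumptions made for illustration, and this is not how TensorFlow implements the op internally.

```python
import random

def truncated_normal(mean=0.0, stddev=1.0, rng=random):
    """Draw one sample, re-drawing until it lies within two standard deviations of the mean."""
    while True:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            return value

# Fixed seed, chosen only so that repeated runs give the same samples.
rng = random.Random(0)
samples = [truncated_normal(0.0, 2.0, rng) for _ in range(1000)]

# With mean 0 and stddev 2, every sample falls inside [-4, 4].
print(min(samples), max(samples))
```

Cutting off the tails this way keeps initial weights from starting at extreme values, which is why the truncated variant is often used for weight initialization.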

TensorFlow constant generation functions (similar to NumPy)

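Function	Purpose	Example
tf.zeros	Creates an array of all zeros	tf.zeros([2,3], tf.int32) -> [[0,0,0],[0,0,0]]
tf.ones	Creates an array of all ones	tf.ones([2,3], tf.int32) -> [[1,1,1],[1,1,1]]
tf.fill	Creates an array filled with a given number	tf.fill([2,3],9) -> [[9,9,9],[9,9,9]]
tf.constant	Creates a constant with the given value	tf.constant([1,2,3]) -> [1,2,3]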
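
Since these functions mirror NumPy, the NumPy counterparts can serve as a quick reference. This is a small sketch assuming NumPy is installed; the variable names are illustrative only.

```python
import numpy as np

zeros = np.zeros((2, 3), dtype=np.int32)  # counterpart of tf.zeros([2,3], tf.int32)
ones = np.ones((2, 3), dtype=np.int32)    # counterpart of tf.ones([2,3], tf.int32)
filled = np.full((2, 3), 9)               # counterpart of tf.fill([2,3], 9)
const = np.array([1, 2, 3])               # counterpart of tf.constant([1,2,3])

print(zeros.tolist())
print(ones.tolist())
print(filled.tolist())
print(const.tolist())
```

One difference to keep in mind: the NumPy calls return concrete arrays immediately, while in the graph-mode TensorFlow used in this post the tf.* calls only define tensors that are evaluated inside a session.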

# Biases are usually initialized with a constant; here the initial value is [0,0,0]
biases = tf.Variable(tf.zeros([3]))

# w2 is a variable whose initial value is the same as that of weights
# w3 is a variable whose initial value is twice that of weights
w2 = tf.Variable(weights.initialized_value())
w3 = tf.Variable(weights.initialized_value() * 2.0)

# Implement the network parameters as variables and run forward propagation
# Declare the two variables w1 and w2.
w1 = tf.Variable(tf.random_normal((2,3), stddev = 1, seed = 1))
w2 = tf.Variable(tf.random_normal((3,1), stddev = 1, seed = 1))

# For now, define the input feature vector as a constant. x is a 1x2 matrix.
x = tf.constant([[0.7,0.9]])

# Compute the network output with forward propagation

a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

sess = tf.Session()
sess.run(w1.initializer)  # initialize w1
sess.run(w2.initializer)  # initialize w2
print(sess.run(y))
sess.close()

[[ 3.95757794]]
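
The result above can be checked by hand. The sketch below repeats the two matrix multiplications in plain Python, using the initial values of w1 and w2 that this post prints for seed = 1 before training; the helper matmul is a hypothetical stand-in for tf.matmul, written only for this check.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

# Initial weights copied from the values printed in this post for seed = 1.
w1 = [[-0.81131822, 1.48459876, 0.06532937],
      [-2.4427042, 0.0992484, 0.59122431]]
w2 = [[-0.81131822], [1.48459876], [0.06532937]]

x = [[0.7, 0.9]]    # 1x2 input
a = matmul(x, w1)   # 1x3 hidden layer
y = matmul(a, w2)   # 1x1 output
print(y)            # approximately [[3.9576]]
```

Note that there is no bias term and no activation function at this stage, so the network is just the product of the input with two weight matrices.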

# Initialize all variables at once
sess = tf.Session()
init_op = tf.global_variables_initializer()
sess.run(init_op)
sess.close()

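Variables in TensorFlow

A variable is a special kind of tensor.
All variables in TensorFlow are added to the GraphKeys.GLOBAL_VARIABLES collection (named GraphKeys.VARIABLES in early releases).
The trainable argument marks the parameters that need to be optimized.

When trainable is True, the variable is a parameter to be optimized and is added to the GraphKeys.TRAINABLE_VARIABLES collection.

Shape and dtype are the two most important attributes of a variable.

A variable's dtype cannot be changed once the variable has been constructed.
A variable's shape can be changed, but only by passing validate_shape = False to the assignment.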

w1  = tf.Variable(tf.random_normal([2,3],stddev = 1), name = "w1")
#w2 = tf.Variable(tf.random_normal([2,3],dtype = tf.float64, stddev = 1),name = "w2")
#w1.assign(w2)

# Uncommenting the two lines above makes the program fail with a type mismatch:
# TypeError: Input 'value' of 'Assign' Op has type float64 that does not match type float32 of argument 'ref'.

w1 = tf.Variable(tf.random_normal([2,3], stddev = 1), name = "w1")
w2 = tf.Variable(tf.random_normal([2,2], stddev = 1), name = "w2")
# The next line fails with a shape mismatch:
# ValueError: Dimension 1 in both shapes must be equal, but are 3 and 2 for 'Assign_1' (op: 'Assign') with input shapes: [2,3], [2,2].
# tf.assign(w1,w2)
# This line runs successfully because shape validation is turned off
tf.assign(w1, w2, validate_shape = False)

<tf.Tensor 'Assign_2:0' shape=(2, 2) dtype=float32_ref>

Training a neural network model with TensorFlow

# Implement forward propagation with a placeholder
w1 = tf.Variable(tf.random_normal([2,3], stddev = 1, seed = 1))
w2 = tf.Variable(tf.random_normal([3,1], stddev = 1, seed = 1))

# Define a placeholder to hold the input data. Specifying the shape is optional,
# but when the shape is known, giving it reduces the chance of errors.
x = tf.placeholder(tf.float32, shape = (3,2), name = "input")
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

sess = tf.Session()
init_op = tf.global_variables_initializer()
sess.run(init_op)
# Running without a value for the placeholder raises an error
#print(sess.run(y))

# With shape = (1, 2) on the placeholder, sess.run(y, feed_dict = {x: [[0.7,0.9]]}) prints [[ 3.95757794]]
print(sess.run(y, feed_dict = {x : [[0.7,0.9],[0.1,0.4],[0.5,0.8]]})) 
# Output:
# [[ 3.95757794]
#  [ 1.15376544]
#  [ 3.16749239]]


from numpy.random import RandomState

# Size of each training batch
batch_size = 8

# Define the network parameters.
w1 = tf.Variable(tf.random_normal([2,3], stddev = 1, seed = 1))
w2 = tf.Variable(tf.random_normal([3,1], stddev = 1, seed = 1))

x = tf.placeholder(tf.float32, shape = (None, 2), name = 'x-input')
y_ = tf.placeholder(tf.float32, shape = (None, 1), name = 'y-input')

# Forward propagation of the network

a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

# Map y into (0, 1) with sigmoid: y is the predicted probability of the positive class, 1 - y that of the negative class
y = tf.sigmoid(y)

# Loss function measuring the gap between the predictions and the true labels
cross_entropy = -tf.reduce_mean(y_ * tf.log(tf.clip_by_value(y, 1e-10, 1.0))
                                + (1 - y_) * tf.log(tf.clip_by_value(1 - y, 1e-10, 1.0)))

# Learning rate
learning_rate = 0.001
# Backpropagation: the optimizer that updates the network parameters
train_step = tf.train.AdamOptimizer(learning_rate).minimize(cross_entropy)

    
# Generate a simulated dataset from random numbers
rdm = RandomState(1)
dataset_size = 128
X = rdm.rand(dataset_size, 2)
# Labeling rule: a sample is positive (1) when x1 + x2 < 1, otherwise negative (0)
Y = [[int(x1+x2 < 1)] for (x1,x2) in X]

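# Create a session to run the TensorFlow program
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    # Initialize the variables
    sess.run(init_op)

    # Parameter values before training
    print(sess.run(w1))
    print(sess.run(w2))

    # Number of training steps
    STEPS = 5000
    for i in range(STEPS):
        # Select batch_size samples for this step, cycling through the dataset
        start = (i * batch_size) % dataset_size
        end = min(start+batch_size, dataset_size)

        # Train on the selected batch and update the parameters
        sess.run(train_step, feed_dict = {x: X[start:end],y_: Y[start:end]})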
        
        if i % 1000 == 0:
            # Periodically compute and print the cross entropy over the whole dataset
            total_cross_entropy = sess.run(cross_entropy, feed_dict = {x: X, y_ : Y})
            print("After %d training step(s), cross entropy on all data is %g" %(i, total_cross_entropy))
    # Parameter values after training
    print(sess.run(w1))
    print(sess.run(w2))

[[-0.81131822  1.48459876  0.06532937]
 [-2.4427042   0.0992484   0.59122431]]
[[-0.81131822]
 [ 1.48459876]
 [ 0.06532937]]
After 0 training step(s), cross entropy on all data is 1.89805
After 1000 training step(s), cross entropy on all data is 0.655075
After 2000 training step(s), cross entropy on all data is 0.626172
After 3000 training step(s), cross entropy on all data is 0.615096
After 4000 training step(s), cross entropy on all data is 0.610309
[[ 0.02476984  0.5694868   1.69219422]
 [-2.19773483 -0.23668921  1.11438966]]
[[-0.45544702]
 [ 0.49110931]
 [-0.9811033 ]]
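
Two details of the training script can be checked without TensorFlow: how the batch window cycles through the 128 samples, and why the probabilities are clipped before taking the logarithm. The following pure-Python sketch mirrors the formulas used in the script; the helper names batch_bounds, clip and cross_entropy are assumptions made for illustration and are not TensorFlow APIs.

```python
import math

def batch_bounds(step, batch_size, dataset_size):
    """Return the [start, end) slice of the dataset used at a given training step."""
    start = (step * batch_size) % dataset_size
    end = min(start + batch_size, dataset_size)
    return start, end

def clip(value, low, high):
    """Scalar stand-in for tf.clip_by_value."""
    return max(low, min(high, value))

def cross_entropy(labels, probs):
    """Mean binary cross entropy with the same clipping as the script."""
    total = 0.0
    for label, p in zip(labels, probs):
        total += (label * math.log(clip(p, 1e-10, 1.0))
                  + (1 - label) * math.log(clip(1 - p, 1e-10, 1.0)))
    return -total / len(labels)

# With batch_size = 8 and dataset_size = 128 the window wraps around every 16 steps.
print([batch_bounds(i, 8, 128) for i in (0, 1, 15, 16)])

# A prediction of 0.5 for every sample gives a loss of ln 2.
print(cross_entropy([1, 0], [0.5, 0.5]))

# Without clipping, log(0) would fail; with clipping the loss stays finite.
print(cross_entropy([1], [0.0]))
```

The loss of ln 2 (about 0.693) for a constant prediction of 0.5 is a useful baseline when reading the cross entropy values printed during training.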