TensorFlow: tf.variable_scope and tf.name_scope Explained

This article takes a detailed look at how variables are created in TensorFlow and how the approaches differ, covering tf.placeholder, tf.Variable and tf.get_variable, and compares their behavior around name collisions and variable sharing through a series of experiments. It also shows how to use name_scope and variable_scope to organize and share variables effectively.

1. Why variable_scope and name_scope exist:
Compared with ordinary models, a deep learning model has a huge number of nodes (parameters), and it quickly becomes hard to tell which variable belongs to which layer. name_scope and variable_scope were introduced to solve exactly this problem, and the two carry different responsibilities:
*name_scope*: organizes the variable namespace so the graph is easier to manage.
*variable_scope*: in the vast majority of cases it is used together with tf.get_variable() to implement variable sharing.

2. Experiment 1:
Three ways to create variables: tf.placeholder, tf.Variable, tf.get_variable
(1) Goal: explore the differences between variables defined in these three ways:

import tensorflow as tf
# Let the GPU grow its memory allocation on demand
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

# 1. tf.placeholder
v1 = tf.placeholder(tf.float32, shape=[2, 3, 4])
print(v1.name)
v1 = tf.placeholder(tf.float32, shape=[2, 3, 4], name='ph')
print(v1.name)
v1 = tf.placeholder(tf.float32, shape=[2, 3, 4], name='ph')
print(v1.name)
print(type(v1))
print(v1)

Output:
Placeholder:0
ph:0
ph_1:0
<class 'tensorflow.python.framework.ops.Tensor'>
Tensor("ph_1:0", shape=(2, 3, 4), dtype=float32)
# 2. tf.Variable()
v2 = tf.Variable([1, 2], dtype=tf.float32)
print(v2.name)
v2 = tf.Variable([1, 2], dtype=tf.float32, name='V')
print(v2.name)
v2 = tf.Variable([1, 2], dtype=tf.float32, name='V')
print(v2.name)
print(type(v2))
print(v2)

Output:
Variable:0
V:0
V_1:0
<class 'tensorflow.python.ops.variables.Variable'>
Tensor("V_1/read:0", shape=(2,), dtype=float32)
# 3. tf.get_variable(): a name must be provided when creating the variable
v3 = tf.get_variable(name='gv', shape=[])
print(v3.name)
v4 = tf.get_variable(name='gv', shape=[2])
print(v4.name)

Output:
gv:0
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-29efaac2d76c> in <module>()
      2 v3 = tf.get_variable(name='gv', shape=[])
      3 print(v3.name)
----> 4 v4 = tf.get_variable(name='gv', shape=[2])
      5 print(v4.name)
... (more traceback output omitted here)
ValueError: Variable gv already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:
...

print(type(v3))
print(v3)

Output:
<class 'tensorflow.python.ops.variables.Variable'>
Tensor("gv/read:0", shape=(), dtype=float32)

Next up is tf.trainable_variables(): it returns, as a list, every variable that was defined with trainable=True. We use it to see how the placeholder, Variable and get_variable definitions above differ:

vs = tf.trainable_variables()
print(len(vs))
for v in vs:
    print(v)

Output:
4
Tensor("Variable/read:0", shape=(2,), dtype=float32)
Tensor("V/read:0", shape=(2,), dtype=float32)
Tensor("V_1/read:0", shape=(2,), dtype=float32)
Tensor("gv/read:0", shape=(), dtype=float32)

(2) Conclusions of experiment 1:
The output contains no placeholder: a placeholder only reserves a slot for input data, so of course it is not trainable.
Only variables created with tf.get_variable() run into a naming conflict; the other two silently append a suffix. In practice the three creation methods also have a clear division of labor:

tf.placeholder(): a placeholder for feeding inputs. *trainable=False*
tf.Variable(): the usual way to define an ordinary variable. *trainable can be chosen*
tf.get_variable(): almost always used together with tf.variable_scope(). *trainable can be chosen* (see the sketch below)
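
A minimal sketch of the trainable flag (the variable names below are only illustrative, and a fresh graph is assumed): anything created with trainable=False stays out of tf.trainable_variables(), which is what you typically want for bookkeeping values such as moving averages or step counters.

w = tf.Variable([1.0], name='w')                             # trainable=True by default
m = tf.Variable([0.0], name='moving_mean', trainable=False)  # excluded from training
g = tf.get_variable('g', shape=[1], trainable=False)         # also excluded
print([v.name for v in tf.trainable_variables()])            # contains 'w:0' but neither 'moving_mean:0' nor 'g:0'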

3. Experiment 2

with tf.name_scope('nsc1'):
    v1 = tf.Variable([1], name='v1')
    with tf.variable_scope('vsc1'):
        v2 = tf.Variable([1], name='v2')
        v3 = tf.get_variable(name='v3', shape=[])
print('v1.name: ', v1.name)
print('v2.name: ', v2.name)
print('v3.name: ', v3.name)

Output:
v1.name:  nsc1/v1:0
v2.name:  nsc1/vsc1/v2:0
v3.name:  vsc1/v3:0

with tf.name_scope('nsc1'):
    v4 = tf.Variable([1], name='v4')
print('v4.name: ', v4.name)

Output:
v4.name:  nsc1_1/v4:0

tf.name_scope() has no effect at all on variables created with tf.get_variable().
tf.name_scope() is mainly there to manage the namespace, which keeps the whole model organized. tf.variable_scope(), on the other hand, exists to enable variable sharing, and it works together with tf.get_variable() to accomplish that. To see the difference, the same two-layer network is built below in two ways.

1. First version: defined with tf.Variable():

def my_image_filter():
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    return None

# The first call creates one set of 4 variables.
result1 = my_image_filter()
# The second call creates another set of 4 variables.
result2 = my_image_filter()
# Collect all trainable variables
vs = tf.trainable_variables()
print('There are %d trainable variables in the Graph:' % len(vs))
for v in vs:
    print(v)

Output:
There are 8 trainable variables in the Graph:
Tensor("conv1_weights/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("conv1_biases/read:0", shape=(32,), dtype=float32)
Tensor("conv2_weights/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("conv2_biases/read:0", shape=(32,), dtype=float32)
Tensor("conv1_weights_1/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("conv1_biases_1/read:0", shape=(32,), dtype=float32)
Tensor("conv2_weights_1/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("conv2_biases_1/read:0", shape=(32,), dtype=float32)

2. Second version: defined with tf.get_variable():

# A generic way to define a convolutional layer
def conv_relu(kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape, initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape, initializer=tf.constant_initializer(0.0))
    return None


def my_image_filter():
    # Defining the layers this way is very readable and keeps a clear hierarchy
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu([5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu([5, 5, 32, 32], [32])


with tf.variable_scope("image_filters") as scope:
    # 下面我们两次调用 my_image_filter 函数,但是由于引入了 变量共享机制
    # 可以看到我们只是创建了一遍网络结构。
    result1 = my_image_filter()
    scope.reuse_variables()
    result2 = my_image_filter()


# As shown below, the variables are indeed shared!
vs = tf.trainable_variables()
print('There are %d trainable variables in the Graph:' % len(vs))
for v in vs:
    print(v)

Output:
There are 4 trainable variables in the Graph:
Tensor("image_filters/conv1/weights/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("image_filters/conv1/biases/read:0", shape=(32,), dtype=float32)
Tensor("image_filters/conv2/weights/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("image_filters/conv2/biases/read:0", shape=(32,), dtype=float32)

(2) Conclusions of experiment 2:
First, we need to adopt the Graph way of thinking. In TensorFlow, defining a variable adds a node to the graph. This is different from an ordinary Python program: in a normal function we process the input, return a result, and simply let the local variables go out of scope. In TensorFlow, creating a variable inside a function adds a node to the Graph, and that node stays in the Graph even after the function has returned.
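
A small sketch of that point (the names below are only illustrative): the Python reference disappears when the function returns, but the Variable node it created is still in the default graph, and only resetting the graph removes it.

def add_node():
    # the local Python name goes away after the call,
    # but the Variable node stays in the default graph
    tf.Variable([0.0], name='leftover')

add_node()
print([v.name for v in tf.global_variables() if 'leftover' in v.name])   # ['leftover:0']

tf.reset_default_graph()               # start over with an empty graph
print(len(tf.global_variables()))      # 0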

Ref:
https://blog.youkuaiyun.com/Jerr__y/article/details/70809528
