TensorFlow: tf.variable_scope and tf.name_scope Explained

This article takes a detailed look at how variables are created in TensorFlow and how the approaches differ, covering tf.placeholder, tf.Variable and tf.get_variable, and compares their behavior around name collisions and variable sharing through a series of experiments. It also shows how to use name_scope and variable_scope to organize and share variables effectively.

1. Why variable_scope and name_scope exist:
Compared with ordinary models, a deep learning model has a huge number of nodes (parameters), and it quickly becomes hard to tell which variable belongs to which layer. name_scope and variable_scope were introduced to solve exactly this problem, and the two carry different responsibilities:
*name_scope*: organizes the variable namespace so the graph is easier to manage.
*variable_scope*: in the vast majority of cases it is used together with tf.get_variable() to implement variable sharing.

2. Experiment 1:
Three ways to create variables: tf.placeholder, tf.Variable, tf.get_variable
(1) Goal: explore the differences between variables defined in these three ways:

import tensorflow as tf
# Let the GPU grow its memory allocation on demand
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
sess = tf.Session(config=config)

# 1. tf.placeholder
v1 = tf.placeholder(tf.float32, shape=[2, 3, 4])
print(v1.name)
v1 = tf.placeholder(tf.float32, shape=[2, 3, 4], name='ph')
print(v1.name)
v1 = tf.placeholder(tf.float32, shape=[2, 3, 4], name='ph')
print(v1.name)
print(type(v1))
print(v1)

Output:
Placeholder:0
ph:0
ph_1:0
<class 'tensorflow.python.framework.ops.Tensor'>
Tensor("ph_1:0", shape=(2, 3, 4), dtype=float32)
# 2. tf.Variable()
v2 = tf.Variable([1, 2], dtype=tf.float32)
print(v2.name)
v2 = tf.Variable([1, 2], dtype=tf.float32, name='V')
print(v2.name)
v2 = tf.Variable([1, 2], dtype=tf.float32, name='V')
print(v2.name)
print(type(v2))
print(v2)

Output:
Variable:0
V:0
V_1:0
<class 'tensorflow.python.ops.variables.Variable'>
Tensor("V_1/read:0", shape=(2,), dtype=float32)
# 3. tf.get_variable(): a name must be provided when creating the variable
v3 = tf.get_variable(name='gv', shape=[])
print(v3.name)
v4 = tf.get_variable(name='gv', shape=[2])
print(v4.name)

Output:
gv:0
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-29efaac2d76c> in <module>()
      2 v3 = tf.get_variable(name='gv', shape=[])
      3 print(v3.name)
----> 4 v4 = tf.get_variable(name='gv', shape=[2])
      5 print(v4.name)
... (more traceback output omitted here)
ValueError: Variable gv already exists, disallowed. Did you mean to set reuse=True in VarScope? Originally defined at:
...

print(type(v3))
print(v3)

Output:
<class 'tensorflow.python.ops.variables.Variable'>
Tensor("gv/read:0", shape=(), dtype=float32)

Next up is tf.trainable_variables(): it returns, as a list, every variable that was defined with trainable=True. We use it to see how the placeholder, Variable and get_variable definitions above differ:

vs = tf.trainable_variables()
print(len(vs))
for v in vs:
    print(v)

Output:
4
Tensor("Variable/read:0", shape=(2,), dtype=float32)
Tensor("V/read:0", shape=(2,), dtype=float32)
Tensor("V_1/read:0", shape=(2,), dtype=float32)
Tensor("gv/read:0", shape=(), dtype=float32)

(2) Conclusions of experiment 1:
The output contains no placeholder: a placeholder only reserves a slot for input data, so of course it is not trainable.
Only variables created with tf.get_variable() run into a naming conflict; the other two silently append a suffix. In practice the three creation methods also have a clear division of labor:

tf.placeholder(): a placeholder for feeding inputs. *trainable=False*
tf.Variable(): the usual way to define an ordinary variable. *trainable can be chosen*
tf.get_variable(): almost always used together with tf.variable_scope(). *trainable can be chosen* (see the sketch below)
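
A minimal sketch of the trainable flag (the variable names below are only illustrative, and a fresh graph is assumed): anything created with trainable=False stays out of tf.trainable_variables(), which is what you typically want for bookkeeping values such as moving averages or step counters.

w = tf.Variable([1.0], name='w')                             # trainable=True by default
m = tf.Variable([0.0], name='moving_mean', trainable=False)  # excluded from training
g = tf.get_variable('g', shape=[1], trainable=False)         # also excluded
print([v.name for v in tf.trainable_variables()])            # contains 'w:0' but neither 'moving_mean:0' nor 'g:0'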

3. Experiment 2

with tf.name_scope('nsc1'):
    v1 = tf.Variable([1], name='v1')
    with tf.variable_scope('vsc1'):
        v2 = tf.Variable([1], name='v2')
        v3 = tf.get_variable(name='v3', shape=[])
print('v1.name: ', v1.name)
print('v2.name: ', v2.name)
print('v3.name: ', v3.name)

Output:
v1.name:  nsc1/v1:0
v2.name:  nsc1/vsc1/v2:0
v3.name:  vsc1/v3:0

with tf.name_scope('nsc1'):
    v4 = tf.Variable([1], name='v4')
print('v4.name: ', v4.name)

Output:
v4.name:  nsc1_1/v4:0

tf.name_scope() has no effect at all on variables created with tf.get_variable().
tf.name_scope() is mainly there to manage the namespace, which keeps the whole model organized. tf.variable_scope(), on the other hand, exists to enable variable sharing, and it works together with tf.get_variable() to accomplish that. To see the difference, the same two-layer network is built below in two ways.

1. First version: defined with tf.Variable():

def my_image_filter():
    conv1_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv1_weights")
    conv1_biases = tf.Variable(tf.zeros([32]), name="conv1_biases")
    conv2_weights = tf.Variable(tf.random_normal([5, 5, 32, 32]),
        name="conv2_weights")
    conv2_biases = tf.Variable(tf.zeros([32]), name="conv2_biases")
    return None

# The first call creates one set of 4 variables.
result1 = my_image_filter()
# The second call creates another set of 4 variables.
result2 = my_image_filter()
# Collect all trainable variables
vs = tf.trainable_variables()
print('There are %d trainable variables in the Graph:' % len(vs))
for v in vs:
    print(v)

Output:
There are 8 trainable variables in the Graph:
Tensor("conv1_weights/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("conv1_biases/read:0", shape=(32,), dtype=float32)
Tensor("conv2_weights/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("conv2_biases/read:0", shape=(32,), dtype=float32)
Tensor("conv1_weights_1/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("conv1_biases_1/read:0", shape=(32,), dtype=float32)
Tensor("conv2_weights_1/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("conv2_biases_1/read:0", shape=(32,), dtype=float32)

2. Second version: defined with tf.get_variable():

# A generic way to define a convolutional layer
def conv_relu(kernel_shape, bias_shape):
    # Create variable named "weights".
    weights = tf.get_variable("weights", kernel_shape, initializer=tf.random_normal_initializer())
    # Create variable named "biases".
    biases = tf.get_variable("biases", bias_shape, initializer=tf.constant_initializer(0.0))
    return None


def my_image_filter():
    # Defining the layers this way is very readable and keeps a clear hierarchy
    with tf.variable_scope("conv1"):
        # Variables created here will be named "conv1/weights", "conv1/biases".
        relu1 = conv_relu([5, 5, 32, 32], [32])
    with tf.variable_scope("conv2"):
        # Variables created here will be named "conv2/weights", "conv2/biases".
        return conv_relu([5, 5, 32, 32], [32])


with tf.variable_scope("image_filters") as scope:
    # 下面我们两次调用 my_image_filter 函数,但是由于引入了 变量共享机制
    # 可以看到我们只是创建了一遍网络结构。
    result1 = my_image_filter()
    scope.reuse_variables()
    result2 = my_image_filter()


# As shown below, the variables are indeed shared!
vs = tf.trainable_variables()
print('There are %d trainable variables in the Graph:' % len(vs))
for v in vs:
    print(v)

Output:
There are 4 trainable variables in the Graph:
Tensor("image_filters/conv1/weights/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("image_filters/conv1/biases/read:0", shape=(32,), dtype=float32)
Tensor("image_filters/conv2/weights/read:0", shape=(5, 5, 32, 32), dtype=float32)
Tensor("image_filters/conv2/biases/read:0", shape=(32,), dtype=float32)

(2) Conclusions of experiment 2:
First, we need to adopt the Graph way of thinking. In TensorFlow, defining a variable adds a node to the graph. This is different from an ordinary Python program: in a normal function we process the input, return a result, and simply let the local variables go out of scope. In TensorFlow, creating a variable inside a function adds a node to the Graph, and that node stays in the Graph even after the function has returned.
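
A small sketch of that point (the names below are only illustrative): the Python reference disappears when the function returns, but the Variable node it created is still in the default graph, and only resetting the graph removes it.

def add_node():
    # the local Python name goes away after the call,
    # but the Variable node stays in the default graph
    tf.Variable([0.0], name='leftover')

add_node()
print([v.name for v in tf.global_variables() if 'leftover' in v.name])   # ['leftover:0']

tf.reset_default_graph()               # start over with an empty graph
print(len(tf.global_variables()))      # 0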

Ref:
https://blog.youkuaiyun.com/Jerr__y/article/details/70809528
