Theano Module Basics

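This article introduces the basics of Theano: symbolic variables and functions, shared variables, gradient computation, and DebugMode. It also shows how to configure Theano to run on the CPU or the GPU with the float64 or float32 data type.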

Basics:

Symbolic variables:

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import theano
>>> # By convention, the tensor submodule is imported as T.
... import theano.tensor as T
>>> # The theano.tensor submodule has various primitive variable types.
... # Here, we're defining a scalar (0-d) variable.
... # The argument gives the variable its name.
... foo = T.scalar('foo')
>>> # Now, we can define another variable bar which is just foo squared.
... bar = foo**2
>>> # It will also be a theano variable.
... print(type(bar))
<class 'theano.tensor.var.TensorVariable'>
>>> print(bar.type)
TensorType(float64, scalar)
>>> # Using theano's pp (pretty print) function, we see that
... # bar is defined symbolically as the square of foo
... print(theano.pp(bar))
(foo ** TensorConstant{2})

Functions:

>>> # To compute anything with Theano, we define a function that is called
... # with actual values and returns an actual value.
... # We can't compute anything with the symbolic variables themselves.
... # We need to define a theano function first.
... # The first argument of theano.function defines the inputs to the function.
... # Note that bar relies on foo, so foo is an input to this function.
... # theano.function compiles code that computes bar from a given value of foo.
... f = theano.function([foo], bar)
>>> print(f(3))
9.0
>>>
>>> # Alternatively, in some cases you can use a symbolic variable's eval method.
... # This can be more convenient than defining a function.
... # The eval method takes a dictionary where the keys are theano variables and
... # the values are values for those variables.
... print(bar.eval({foo: 3}))
9.0
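
The idea behind symbolic variables can be sketched without Theano at all. The toy classes below (`Var` and `Pow` are hypothetical names, not part of Theano, and this is not how Theano is implemented) only record the expression `foo ** 2`; no arithmetic happens until `eval` is given a concrete value, which mirrors the `bar.eval({foo: 3})` call above.

```python
class Var:
    """A named placeholder with no value of its own (toy stand-in for T.scalar)."""

    def __init__(self, name):
        self.name = name

    def __pow__(self, exponent):
        # Build a new expression node instead of computing anything.
        return Pow(self, exponent)

    def eval(self, env):
        # Look the value up only when the expression is evaluated.
        return float(env[self.name])

    def __str__(self):
        return self.name


class Pow:
    """Expression node representing base ** exponent."""

    def __init__(self, base, exponent):
        self.base = base
        self.exponent = exponent

    def eval(self, env):
        return self.base.eval(env) ** self.exponent

    def __str__(self):
        return "(%s ** %s)" % (self.base, self.exponent)


foo = Var("foo")
bar = foo ** 2                 # symbolic: nothing is computed yet
description = str(bar)         # textual form of the expression
value = bar.eval({"foo": 3})   # computed only now
print(description, value)
```

Theano additionally compiles and optimizes the expression graph before running it, which this sketch does not attempt.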

>>> # We can also use Python functions to construct Theano variables.
... # It seems pedantic here, but can make syntax cleaner for more complicated examples.
... def square(x):
...     return x**2
...
>>> bar = square(foo)
>>> print(bar.eval({foo: 3}))
9.0

theano.tensor:

>>> # Theano also has variable types for vectors, matrices, and tensors.
... # The theano.tensor submodule has various functions for performing operations on these variables.
... A = T.matrix('A')
>>> x = T.vector('x')
>>> b = T.vector('b')
>>> y = T.dot(A, x) + b  # matrix-vector product plus b: y[i] = sum_j A[i, j] * x[j] + b[i]
>>> # Note that squaring a matrix is element-wise
... z = T.sum(A**2)  # sum of all the squared elements
>>> # theano.function can compute multiple things at a time (pass a list of outputs)
... # You can also set default parameter values
... # (theano.Param is deprecated in later Theano releases in favor of theano.In(b, value=...))
... linear_mix = theano.function([A, x, theano.Param(b, default=np.array([0, 0]))], [y, z])
>>> # We'll cover theano.config.floatX later
... print(linear_mix(np.array([[1, 2, 3],
...                            [4, 5, 6]], dtype=theano.config.floatX),  # A
...                  np.array([1, 2, 3], dtype=theano.config.floatX),  # x
...                  np.array([4, 5], dtype=theano.config.floatX)))  # b
[array([ 18.,  37.]), array(91.0)]
>>> # Using the default value for b
... print(linear_mix(np.array([[1, 2, 3],
...                            [4, 5, 6]]),  # A
...                  np.array([1, 2, 3])))  # x
[array([ 14.,  32.]), array(91.0)]
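
To see where the numbers 18, 37, 14, 32 and 91 come from, the same arithmetic can be written out in plain Python. This is only a hand-check of y = Ax + b and z = sum(A**2); the function below reuses the name `linear_mix` for readability but is an ordinary Python function, not the compiled Theano one.

```python
A = [[1, 2, 3],
     [4, 5, 6]]
x = [1, 2, 3]


def linear_mix(A, x, b=(0, 0)):
    """Return (y, z) with y = A.x + b and z = sum of squared elements of A."""
    # Each output element is the dot product of one row of A with x, plus b[i].
    y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(A, b)]
    # Squaring is element-wise, then everything is summed.
    z = sum(a_ij ** 2 for row in A for a_ij in row)
    return y, z


y_with_b, z = linear_mix(A, x, [4, 5])   # explicit b
y_default, _ = linear_mix(A, x)          # default b = (0, 0)
print(y_with_b, y_default, z)
```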

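Shared variables:

Shared variables are special: they do have an explicit value, which can be read and updated with get_value/set_value and is shared by every function that uses the variable. They also keep their state across function calls.
1. Initialization

>>> shared_var = theano.shared(np.array([[1, 2], [3, 4]], dtype=theano.config.floatX))
>>> # The type of the shared variable is deduced from its initialization
... print(shared_var.type())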
<TensorType(float64, matrix)>

2. set/get

>>> # We can set the value of a shared variable using set_value
... shared_var.set_value(np.array([[3, 4], [2, 1]], dtype=theano.config.floatX))

>>> # ..and get it using get_value
... print(shared_var.get_value())
[[ 3.  4.]
 [ 2.  1.]]

3. Implicit inputs to a function

>>> shared_squared = shared_var**2
>>> # The first argument of theano.function (inputs) tells Theano what the arguments to the compiled function should be.
... # Note that because shared_var is shared, it already has a value, so it doesn't need to be an input to the function.
... # Therefore, Theano implicitly considers shared_var an input to a function using shared_squared and so we don't need
... # to include it in the inputs argument of theano.function.
... function_1 = theano.function([], shared_squared)
>>> print(function_1())
[[  9.  16.]
 [  4.   1.]]

4. Updating shared variables with a function:

>>> # The value of a shared variable can be updated through the updates argument of theano.function.
... subtract = T.matrix('subtract')  # amount to subtract in each update
>>> # updates takes a dict where keys are shared variables and values are the new value the shared variable should take
... # Here, updates will set shared_var = shared_var - subtract
... function_2 = theano.function([subtract], shared_var, updates={shared_var: shared_var - subtract})
>>> print("shared_var before subtracting [[1, 1], [1, 1]] by using function_2:")
shared_var before subtracting [[1, 1], [1, 1]] by using function_2:
>>> print(shared_var.get_value())
[[ 3.  4.]
 [ 2.  1.]]
>>> # Subtract [[1, 1], [1, 1]] from shared_var.
... # The call returns the value from before the update is applied.
... function_2(np.array([[1, 1], [1, 1]]))
array([[ 3.,  4.],
       [ 2.,  1.]])
>>> print("shared_var after calling function_2:")
shared_var after calling function_2:
>>> print(shared_var.get_value())
[[ 2.  3.]
 [ 1.  0.]]
>>> # Note that this also changes the output of function_1, because shared_var is shared!
... print("New output of function_1() (shared_var**2):")
New output of function_1() (shared_var**2):
>>> print(function_1())
[[ 4.  9.]
 [ 1.  0.]]
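
The behavior in the transcript above (the function returns the old value, then the update takes effect, and every function sharing the variable sees the new state) can be imitated with a small plain-Python sketch. The `Shared` class here is a hypothetical stand-in written for illustration; it is not Theano's implementation.

```python
class Shared:
    """Toy container whose state outlives individual function calls."""

    def __init__(self, value):
        self._value = value

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = value


shared_var = Shared([[3.0, 4.0], [2.0, 1.0]])


def function_1():
    # Reads shared_var implicitly; it is not an argument.
    return [[v ** 2 for v in row] for row in shared_var.get_value()]


def function_2(subtract):
    # Compute the output from the current value, then apply the update.
    old = shared_var.get_value()
    new = [[v - s for v, s in zip(row, s_row)]
           for row, s_row in zip(old, subtract)]
    shared_var.set_value(new)
    return old


returned = function_2([[1, 1], [1, 1]])
print(returned)                 # value before the update
print(shared_var.get_value())   # value after the update
print(function_1())             # sees the updated state
```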

Gradients:

Another major convenience of Theano is that it can compute gradients: it calculates the derivative of an expression for you, so you don't have to derive it by hand.

>>> # Recall that bar = foo**2
... # We can compute the gradient of bar with respect to foo like so:
... bar_grad = T.grad(bar, foo)  # derivative of the first argument with respect to the second
>>> # We expect that bar_grad = 2*foo
... bar_grad.eval({foo: 10})
array(20.0)
>>>
>>> # Recall that y = Ax + b
... # We can also compute a Jacobian like so:
... y_J = theano.gradient.jacobian(y, x)
>>> linear_mix_J = theano.function([A, x, b], y_J)
F:\_python\lib\site-packages\theano\tensor\subtensor.py:110: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
  start in [None, 0] or
F:\_python\lib\site-packages\theano\gof\cmodule.py:284: RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility
  rval = __import__(module_name, {}, {}, [module_name])
>>> # Because it's a linear mix, we expect the output to always be A
... print(linear_mix_J(np.array([[9, 8, 7], [4, 5, 6]]),  # A
...                    np.array([1, 2, 3]),  # x
...                    np.array([4, 5])))  # b
[[ 9.  8.  7.]
 [ 4.  5.  6.]]
>>> # We can also compute the Hessian with theano.gradient.hessian (skipping that here)
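
A quick way to sanity-check results like these is a finite-difference approximation. The sketch below is plain Python, not Theano's symbolic differentiation; the helper names `central_diff` and `numerical_jacobian` are made up for this example. It confirms numerically that the derivative of foo**2 at 10 is about 20, and that the Jacobian of y = Ax + b with respect to x is A.

```python
def central_diff(f, x, h=1e-6):
    """Approximate f'(x) for a scalar function with a central difference."""
    return (f(x + h) - f(x - h)) / (2 * h)


def numerical_jacobian(f, x, h=1e-6):
    """Approximate the Jacobian of f: R^n -> R^m at x, one input column at a time."""
    m = len(f(x))
    J = [[0.0] * len(x) for _ in range(m)]
    for j in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[j] += h
        x_minus[j] -= h
        f_plus, f_minus = f(x_plus), f(x_minus)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J


grad_at_10 = central_diff(lambda foo: foo ** 2, 10.0)

A = [[9.0, 8.0, 7.0], [4.0, 5.0, 6.0]]
b = [4.0, 5.0]


def affine(x):
    # y = A x + b, written out with plain lists
    return [sum(a * x_j for a, x_j in zip(row, x)) + b_i
            for row, b_i in zip(A, b)]


J = numerical_jacobian(affine, [1.0, 2.0, 3.0])
print(grad_at_10)
print(J)
```

Finite differences only approximate the derivative and cost one or two function evaluations per input dimension, which is one reason symbolic gradients are convenient.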

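Jacobian matrix: the matrix formed by arranging the first-order partial derivatives of a vector-valued function; when it is square, its determinant is called the Jacobian determinant. It gives the best linear approximation of a differentiable function near a given point.
Hessian matrix: the square matrix of second-order partial derivatives of a scalar-valued function of several variables; it describes the local curvature of the function.
An explanation of the Jacobian and the Hessian by 楚天舒 on Douban:
# Start with the one-dimensional analogy: the Jacobian corresponds to the first derivative and the Hessian to the second derivative. The motivation for the derivative of a one-dimensional function is clear. The extrema of the first derivative occur at zeros of the second derivative. In many applications we care not only about the zeros of the first derivative (the candidate extrema of the function) but also about the extrema of the first derivative; in signal processing, for example, they mark where the signal changes most sharply. Searching for extrema directly is awkward to program, so it is easier to look for the zeros of the second derivative.
# For a scalar function f: R^n -> R, the Jacobian is a vector, namely the gradient of the function. By the Cauchy-Schwarz inequality, the gradient points in the direction in which the directional derivative at that point is largest. In 2-D image processing, the gradient can be used to detect edges in the gray values.
# For a vector field F: R^n -> R^m, each row of the Jacobian is the gradient of one component, and F(X) = F(P) + J(P)(X-P) + o(||X-P||). Each row of this expression is the local linearization of one component.
# Consider a 2-D digital image transformation (homography, image warping); with finite differences in place of derivatives, a similar analysis applies.
# H: pixel (x,y) --> pixel (u,v)
# u = u(x,y), v = v(x,y)
# Its Jacobian is
# [ ∂u/∂x  ∂u/∂y ]
# [ ∂v/∂x  ∂v/∂y ]
# which reflects how strongly the image is deformed locally.
# In the ideal case ∂u/∂x = 1, ∂v/∂y = 1, ∂u/∂y = 0, ∂v/∂x = 0, and the image is unchanged.
# Since du dv = |det(Jacobian(x,y))| dx dy (see the change-of-variables rule for why this holds), the change in the area element is determined by |det(Jacobian(x,y))|.
# [Note:] some books call det(Jacobian(x,y)) itself the Jacobian.
# When |det(Jacobian(x,y))| = 1, area is preserved.
# When |det(Jacobian(x,y))| < 1, area is compressed and pixels are lost.
# When |det(Jacobian(x,y))| > 1, area is expanded and pixel interpolation is needed.
# The eigenvalues or singular values of the Jacobian matrix support a similar interpretation; see the Wielandt-Hoffman theorem.
# The Hessian matrix is defined for scalar functions; for a vector-valued function it becomes a rank-3 tensor.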
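
The three determinant cases above can be checked on a few hand-picked 2x2 Jacobians. The sketch below is plain Python; the matrices and the helper names `det2` and `area_effect` are illustrative assumptions, chosen so that the Jacobian is constant (a linear warp) and can be written down directly.

```python
def det2(J):
    """Determinant of a 2x2 matrix given as nested lists."""
    return J[0][0] * J[1][1] - J[0][1] * J[1][0]


def area_effect(J, tol=1e-12):
    """Classify the local area change from |det(J)|."""
    d = abs(det2(J))
    if abs(d - 1.0) <= tol:
        return "preserved"
    return "compressed" if d < 1.0 else "expanded"


identity = [[1.0, 0.0], [0.0, 1.0]]   # u = x, v = y
shrink = [[0.5, 0.0], [0.0, 0.5]]     # u = x/2, v = y/2
zoom = [[2.0, 0.0], [0.0, 2.0]]       # u = 2x, v = 2y
shear = [[1.0, 1.0], [0.0, 1.0]]      # u = x + y, v = y

for name, J in [("identity", identity), ("shrink", shrink),
                ("zoom", zoom), ("shear", shear)]:
    print(name, abs(det2(J)), area_effect(J))
```

The shear case shows that a determinant of 1 only says area is preserved; the image can still be deformed.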

Debug Mode:

>>> # Because the code Theano actually runs can look very different from the code you wrote, debugging can be hard.
... # By default Theano compiles for speed, but it can also compile in a slower mode that makes debugging easier.
... # A simple division function
... num = T.scalar('num')
>>> den = T.scalar('den')
>>> divide = theano.function([num, den], num/den)
>>> print(divide(10, 2))
5.0
>>> # This will cause a NaN
... print(divide(0, 0))
nan
>>>
>>> # To compile a function in debug mode, just set mode='DebugMode'
... divide = theano.function([num, den], num/den, mode='DebugMode')
>>> # NaNs now cause errors
... print(divide(0, 0))
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "F:\_python\lib\site-packages\theano\compile\function_module.py", line 579, in __call__
    outputs = self.fn()
  File "F:\_python\lib\site-packages\theano\compile\debugmode.py", line 2030, in deco
    return f()
  File "F:\_python\lib\site-packages\theano\compile\debugmode.py", line 1804, in f
    specific_hint=hint2)
theano.compile.debugmode.InvalidValueError: InvalidValueError
        type(variable) = TensorType(float64, scalar)
        variable       = Elemwise{true_div,no_inplace}.0
        type(value)    = <type 'numpy.ndarray'>
        dtype(value)   = float64
        shape(value)   = ()
        value          = nan
        min(value)     = nan
        max(value)     = nan
        isfinite       = False
        client_node    = None
        hint           = perform output
        specific_hint  = non-finite elements not allowed
        context        = ...
  Elemwise{true_div,no_inplace} [@A] ''
   |num [@B]
   |den [@C]
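
The trade-off shown above (a fast mode that silently returns nan versus a checked mode that raises) can be imitated in plain Python with a wrapper that validates outputs. This is only an analogy under stated assumptions: `ieee_divide` and `check_finite` are hypothetical helpers, and Theano's DebugMode performs many more checks than this one.

```python
import math


def ieee_divide(num, den):
    """Divide like NumPy does for floats: 0/0 gives nan instead of raising.

    Simplification: the sign of a zero denominator is ignored.
    """
    if den == 0:
        if num == 0:
            return float("nan")
        return math.copysign(float("inf"), num)
    return num / den


def check_finite(fn):
    """Wrap fn so that a non-finite result raises instead of propagating."""
    def wrapper(*args):
        result = fn(*args)
        if not math.isfinite(result):
            raise ValueError("non-finite output %r for inputs %r" % (result, args))
        return result
    return wrapper


fast_divide = ieee_divide                  # no checks, nan slips through
debug_divide = check_finite(ieee_divide)   # extra check on every call

fast_result = fast_divide(0.0, 0.0)
try:
    debug_divide(0.0, 0.0)
    debug_raised = False
except ValueError:
    debug_raised = True

print(fast_result, debug_raised)
```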

Using the CPU vs. the GPU:

# Theano can transparently compile onto different hardware. What device it uses by default depends on your .theanorc file and any environment variables defined, as described in detail here: http://deeplearning.net/software/theano/library/config.html Currently, you should use float32 when using most GPUs, but most people prefer to use float64 on a CPU. For convenience, Theano provides the floatX configuration variable, which designates what float accuracy to use. For example,
# you can run a Python script with certain environment variables set to use the CPU:
# THEANO_FLAGS=device=cpu,floatX=float64 python your_script.py
# or the GPU:
# THEANO_FLAGS=device=gpu,floatX=float32 python your_script.py

# You can get the values being used to configure Theano like so:
print(theano.config.device)
print(theano.config.floatX)

# You can also get/set them at runtime:
old_floatX = theano.config.floatX
theano.config.floatX = 'float32'

# Be careful that you're actually using floatX!
# For example, the following will cause var to be a float64 regardless of floatX due to numpy defaults:
var = theano.shared(np.array([1.3, 2.4]))
print(var.type())  # float64, not floatX!

# So, whenever you use a numpy array, make sure to set its dtype to theano.config.floatX
var = theano.shared(np.array([1.3, 2.4], dtype=theano.config.floatX))
print(var.type())
# Revert to old value
theano.config.floatX = old_floatX

cpu
float64
<TensorType(float64, vector)>
<TensorType(float32, vector)>
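
What the float32/float64 choice costs in practice can be seen with the standard library alone. The sketch below (the helper `round_to_float32` is a made-up name) rounds the value 1.3 from the example above to single precision and compares storage size and rounding error; it does not involve Theano or a GPU.

```python
import struct


def round_to_float32(value):
    """Round a Python float (float64) to the nearest float32 and back."""
    return struct.unpack("f", struct.pack("f", value))[0]


x64 = 1.3                          # Python floats are float64
x32 = round_to_float32(1.3)        # same number stored as float32

bytes64 = struct.calcsize("d")     # storage per float64 element
bytes32 = struct.calcsize("f")     # storage per float32 element
rounding_error = abs(x32 - x64)    # precision lost by using float32

print(bytes64, bytes32, rounding_error)
```

Halving the storage per element is the practical appeal of float32, at the price of a small rounding error in each stored value.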
