Basics:
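Jacobian matrix: the matrix of all first-order partial derivatives of a vector-valued function, arranged in a fixed order; its determinant is called the Jacobian determinant. It gives the best linear approximation of a differentiable function near a given point.
Hessian matrix: the square matrix of second-order partial derivatives of a scalar-valued function of several variables; it describes the local curvature of the function.
楚天舒's explanation of the Jacobian and the Hessian, posted on Douban:
# Start with the one-dimensional analogy: the Jacobian plays the role of the first derivative and the Hessian that of the second derivative. The motivation for derivatives of a one-dimensional function is clear, and the zeros of the second derivative are the candidate extrema of the first derivative. Many applications care not only about the zeros of the first derivative (the extrema of the function) but also about the extrema of the first derivative; in signal processing, for example, the extrema of a signal's first derivative mark where the signal changes most sharply. Searching for extrema directly is awkward to program, so it is easier to look for zeros of the second derivative.
# For a scalar function f: R^n -> R^1, the Jacobian is really a vector, namely the gradient of f. By the Cauchy-Schwarz inequality, the gradient points in the direction in which the directional derivative at that point is largest. In 2-D image processing, the gradient can be used to detect edges in gray-level intensity.
# For a vector field F: R^n -> R^m, each row of the Jacobian is the gradient of one component, and F(X) = F(P) + J(P)(X-P) + o(||X-P||). Each row of this equation is the local linearization of one component.
# Consider a geometric transformation of a 2-D digital image (homography, image warping); with finite differences in place of derivatives, a similar analysis applies.
# H: pixel (x,y) --> pixel (u,v)
# u=u(x,y) v=v(x,y)
# Its Jacobian (writing u'(x) for the partial derivative of u with respect to x, and so on) is
# [ u'(x) u'(y)]
# [ v'(x) v'(y)]
# It reflects how much the image is deformed locally.
# The ideal case is u'(x)=1, v'(y)=1, u'(y)=0, v'(x)=0, which means the image keeps its shape.
# Since dudv = |det(Jacobian(x,y))| dxdy (see the change-of-variables rule for integrals for why this holds),
# [Note] some books call det(Jacobian(x,y)) itself the Jacobian.
# the change in the area element is determined by |det(Jacobian(x,y))|:
# when |det(Jacobian(x,y))| = 1, area is preserved;
# when |det(Jacobian(x,y))| < 1, area is compressed and pixels are lost;
# when |det(Jacobian(x,y))| > 1, area is expanded and pixels must be interpolated.
# The eigenvalues or singular values of the Jacobian support a similar interpretation; see the Wielandt-Hoffman theorem.
# The Hessian is defined for scalar functions; for a vector-valued function it becomes a rank-3 tensor.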
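As a rough check of the area rule quoted above, the sketch below warps the unit square with a few linear maps and compares the measured area with |det(Jacobian)|. The warp matrices and helper names are made up for illustration; for a linear map the Jacobian is simply the matrix itself, so no differentiation is needed.

```python
# Illustrative sketch: compare |det J| with the measured area of a warped unit square.

def det2(j):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    return j[0][0] * j[1][1] - j[0][1] * j[1][0]

def warp(j, point):
    """Apply the linear map (u, v) = J (x, y) to one point."""
    x, y = point
    return (j[0][0] * x + j[0][1] * y, j[1][0] * x + j[1][1] * y)

def polygon_area(points):
    """Unsigned polygon area via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Hypothetical warps chosen for illustration.
warps = {
    "identity": [[1.0, 0.0], [0.0, 1.0]],
    "shear": [[1.0, 0.5], [0.0, 1.0]],
    "shrink": [[0.5, 0.0], [0.0, 0.5]],
    "enlarge": [[2.0, 0.0], [0.0, 1.5]],
}

warped_area = {}
for name, j in warps.items():
    warped_area[name] = polygon_area([warp(j, p) for p in unit_square])
    print(name, abs(det2(j)), warped_area[name])
```

The shear changes the shape but not the area (|det| = 1), the shrink maps several source pixels onto one (|det| < 1), and the enlargement leaves gaps that need interpolation (|det| > 1).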
Symbolic variables:
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import theano
>>> # By convention, the tensor submodule is imported as T
... import theano.tensor as T
>>> # The theano.tensor submodule has various primitive symbolic variable types.
... # Here, we're defining a scalar (0-d) variable.
... # The argument gives the variable its name.
... foo = T.scalar('foo')
>>> # Now, we can define another variable bar which is just foo squared.
... bar = foo**2
>>> # It will also be a theano variable.
... print(type(bar))
<class 'theano.tensor.var.TensorVariable'>
>>> print(bar.type)
TensorType(float64, scalar)
>>> # Using theano's pp (pretty print) function, we see that
... # bar is defined symbolically as the square of foo
... print(theano.pp(bar))
(foo ** TensorConstant{2})
Functions:
1.
>>> # To compute anything with Theano, we need to define a function that can be called with actual values and returns an actual value.
... # We can't compute anything with the Theano objects themselves.
... # We need to define a theano function first.
... # The first argument of theano.function defines the inputs to the function.
... # Note that bar relies on foo, so foo is an input to this function.
... # theano.function will compile code that computes bar from a given value of foo.
... f = theano.function([foo], bar)
>>> print(f(3))
9.0
2.
>>> # Alternatively, in some cases you can use a symbolic variable's eval method.
... # This can be more convenient than defining a function.
... # The eval method takes a dictionary where the keys are theano variables and the values are values for those variables.
... print(bar.eval({foo: 3}))
9.0
3.
>>> # We can also use Python functions to construct Theano variables.
... # It seems pedantic here, but it can make the syntax cleaner for more complicated examples.
... def square(x):
...     return x**2
...
>>> bar = square(foo)
>>> print(bar.eval({foo: 3}))
9.0
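To make "symbolic" more concrete, here is a toy expression graph in plain Python. It is only a sketch of the idea: Var, Pow, evaluate and make_function are hypothetical names invented here, not Theano's classes or API, and nothing is compiled.

```python
# Toy illustration only: these classes are hypothetical and are not Theano's API.

class Expr(object):
    """Base class for nodes of a tiny symbolic expression graph."""

    def __pow__(self, exponent):
        # Build a new graph node instead of computing a number.
        return Pow(self, exponent)


class Var(Expr):
    """A named placeholder with no value of its own."""

    def __init__(self, name):
        self.name = name

    def evaluate(self, values):
        return values[self]

    def __str__(self):
        return self.name


class Pow(Expr):
    """Records 'base ** exponent' without computing anything."""

    def __init__(self, base, exponent):
        self.base = base
        self.exponent = exponent

    def evaluate(self, values):
        return self.base.evaluate(values) ** self.exponent

    def __str__(self):
        return "(%s ** %s)" % (self.base, self.exponent)


def make_function(inputs, output):
    """Return a callable that binds positional arguments to the input variables."""
    def compiled(*args):
        return output.evaluate(dict(zip(inputs, args)))
    return compiled


def square(x):
    return x ** 2


foo = Var("foo")
bar = square(foo)                 # builds a graph node, no arithmetic yet
f = make_function([foo], bar)

print(bar)                        # (foo ** 2)
print(bar.evaluate({foo: 3.0}))   # 9.0
print(f(3.0))                     # 9.0
```

The same three ways of getting a number out (a compiled function, a dictionary-based evaluation, and a plain Python helper that builds the graph) correspond to items 1 to 3 above.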
theano.tensor:
>>> # Theano also has variable types for vectors, matrices, and tensors. The theano.tensor submodule has various functions for performing operations on these variables.
... A = T.matrix('A')
>>> x = T.vector('x')
>>> b = T.vector('b')
>>> y = T.dot(A, x) + b  # matrix-vector product (each row of A dotted with x), plus b
>>> # Note that squaring a matrix is element-wise
... z = T.sum(A**2)  # sum over all elements
>>> # theano.function can compute multiple things at a time: pass a list of Theano variables as outputs
... # You can also set default parameter values
... # (theano.Param comes from older Theano releases; newer ones use theano.In(b, value=...) instead)
... linear_mix = theano.function([A, x, theano.Param(b, default=np.array([0, 0]))], [y, z])
>>> # We'll cover theano.config.floatX later
... print(linear_mix(np.array([[1, 2, 3],
...                            [4, 5, 6]], dtype=theano.config.floatX),  # A
...                  np.array([1, 2, 3], dtype=theano.config.floatX),  # x
...                  np.array([4, 5], dtype=theano.config.floatX)))  # b
[array([ 18.,  37.]), array(91.0)]
>>> # Using the default value for b
... print(linear_mix(np.array([[1, 2, 3],
...                            [4, 5, 6]]),  # A
...                  np.array([1, 2, 3])))  # x
[array([ 14.,  32.]), array(91.0)]
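The numbers printed above can be checked by hand. The following plain-Python sketch recomputes y = A.x + b and z = sum(A**2) with nested lists; it mirrors the transcript's names but does not use Theano or NumPy.

```python
def linear_mix(A, x, b=(0.0, 0.0)):
    """Return (y, z) with y = A.x + b and z = sum of the squared entries of A."""
    y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) + b_i
         for row, b_i in zip(A, b)]
    z = sum(a_ij ** 2 for row in A for a_ij in row)
    return y, z


A = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
x = [1.0, 2.0, 3.0]

y, z = linear_mix(A, x, b=[4.0, 5.0])
y_default, z_default = linear_mix(A, x)   # b falls back to (0, 0)

print(y, z)                   # [18.0, 37.0] 91.0
print(y_default, z_default)   # [14.0, 32.0] 91.0
```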
Shared variables:
1. Initialization
>>> shared_var = theano.shared(np.array([[1, 2], [3, 4]], dtype=theano.config.floatX))
>>> # The type of the shared variable is deduced from its initialization
... print(shared_var.type())
<TensorType(float64, matrix)>
2. set/get
>>> # We can set the value of a shared variable using set_value
... shared_var.set_value(np.array([[3, 4], [2, 1]], dtype=theano.config.floatX))
>>> # ..and get it using get_value
... print(shared_var.get_value())
[[ 3.  4.]
 [ 2.  1.]]
3. Hidden inside a function
>>> shared_squared = shared_var**2
>>> # The first argument of theano.function (inputs) tells Theano what the arguments to the compiled function should be.
... # Note that because shared_var is shared, it already has a value, so it doesn't need to be an input to the function.
... # Therefore, Theano implicitly considers shared_var an input to a function using shared_squared and so we don't need
... # to include it in the inputs argument of theano.function.
... function_1 = theano.function([], shared_squared)
>>> print(function_1())
[[  9.  16.]
 [  4.   1.]]
4. Updating shared variables with a function:
>>> # The value of a shared variable can be updated through the updates argument of theano.function.
... subtract = T.matrix('subtract')  # the amount to subtract on each call
>>> # updates takes a dict where keys are shared variables and values are the new value the shared variable should take
... # Here, updates will set shared_var = shared_var - subtract
... function_2 = theano.function([subtract], shared_var, updates={shared_var: shared_var - subtract})
>>> print("shared_var before subtracting [[1, 1], [1, 1]] by using function_2:")
shared_var before subtracting [[1, 1], [1, 1]] by using function_2:
>>> print(shared_var.get_value())
[[ 3.  4.]
 [ 2.  1.]]
>>> # Subtract [[1, 1], [1, 1]] from shared_var; the call returns the value from before the update
... function_2(np.array([[1, 1], [1, 1]]))
array([[ 3.,  4.],
       [ 2.,  1.]])
>>> print("shared_var after calling function_2:")
shared_var after calling function_2:
>>> print(shared_var.get_value())
[[ 2.  3.]
 [ 1.  0.]]
>>> # Note that this also changes the output of function_1, because shared_var is shared!
... print("New output of function_1() (shared_var**2):")
New output of function_1() (shared_var**2):
>>> print(function_1())
[[ 4.  9.]
 [ 1.  0.]]
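The order of events in the transcript (the output is computed from the old value, then the update is applied, and every function sharing the variable sees the new state) can be mimicked in plain Python. This is only a sketch under that assumption: Shared and make_function are hypothetical names, not Theano's implementation.

```python
# Toy illustration only: Shared and make_function are hypothetical, not Theano's API.

class Shared(object):
    """A mutable container visible to every function that refers to it."""

    def __init__(self, value):
        self.value = value


def make_function(output, updates=None):
    """Build a callable that evaluates output(*args), then applies updates.

    updates maps a Shared object to a rule that computes its new value
    from the call arguments.
    """
    updates = updates or {}

    def compiled(*args):
        result = output(*args)
        # Compute every new value from the old state before assigning any of them.
        new_values = dict((s, rule(*args)) for s, rule in updates.items())
        for s, new_value in new_values.items():
            s.value = new_value
        return result

    return compiled


shared_var = Shared([[3.0, 4.0], [2.0, 1.0]])


def squared():
    return [[v ** 2 for v in row] for row in shared_var.value]


def current(subtract):
    return [row[:] for row in shared_var.value]


def minus(subtract):
    return [[v - s for v, s in zip(row, sub_row)]
            for row, sub_row in zip(shared_var.value, subtract)]


function_1 = make_function(squared)
function_2 = make_function(current, updates={shared_var: minus})

returned = function_2([[1.0, 1.0], [1.0, 1.0]])
print(returned)           # the value from before the update
print(shared_var.value)   # the value after the update
print(function_1())       # squares of the updated value
```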
Gradients:
>>> # Recall that bar = foo**2
... # We can compute the gradient of bar with respect to foo like so:
... bar_grad = T.grad(bar, foo)  # derivative of the first argument with respect to the second
>>> # We expect that bar_grad = 2*foo
... bar_grad.eval({foo: 10})
array(20.0)
>>>
>>> # Recall that y = Ax + b
... # We can also compute a Jacobian like so:
... y_J = theano.gradient.jacobian(y, x)
>>> linear_mix_J = theano.function([A, x, b], y_J)
F:\_python\lib\site-packages\theano\tensor\subtensor.py:110: FutureWarning: comparison to `None` will result in an elementwise object comparison in the future.
  start in [None, 0] or
F:\_python\lib\site-packages\theano\gof\cmodule.py:284: RuntimeWarning: numpy.ndarray size changed, may indicate binary incompatibility
  rval = __import__(module_name, {}, {}, [module_name])
>>> # Because it's a linear mix, we expect the output to always be A
... print(linear_mix_J(np.array([[9, 8, 7], [4, 5, 6]]),  # A
...                    np.array([1, 2, 3]),  # x
...                    np.array([4, 5])))  # b
[[ 9.  8.  7.]
 [ 4.  5.  6.]]
>>> # We can also compute the Hessian with theano.gradient.hessian (skipping that here)
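Both results above can be sanity-checked without symbolic differentiation. The sketch below approximates derivatives with central finite differences in plain Python; numerical_jacobian is a helper written for this note, and the step size is an arbitrary choice.

```python
def numerical_jacobian(func, point, step=1e-6):
    """Approximate the Jacobian of func at point by central differences.

    func maps a list of n floats to a list of m floats; entry [i][j] of the
    result approximates the partial derivative of output i w.r.t. input j.
    """
    m = len(func(point))
    jac = [[0.0] * len(point) for _ in range(m)]
    for j in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[j] += step
        backward[j] -= step
        f_plus, f_minus = func(forward), func(backward)
        for i in range(m):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2.0 * step)
    return jac


def bar(v):
    """bar = foo**2, written as a one-input, one-output function."""
    return [v[0] ** 2]


A = [[9.0, 8.0, 7.0], [4.0, 5.0, 6.0]]
b = [4.0, 5.0]


def y(x):
    """y = A.x + b for the fixed A and b above."""
    return [sum(a * xi for a, xi in zip(row, x)) + bi for row, bi in zip(A, b)]


bar_grad = numerical_jacobian(bar, [10.0])[0][0]
y_J = numerical_jacobian(y, [1.0, 2.0, 3.0])

print(bar_grad)   # close to 2 * 10 = 20
print(y_J)        # close to A, independent of x and b
```

Finite differences only approximate the derivative and cost one pair of function evaluations per input, which is why symbolic gradients are preferred for larger models.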
Debug Mode:
>>> # Because the code Theano actually runs can be very different from the code you wrote, debugging can be hard. By default Theano compiles code to run as fast as possible, but it can also compile in a way that sacrifices speed to make debugging easier.
... # A simple division function
... num = T.scalar('num')
>>> den = T.scalar('den')
>>> divide = theano.function([num, den], num/den)
>>> print(divide(10, 2))
5.0
>>> # This will cause a NaN
... print(divide(0, 0))
nan
>>>
>>> # To compile a function in debug mode, just set mode='DebugMode'
... divide = theano.function([num, den], num/den, mode='DebugMode')
>>> # NaNs now cause errors
... print(divide(0, 0))
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "F:\_python\lib\site-packages\theano\compile\function_module.py", line 579, in __call__
    outputs = self.fn()
  File "F:\_python\lib\site-packages\theano\compile\debugmode.py", line 2030, in deco
    return f()
  File "F:\_python\lib\site-packages\theano\compile\debugmode.py", line 1804, in f
    specific_hint=hint2)
theano.compile.debugmode.InvalidValueError: InvalidValueError
    type(variable) = TensorType(float64, scalar)
    variable = Elemwise{true_div,no_inplace}.0
    type(value) = <type 'numpy.ndarray'>
    dtype(value) = float64
    shape(value) = ()
    value = nan
    min(value) = nan
    max(value) = nan
    isfinite = False
    client_node = None
    hint = perform output
    specific_hint = non-finite elements not allowed
    context = ...
  Elemwise{true_div,no_inplace} [@A] ''
   |num [@B]
   |den [@C]
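The difference between the two runs (a NaN that passes through silently versus a NaN that raises) can be imitated with a small wrapper. This is a loose analogy in plain Python, not how DebugMode is implemented; ieee_divide and checked are hypothetical helpers.

```python
import math


def ieee_divide(num, den):
    """Float division that returns nan or inf instead of raising on a zero denominator.

    Simplification: the sign of a negative-zero denominator is ignored.
    """
    num, den = float(num), float(den)
    try:
        return num / den
    except ZeroDivisionError:
        if num == 0.0 or math.isnan(num):
            return float("nan")
        return math.copysign(float("inf"), num)


def checked(func):
    """Wrap func so that a non-finite result raises instead of propagating."""
    def wrapper(*args):
        value = func(*args)
        if math.isnan(value) or math.isinf(value):
            raise ValueError("non-finite result %r from %s%r"
                             % (value, func.__name__, args))
        return value
    return wrapper


fast_divide = ieee_divide            # fast path: no checks
debug_divide = checked(ieee_divide)  # slower path: validates every result

print(fast_divide(10, 2))
silent = fast_divide(0, 0)           # nan, nothing complains
print(silent)

try:
    debug_divide(0, 0)
    raised = False
except ValueError as err:
    raised = True
    print(err)
```

The check costs extra work on every call, which mirrors the speed-for-safety trade described in the comment at the top of this section.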
Using the CPU vs. the GPU:
# Theano can transparently compile onto different hardware. Which device it uses by default depends on your .theanorc file and any environment variables defined, as described in detail here: http://deeplearning.net/software/theano/library/config.html
# Currently, you should use float32 when using most GPUs, but most people prefer to use float64 on a CPU.
# For convenience, Theano provides the floatX configuration variable, which designates what float accuracy to use.
# For example, you can run a Python script with certain environment variables set to use the CPU:
# THEANO_FLAGS=device=cpu,floatX=float64 python your_script.py
# or the GPU:
# THEANO_FLAGS=device=gpu,floatX=float32 python your_script.py
# You can get the values being used to configure Theano like so:
print(theano.config.device)
print(theano.config.floatX)
# You can also get/set them at runtime:
old_floatX = theano.config.floatX
theano.config.floatX = 'float32'
# Be careful that you're actually using floatX!
# For example, the following will cause var to be a float64 regardless of floatX due to numpy defaults:
var = theano.shared(np.array([1.3, 2.4]))
print(var.type())  # float64, even though floatX is now float32
# So, whenever you use a numpy array, make sure to set its dtype to theano.config.floatX
var = theano.shared(np.array([1.3, 2.4], dtype=theano.config.floatX))
print(var.type())
# Revert to old value
theano.config.floatX = old_floatX
cpu
float64
<TensorType(float64, vector)>
<TensorType(float32, vector)>
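What "float accuracy" means in practice can be seen without Theano. The sketch below uses the standard struct module to round Python floats (which are float64) to float32 and measures the rounding error for the two values used above; to_float32 is a helper written for this note.

```python
import struct


def to_float32(value):
    """Round a Python float (a float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", value))[0]


values = [1.3, 2.4]
as_float32 = [to_float32(v) for v in values]
errors = [abs(a - v) for a, v in zip(as_float32, values)]

print(as_float32)   # no longer exactly 1.3 and 2.4
print(errors)       # small but non-zero rounding errors
```

Values that are exactly representable in binary, such as 0.5, survive the round trip unchanged; the loss only affects values that need more than float32's 24 bits of significand.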
====================================================================================================================================