This is a record of implementing gradients for custom TensorFlow ops in Python. It assumes that you have already written and debugged the TensorFlow ops themselves before you start writing this code.
To make automatic differentiation work for new ops, you must register a gradient function that computes gradients with respect to the op's inputs, given gradients with respect to the op's outputs. The official TensorFlow documentation gives a very simple example, the "ZeroOut" op. This op sets every entry of the input tensor to zero except the first one, so the gradient with respect to the input is a sparse "one-hot" tensor. The code is shown as follows:
from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import sparse_ops
@ops.RegisterGradient("ZeroOut")
def _zero_out_grad(op, grad):
  """The gradients for `zero_out`.

  Args:
    op: The `zero_out` `Operation` that we are differentiating, which we can use
      to find the inputs and outputs of the original op.
    grad: Gradient with respect to the output of the `zero_out` op.

  Returns:
    Gradients with respect to the input of `zero_out`.
  """
  to_zero = op.inputs[0]
  shape = array_ops.shape(to_zero)
  index = array_ops.zeros_like(shape)
  first_grad = array_ops.reshape(grad, [-1])[0]
  to_zero_grad = sparse_ops.sparse_to_dense([index], shape, first_grad, 0)
  return [to_zero_grad]  # List of one Tensor, since we have one input
Now we explain the code. The two arguments of this function provide access to the inputs and outputs of the "ZeroOut" op, as well as to the gradient with respect to its output. Thus, according to the chain rule (writing $y = \mathrm{ZeroOut}(x)$ and $L$ for the loss):
$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y}\cdot\frac{\partial y}{\partial x},$$
we can compute the gradient of the loss function with respect to the inputs of the current op. In this case, only the first entry of the gradient with respect to the output of "ZeroOut" survives: it is placed at the first position of the input gradient, and every other entry is zero.
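To check that the registration works, one can build a tiny graph on top of the op and let tf.gradients apply the chain rule. The snippet below is a minimal sketch, not part of the official example: the library path './zero_out.so' is a placeholder for your own build output, and it assumes the op was compiled for float inputs (the documentation's version operates on int32).

import tensorflow as tf

# Hypothetical path; point this at your compiled kernel library.
_zero_out_module = tf.load_op_library('./zero_out.so')
zero_out = _zero_out_module.zero_out

x = tf.constant([1.0, 2.0, 3.0, 4.0])
y = zero_out(x)
loss = tf.reduce_sum(2.0 * y)

# tf.gradients looks up the registered _zero_out_grad and applies the chain rule.
dx = tf.gradients(loss, x)[0]

with tf.Session() as sess:
    print(sess.run(dx))  # expected: [2. 0. 0. 0.]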
Here we show another case, which defines projection and backprojection ops for medical image reconstruction (CT, MRI). Common deep learning frameworks usually do not provide such domain-specific operations. I have experience writing these ops, although not for deep learning: we used TensorFlow as a computation engine to run distributed reconstruction tasks. Now we go further and embed the reconstruction process into a deep learning network.
A typical medical image reconstruction process involves many rounds of projection and backprojection. In the projection step, we project the image X onto the detector with some algorithm (commonly a ray-tracing method). In the backprojection step, we distribute the values of the projection data Y back into the image voxels with a corresponding algorithm (e.g., the voxel-driven method). Mathematically, the two operations can be denoted as
$$Y = AX, \qquad \tilde{X} = A^{T}Y,$$
where A and A^T denote the projection and backprojection operators, respectively. Since both operators are linear, it can easily be observed that projection and backprojection can be used to compute each other's gradients.
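Before looking at the code, it helps to make that observation explicit. The two identities below are a direct specialization of the chain rule from above to the linear operators just defined ($L$ again denotes the scalar loss):

$$\frac{\partial L}{\partial X} = A^{T}\,\frac{\partial L}{\partial Y}, \qquad \frac{\partial L}{\partial Y} = A\,\frac{\partial L}{\partial \tilde{X}},$$

i.e., the gradient of the projection op is a backprojection of the incoming gradient, and the gradient of the backprojection op is a projection of the incoming gradient. We quote the Deep Learning Cone Beam CT code for example: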
import tensorflow as tf
from tensorflow.python.framework import ops
import os
import math
import numpy as np
import util.numerical as nm
import util.types as t
from inout import dennerlein
import sys
_path = os.path.dirname(os.path.abspath(__file__))
_bp_module = tf.load_op_library( _path + '/../../lib/libbackproject.so' )
backproject = _bp_module.backproject
project = _bp_module.project
'''
Compute the gradient of the backprojection op
by invoking the forward projector.
'''
@ops.RegisterGradient( "Backproject" )
def _backproject_grad( op, grad ):
    proj = project(
        volume = grad,
        geom = op.get_attr( "geom" ),
        vol_shape = op.get_attr( "vol_shape" ),
        vol_origin = op.get_attr( "vol_origin" ),
        voxel_dimen = op.get_attr( "voxel_dimen" ),
        proj_shape = op.get_attr( "proj_shape" )
    )
    return [ proj ]
'''
Compute the gradient of the forward projection op
by invoking the backprojector.
'''
@ops.RegisterGradient( "Project" )
def _project_grad( op, grad ):
    vol = backproject(
        proj = grad,
        geom = op.get_attr( "geom" ),
        vol_shape = op.get_attr( "vol_shape" ),
        vol_origin = op.get_attr( "vol_origin" ),
        voxel_dimen = op.get_attr( "voxel_dimen" ),
        proj_shape = op.get_attr( "proj_shape" )
    )
    return [ vol ]
In these cases, a Project op and a Backproject op were defined in a TensorFlow module and loaded with the "tf.load_op_library()" function. Then, for the Backproject op, we call the Project op to compute its gradient. Similarly, we use the Backproject op to compute the gradient of the Project op. op.get_attr(), op.inputs[i], and op.outputs[i] are commonly used to obtain the information needed in the gradient computation.
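With the gradients registered, the two ops can sit inside an ordinary TensorFlow graph, and backpropagation flows through them automatically. The sketch below is illustrative only: all geometry values are hypothetical placeholders, load_geometry() is a hypothetical helper, and the attribute names are taken from the quoted code rather than verified against the library.

import tensorflow as tf

# All geometry values below are hypothetical placeholders; real values are
# dictated by the scanner setup and the attribute format of the library.
vol_shape   = [128, 128, 128]
vol_origin  = [-63.5, -63.5, -63.5]
voxel_dimen = [1.0, 1.0, 1.0]
proj_shape  = [360, 512, 512]
geom        = load_geometry()  # hypothetical helper returning the projection geometry

volume   = tf.Variable(tf.zeros(vol_shape))        # image to be reconstructed
measured = tf.placeholder(tf.float32, proj_shape)  # acquired projection data

# Forward model: project the current volume estimate onto the detector.
predicted = project(volume=volume, geom=geom, vol_shape=vol_shape,
                    vol_origin=vol_origin, voxel_dimen=voxel_dimen,
                    proj_shape=proj_shape)

loss = tf.reduce_mean(tf.squared_difference(predicted, measured))

# Because _project_grad is registered, minimize() backpropagates through the
# custom op by backprojecting the residual; no manual adjoint code is needed.
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)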