This is a record of implementing gradients for custom TensorFlow ops in Python. It assumes that you have already written and debugged the TensorFlow ops themselves before you start writing this code.
To make automatic differentiation work for new ops, you must register a gradient function that computes gradients with respect to the op's inputs, given gradients with respect to the op's outputs. The official TensorFlow documentation gives a very simple example, the "ZeroOut" op. This op sets every entry of the input tensor to zero except the first one, so the gradient with respect to the input is a sparse "one-hot" tensor. The code is shown as follows:
from tensorflow.python.framework import ops
from tensorflow.python.ops import array_ops
from tensorflow.python.ops import sparse_ops
@ops.RegisterGradient("ZeroOut")
def _zero_out_grad(op, grad):
  """The gradients for `zero_out`.

  Args:
    op: The `zero_out` `Operation` that we are differentiating, which we can use
      to find the inputs and outputs of the original op.
    grad: Gradient with respect to the output of the `zero_out` op.

  Returns:
    Gradients with respect to the input of `zero_out`.
  """
  to_zero = op.inputs[0]
  shape = array_ops.shape(to_zero)
  index = array_ops.zeros_like(shape)
  first_grad = array_ops.reshape(grad, [-1])[0]
  to_zero_grad = sparse_ops.sparse_to_dense([index], shape, first_grad, 0)
  return [to_zero_grad]  # List of one Tensor, since we have one input
Now we explain the code. The two arguments of this function provide access to the inputs and outputs of the "ZeroOut" op, as well as to the gradient with respect to its output. Thus, according to the chain rule (writing $y = \mathrm{ZeroOut}(x)$ and $L$ for the loss):
$$\frac{\partial L}{\partial x} = \frac{\partial L}{\partial y}\cdot\frac{\partial y}{\partial x},$$
we can compute the gradient of the loss function with respect to the inputs of the current op. In this case, only the first entry of the gradient with respect to the output of "ZeroOut" survives: it is placed at the first position of the input gradient, and every other entry is zero.
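To check that the registration works, one can build a tiny graph on top of the op and let tf.gradients apply the chain rule. The snippet below is a minimal sketch, not part of the official example: the library path './zero_out.so' is a placeholder for your own build output, and it assumes the op was compiled for float inputs (the documentation's version operates on int32).

import tensorflow as tf

# Hypothetical path; point this at your compiled kernel library.
_zero_out_module = tf.load_op_library('./zero_out.so')
zero_out = _zero_out_module.zero_out

x = tf.constant([1.0, 2.0, 3.0, 4.0])
y = zero_out(x)
loss = tf.reduce_sum(2.0 * y)

# tf.gradients looks up the registered _zero_out_grad and applies the chain rule.
dx = tf.gradients(loss, x)[0]

with tf.Session() as sess:
    print(sess.run(dx))  # expected: [2. 0. 0. 0.]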
Here we show another case, which defines projection and backprojection ops for medical image reconstruction (CT, MRI). Common deep learning frameworks usually do not provide such domain-specific operations. I have experience writing these ops, although not for deep learning: we used TensorFlow as a computation engine to run distributed reconstruction tasks. Now we go further and embed the reconstruction process into a deep learning network.
A typical medical image reconstruction process involves many rounds of projection and backprojection. In the projection step, we project the image X onto the detector with some algorithm (commonly a ray-tracing method). In the backprojection step, we distribute the values of the projection data Y back into the image voxels with a corresponding algorithm (e.g., the voxel-driven method). Mathematically, the two operations can be denoted as
$$Y = AX, \qquad \tilde{X} = A^{T}Y,$$
where A and A^T denote the projection and backprojection operators, respectively. Since both operators are linear, it can easily be observed that projection and backprojection can be used to compute each other's gradients.
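Before looking at the code, it helps to make that observation explicit. The two identities below are a direct specialization of the chain rule from above to the linear operators just defined ($L$ again denotes the scalar loss):

$$\frac{\partial L}{\partial X} = A^{T}\,\frac{\partial L}{\partial Y}, \qquad \frac{\partial L}{\partial Y} = A\,\frac{\partial L}{\partial \tilde{X}},$$

i.e., the gradient of the projection op is a backprojection of the incoming gradient, and the gradient of the backprojection op is a projection of the incoming gradient. We quote the Deep Learning Cone Beam CT code for example: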
import tensorflow as tf
from tensorflow.python.framework import ops
import os
import math
import numpy as np
import util.numerical as nm
import util.types as t
from inout import dennerlein
import sys
_path = os.path.dirname(os.path.abspath(__file__))
_bp_module = tf.load_op_library( _path + '/../../lib/libbackproject.so' )
backproject = _bp_module.backproject
project = _bp_module.project
'''
Compute the gradient of the backprojection op
by invoking the forward projector.
'''
@ops.RegisterGradient( "Backproject" )
def _backproject_grad( op, grad ):
    proj = project(
        volume = grad,
        geom = op.get_attr( "geom" ),
        vol_shape = op.get_attr( "vol_shape" ),
        vol_origin = op.get_attr( "vol_origin" ),
        voxel_dimen = op.get_attr( "voxel_dimen" ),
        proj_shape = op.get_attr( "proj_shape" )
    )
    return [ proj ]
'''
Compute the gradient of the forward projection op
by invoking the backprojector.
'''
@ops.RegisterGradient( "Project" )
def _project_grad( op, grad ):
    vol = backproject(
        proj = grad,
        geom = op.get_attr( "geom" ),
        vol_shape = op.get_attr( "vol_shape" ),
        vol_origin = op.get_attr( "vol_origin" ),
        voxel_dimen = op.get_attr( "voxel_dimen" ),
        proj_shape = op.get_attr( "proj_shape" )
    )
    return [ vol ]
In these cases, a Project op and a Backproject op were defined in a TensorFlow module and loaded with the "tf.load_op_library()" function. Then, for the Backproject op, we call the Project op to compute its gradient. Similarly, we use the Backproject op to compute the gradient of the Project op. op.get_attr(), op.inputs[i], and op.outputs[i] are commonly used to obtain the information needed in the gradient computation.
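With the gradients registered, the two ops can sit inside an ordinary TensorFlow graph, and backpropagation flows through them automatically. The sketch below is illustrative only: all geometry values are hypothetical placeholders, load_geometry() is a hypothetical helper, and the attribute names are taken from the quoted code rather than verified against the library.

import tensorflow as tf

# All geometry values below are hypothetical placeholders; real values are
# dictated by the scanner setup and the attribute format of the library.
vol_shape   = [128, 128, 128]
vol_origin  = [-63.5, -63.5, -63.5]
voxel_dimen = [1.0, 1.0, 1.0]
proj_shape  = [360, 512, 512]
geom        = load_geometry()  # hypothetical helper returning the projection geometry

volume   = tf.Variable(tf.zeros(vol_shape))        # image to be reconstructed
measured = tf.placeholder(tf.float32, proj_shape)  # acquired projection data

# Forward model: project the current volume estimate onto the detector.
predicted = project(volume=volume, geom=geom, vol_shape=vol_shape,
                    vol_origin=vol_origin, voxel_dimen=voxel_dimen,
                    proj_shape=proj_shape)

loss = tf.reduce_mean(tf.squared_difference(predicted, measured))

# Because _project_grad is registered, minimize() backpropagates through the
# custom op by backprojecting the residual; no manual adjoint code is needed.
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)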