TensorFlow custom network layers and activation functions (self-defined layer)

Latest recommended article published on 2023-11-29 16:38:38

mmc2015

License: CC 4.0 BY-SA

Category: Theano AND Keras. Tags: tensorflow, custom layer, self-defined layer

Article link: https://blog.youkuaiyun.com/mmc2015/article/details/71250090

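This article shows how to define a custom activation function in TensorFlow: converting it from a numpy function into a TensorFlow function and defining its gradient by hand. A worked example builds a custom "spiky" activation function and its derivative, and the result is then tested.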

# heavily based on:
# http://stackoverflow.com/questions/39921607/tensorflow-how-to-make-a-custom-activation-function-with-only-python
# https://gist.github.com/harpone/3453185b41d8d985356cbe5e57d67342


# turning a numpy function into a tensorflow function:
# we will use 1) tf.py_func(func, inp, Tout, stateful=stateful, name=name), https://www.tensorflow.org/api_docs/python/tf/py_func
# which wraps a Python function operating on numpy arrays as a tensorflow op
# we will use 2) tf.RegisterGradient
# https://www.tensorflow.org/versions/r0.11/api_docs/python/framework/defining_new_operations#RegisterGradient
# https://www.tensorflow.org/versions/r0.11/api_docs/python/framework/#RegisterGradient
# we will use 3) tf.Graph.gradient_override_map
# https://www.tensorflow.org/versions/r0.11/api_docs/python/framework/
# https://www.tensorflow.org/versions/r0.11/api_docs/python/framework/core_graph_data_structures#Graph.gradient_override_map




import numpy as np
import tensorflow as tf
from tensorflow.python.framework import ops


# define a plain Python custom relu (scalar in, scalar out) and its derivative
def my_relu_def(x, threshold=0.05):
    if x<threshold:
        return 0.0
    else:
        return x

def my_relu_grad_def(x, threshold=0.05):
    if x<threshold:
        return 0.0
    else:
        return 1.0

# turn the plain Python functions into numpy functions that are applied elementwise
my_relu_np = np.vectorize(my_relu_def)
my_relu_grad_np = np.vectorize(my_relu_grad_def)
# numpy defaults to float64, but the tensors used here are float32, so cast the results
my_relu_np_32 = lambda x: my_relu_np(x).astype(np.float32)
my_relu_grad_np_32 = lambda x: my_relu_grad_np(x).astype(np.float32)



def my_relu_grad_tf(x, name=None):
    with ops.name_scope(name, "my_relu_grad_tf", [x]) as name:
        y = tf.py_func(my_relu_grad_np_32,
                       [x],
                       [tf.float32],
                       name=name,
                       stateful=False)
        return y[0]

def my_py_func(func, inp, Tout, stateful=False, name=None, my_grad_func=None):
    # Need to generate a unique name so that repeated registrations do not collide:
    random_name = 'PyFuncGrad' + str(np.random.randint(0, int(1E+8)))
    tf.RegisterGradient(random_name)(my_grad_func)  # see _my_relu_grad below for an example gradient function
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": random_name, "PyFuncStateless": random_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

# The grad function we pass to my_py_func above must take a special form:
# it receives (the operation, the gradient flowing in from the op's output side)
# and returns the gradient with respect to the op's inputs, i.e. the value to propagate further backward.
def _my_relu_grad(op, pre_grad):
    x = op.inputs[0]
    cur_grad = my_relu_grad_tf(x)
    next_grad = pre_grad * cur_grad
    return next_grad

def my_relu_tf(x, name=None):
    with ops.name_scope(name, "my_relu_tf", [x]) as name:
        y = my_py_func(my_relu_np_32,
                       [x],
                       [tf.float32],
                       stateful=False,
                       name=name,
                       my_grad_func=_my_relu_grad)  # <-- here's the call to the gradient
        return y[0]

with tf.Session() as sess:
    x = tf.constant([-0.3, 0.005, 0.08, 0.12])
    y = my_relu_tf(x)
    tf.global_variables_initializer().run()
    print(x.eval())
    print(y.eval())
    print(tf.gradients(y, [x])[0].eval())

# [-0.30000001  0.005       0.08        0.12      ]
# [ 0.    0.    0.08  0.12]
# [ 0.    0.    1.  1.]
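
The expected values in the comments above can be cross-checked without TensorFlow. What follows is a minimal NumPy-only sketch and not part of the original post: the helper names my_relu_forward, my_relu_local_grad and my_relu_backward are hypothetical, and np.where stands in for the np.vectorize wrappers. It also pushes a non-unit upstream gradient through the same chain rule that _my_relu_grad applies (incoming gradient times local derivative).

```python
import numpy as np

THRESHOLD = 0.05  # same default threshold as my_relu_def


def my_relu_forward(x, threshold=THRESHOLD):
    # Forward pass: zero below the threshold, identity otherwise.
    return np.where(x < threshold, 0.0, x).astype(np.float32)


def my_relu_local_grad(x, threshold=THRESHOLD):
    # Local derivative: 0 below the threshold, 1 otherwise.
    return np.where(x < threshold, 0.0, 1.0).astype(np.float32)


def my_relu_backward(x, upstream_grad, threshold=THRESHOLD):
    # Chain rule, mirroring _my_relu_grad: incoming gradient * local derivative.
    return upstream_grad * my_relu_local_grad(x, threshold)


x = np.array([-0.3, 0.005, 0.08, 0.12], dtype=np.float32)
y = my_relu_forward(x)

# tf.gradients(y, [x]) seeds the backward pass with ones by default.
grad_ones = my_relu_backward(x, np.ones_like(x))

# An arbitrary upstream gradient, as a later layer or a loss would supply.
upstream = np.array([2.0, 3.0, 4.0, 5.0], dtype=np.float32)
grad_weighted = my_relu_backward(x, upstream)

print(y)
print(grad_ones)
print(grad_weighted)
```

With the unit seed the gradient matches the third commented line above; with the weighted seed only the entries at or above the threshold let the upstream values through.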

https://stackoverflow.com/questions/39921607/how-to-make-a-custom-activation-function-with-only-python-in-tensorflow

Yes, there is!

Credit: the information was hard to find and to get working, but the example below follows the principles and code from two earlier sources (the original hyperlinks are not preserved here).

Requirements: Before we start, there are two requirements for this to succeed. First, you need to be able to write your activation as a function on numpy arrays. Second, you have to be able to write the derivative of that function either as a function in TensorFlow (easier) or, in the worst case, as a function on numpy arrays.

Writing Activation function:

So let's take, for example, this function, which we want to use as an activation function:

def spiky(x):
    r = x % 1
    if r <= 0.5:
        return r
    else:
        return 0

It looks as follows:

[Figure: plot of the spiky function]

The first step is to turn it into a numpy function, which is easy:

import numpy as np
np_spiky = np.vectorize(spiky)
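
One thing to watch for, shown here as a hedged sketch rather than as part of the original answer: when `otypes` is not given, `np.vectorize` infers the output dtype from the result of the first element. Because `spiky` returns the integer `0` in its else branch, an input whose first element lands in that branch yields an integer array and the fractional values are truncated. Passing `otypes` pins the dtype; the name `np_spiky_f32` is hypothetical.

```python
import numpy as np


def spiky(x):
    r = x % 1
    if r <= 0.5:
        return r
    else:
        return 0


# otypes fixes the output dtype up front instead of inferring it
# from whatever the first element happens to return.
np_spiky_f32 = np.vectorize(spiky, otypes=[np.float32])

x = np.array([0.7, 0.2, 1.3])

# Without otypes the dtype is inferred from the first output, which is
# the int 0 for x[0] = 0.7, so the whole result becomes an integer array.
untyped = np.vectorize(spiky)(x)

out = np_spiky_f32(x)
print(untyped.dtype, untyped)
print(out.dtype, out)
```

Casting afterwards with `.astype(np.float32)` is only safe when every branch of the scalar function already returns a float.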

Now we should write its derivative.

Gradient of Activation: In our case it is easy: it is 1 if x mod 1 <= 0.5 and 0 otherwise. So:

def d_spiky(x):
    r = x % 1
    if r <= 0.5:
        return 1
    else:
        return 0
np_d_spiky = np.vectorize(d_spiky)
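
Before wiring a hand-written derivative into a graph, it can help to compare it against a numerical one. The sketch below is an addition, not taken from the original answer: it uses a central finite difference with an assumed step `h = 1e-6`, test points chosen away from the jumps of `spiky` at x mod 1 = 0 and 0.5, and `otypes` set to float64 so that no values are truncated.

```python
import numpy as np


def spiky(x):
    r = x % 1
    if r <= 0.5:
        return r
    else:
        return 0


def d_spiky(x):
    r = x % 1
    if r <= 0.5:
        return 1
    else:
        return 0


# Pin the output dtype so integer returns are not used to infer it.
np_spiky = np.vectorize(spiky, otypes=[np.float64])
np_d_spiky = np.vectorize(d_spiky, otypes=[np.float64])

# Sample points that stay clear of the discontinuities of spiky.
x = np.array([-0.8, 0.1, 0.3, 0.7, 1.2])
h = 1e-6  # assumed step size for the central difference

numeric = (np_spiky(x + h) - np_spiky(x - h)) / (2 * h)
analytic = np_d_spiky(x)

print(numeric)
print(analytic)
```

If the two arrays disagree at points away from the jumps, the hand-written derivative is the first place to look.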
