MXNet: Defining a Custom Operation (Layer), Using a Fully Connected Layer as an Example

License: CC 4.0 BY-SA

Original link: http://www.cnblogs.com/jukan/p/10934317.html

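This article uses MXNet as an example to show how to define custom operations on a deep learning platform. The official documentation describes four ways to define a new operation and includes an example that reimplements the softmax layer, but it has no example of an intermediate layer with parameters. The author provides an example that reimplements the fully connected operation, including the code, and builds a simple multilayer perceptron to train with it.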

https://blog.youkuaiyun.com/a350203223/article/details/77449630

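When working with a deep learning platform, the predefined operations are sometimes not enough for practical use, and we need to define new operations ourselves. However, most deep learning platforms ship as precompiled libraries, which makes them hard to modify. Taking MXNet as an example, the official documentation gives four ways to define a new operation.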

They are:

1. mx.operator.CustomOp
2. mx.operator.NDArrayOp
3. mx.operator.NumpyOp
4. Writing the operator at the backend level in C++

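The documentation also gives an example that reimplements the softmax layer. However, according to the author, the softmax operation there has only a forward computation and no parameters, which does not match the cases we usually need to handle, and the official documentation has no example of an intermediate layer with parameters. The author therefore provides an example that reimplements the fully connected operation, in the hope that it will be helpful.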
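Before the MXNet code, the math that such a fully connected operator has to implement can be sketched in plain Python. This is only an illustrative sketch, not part of the original post: the helper names (`matmul`, `transpose`, `dense_forward`, `dense_backward`, `toy_loss`) and the sample numbers are assumptions chosen for the example. It uses the same shape convention as the operator in this post (weight stored as `(num_hidden, in_dim)`) and compares one analytic weight gradient with a finite-difference estimate.

```python
# Plain-Python sketch of the dense layer math (no MXNet required).
# Shapes: x is (batch, in_dim), w is (num_hidden, in_dim),
# b is (num_hidden,), y and dy are (batch, num_hidden).

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*a)]


def dense_forward(x, w, b):
    """y = x . w^T + b, with b added to every row of the batch."""
    xw = matmul(x, transpose(w))
    return [[v + b[j] for j, v in enumerate(row)] for row in xw]


def dense_backward(dy, x, w):
    """Gradients with respect to x, w and b, given dy = dL/dy."""
    dx = matmul(dy, w)                    # (batch, in_dim)
    dw = matmul(transpose(dy), x)         # (num_hidden, in_dim)
    db = [sum(col) for col in zip(*dy)]   # (num_hidden,)
    return dx, dw, db


def toy_loss(x, w, b, dy):
    """Hypothetical test loss L = sum(y * dy), chosen so that dL/dy == dy."""
    y = dense_forward(x, w, b)
    return sum(y[i][j] * dy[i][j]
               for i in range(len(y)) for j in range(len(y[0])))


# Made-up sample values: batch=2, in_dim=3, num_hidden=2.
x = [[1.0, 2.0, 3.0], [0.5, -1.0, 2.0]]
w = [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]]
b = [0.01, -0.02]
dy = [[1.0, -2.0], [0.5, 3.0]]

y = dense_forward(x, w, b)
dx, dw, db = dense_backward(dy, x, w)

# Central finite difference on the single weight entry w[0][1].
eps = 1e-6
w_plus = [row[:] for row in w]
w_minus = [row[:] for row in w]
w_plus[0][1] += eps
w_minus[0][1] -= eps
numeric_dw = (toy_loss(x, w_plus, b, dy)
              - toy_loss(x, w_minus, b, dy)) / (2 * eps)

print("analytic dw[0][1]:", dw[0][1])
print("numeric  dw[0][1]:", numeric_dw)
```

If the two printed values agree closely, the backward formulas are consistent with the forward pass, which is a useful check to run before porting the same formulas to `mx.nd` calls.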

# pylint: skip-file
import logging

import mxnet as mx
# mnist_iterator is the helper from data.py in the MXNet example scripts
from data import mnist_iterator

class Dense(mx.operator.CustomOp):
    """Fully connected layer: y = x . w^T + b."""

    def __init__(self, num_hidden):
        super(Dense, self).__init__()
        self.num_hidden = num_hidden

    def forward(self, is_train, req, in_data, out_data, aux):
        x = in_data[0]  # (batch, in_dim)
        w = in_data[1]  # (num_hidden, in_dim)
        b = in_data[2]  # (num_hidden,)
        # reshape b to (1, num_hidden) so it broadcasts over the batch
        y = mx.nd.broadcast_add(mx.nd.dot(x, w.T), b.reshape((1, -1)))
        self.assign(out_data[0], req[0], y)

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        dy = out_grad[0]  # (batch, num_hidden)
        x = in_data[0]
        w = in_data[1]
        dx = mx.nd.dot(dy, w)       # (batch, in_dim)
        dw = mx.nd.dot(dy.T, x)     # (num_hidden, in_dim)
        db = mx.nd.sum(dy, axis=0)  # (num_hidden,)
        # write each gradient according to its own request type
        self.assign(in_grad[0], req[0], dx)
        self.assign(in_grad[1], req[1], dw)
        self.assign(in_grad[2], req[2], db)

@mx.operator.register("dense")
class DenseProp(mx.operator.CustomOpProp):
    def __init__(self, num_hidden):
        # need_top_grad=True: this is an intermediate layer, so backward
        # needs the gradient coming from the layer above.
        super(DenseProp, self).__init__(need_top_grad=True)
        # All arguments arrive in string format, so convert them back
        # to the type you want.
        self.num_hidden = int(num_hidden)

    def list_arguments(self):
        return ['data', 'weight', 'bias']

    def list_outputs(self):
        # this can be omitted if you only have 1 output.
        return ['output']

    def infer_shape(self, in_shapes):
        data_shape = in_shapes[0]
        weight_shape = (self.num_hidden, data_shape[1])
        bias_shape = (self.num_hidden,)
        output_shape = (data_shape[0], self.num_hidden)
        return [data_shape, weight_shape, bias_shape], [output_shape], []

    def infer_type(self, in_type):
        # weight, bias and output all use the dtype of the input data
        dtype = in_type[0]
        return [dtype, dtype, dtype], [dtype], []

    def create_operator(self, ctx, in_shapes, in_dtypes):
        # create and return the CustomOp class.
        return Dense(self.num_hidden)

# define mlp
data = mx.symbol.Variable('data')
# this is the newly defined layer
fc1 = mx.symbol.Custom(data=data, name='fc1', op_type='dense', num_hidden=128)
act1 = mx.symbol.Activation(data=fc1, name='relu1', act_type="relu")
fc2 = mx.symbol.FullyConnected(data=act1, name='fc2', num_hidden=64)
act2 = mx.symbol.Activation(data=fc2, name='relu2', act_type="relu")
fc3 = mx.symbol.FullyConnected(data=act2, name='fc3', num_hidden=10)
# SoftmaxOutput replaces the deprecated mx.symbol.Softmax name
mlp = mx.symbol.SoftmaxOutput(data=fc3, name='softmax')

train, val = mnist_iterator(batch_size=100, input_shape=(784,))
logging.basicConfig(level=logging.DEBUG)
# mx.gpu(1) selects the second GPU; switch to mx.gpu(0) or mx.cpu() if needed
model = mx.model.FeedForward(
    ctx=mx.gpu(1), symbol=mlp, num_epoch=20,
    learning_rate=0.1, momentum=0.9, wd=0.00001)
model.fit(X=train, eval_data=val,
          batch_end_callback=mx.callback.Speedometer(100, 100))
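
Both `forward` and `backward` above write their results through `self.assign(dst, req, src)`, where `req` tells the operator how each buffer should be written. The following is a minimal plain-Python sketch of that behaviour, not MXNet's own code: the standalone `assign` function works on Python lists instead of NDArrays, and the buffer names and values are made up for illustration. It is based on the assumption that the request types are `'null'`, `'write'`, `'inplace'` and `'add'`.

```python
def assign(dst, req, src):
    """Illustrative stand-in for CustomOp.assign, operating on Python lists."""
    if req == 'null':
        return                      # nothing requested: leave dst untouched
    if req in ('write', 'inplace'):
        dst[:] = src                # overwrite the destination buffer
    elif req == 'add':
        for i, v in enumerate(src):
            dst[i] += v             # accumulate into the existing values
    else:
        raise ValueError("unknown request type: %r" % (req,))


# Hypothetical gradient buffers for the three inputs of a dense layer.
in_grad = {'data': [0.0, 0.0], 'weight': [1.0, 1.0], 'bias': [5.0]}
new_grad = {'data': [0.3, 0.4], 'weight': [0.5, -0.5], 'bias': [2.0]}
# Assumed request types: no gradient wanted for the raw input data,
# overwrite for the weight, accumulate for the bias.
req = {'data': 'null', 'weight': 'write', 'bias': 'add'}

for name in ('data', 'weight', 'bias'):
    assign(in_grad[name], req[name], new_grad[name])

print(in_grad)
```

Because each input can carry a different request type, it is safer to pair every gradient with its own entry (`req[0]`, `req[1]`, `req[2]`) than to reuse one request for all three buffers.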
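---------------------
Author: 启功
Source: CSDN
Original: https://blog.youkuaiyun.com/a350203223/article/details/77449630
Copyright notice: This is the blogger's original article. Please include a link to the post when reposting.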