A Guide to Developing Custom Layers in the Lasagne Deep Learning Framework - 优快云 Blog

Article link: https://blog.youkuaiyun.com/gitblog_00572/article/details/148552209

Lasagne: Lightweight library to build and train neural networks in Theano. Project page: https://gitcode.com/gh_mirrors/la/Lasagne

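Preface

Lasagne is a lightweight deep learning framework built on top of Theano. It offers a simple, flexible interface for building and training neural networks. In practice, we often need custom layers to implement specific functionality. This article walks through how to create several kinds of custom layers in Lasagne.

Implementing a Basic Custom Layer

The simplest custom layer only needs to implement the get_output_for() method. This method receives the input tensor (a Theano symbolic expression) and returns the processed output.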

import lasagne

class DoubleLayer(lasagne.layers.Layer):
    def get_output_for(self, input, **kwargs):
        # input is a Theano symbolic expression; return it scaled by 2
        return 2 * input

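This example creates a layer that doubles its input. Note the following:

The layer must inherit from the lasagne.layers.Layer base class.
get_output_for() must return a Theano expression.
By default, Lasagne assumes the output shape is the same as the input shape.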
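
The division of labor between the two hooks can be illustrated without Theano. The following is a minimal sketch only: MiniLayer and MiniDoubleLayer are hypothetical stand-ins written for this illustration, not Lasagne classes, and plain Python lists take the place of symbolic tensors.

```python
# Framework-free stand-in for the layer protocol described above.
# MiniLayer is hypothetical: it only mimics the two hooks, it is not Lasagne's class.

class MiniLayer:
    def __init__(self, input_shape):
        self.input_shape = tuple(input_shape)

    def get_output_shape_for(self, input_shape):
        # Default behavior: the output shape is assumed to equal the input shape.
        return input_shape

    def get_output_for(self, input, **kwargs):
        # Subclasses are expected to override this hook.
        raise NotImplementedError

    @property
    def output_shape(self):
        return self.get_output_shape_for(self.input_shape)


class MiniDoubleLayer(MiniLayer):
    def get_output_for(self, input, **kwargs):
        # Nested lists stand in for a 2D tensor here.
        return [[2 * x for x in row] for row in input]


layer = MiniDoubleLayer(input_shape=(None, 3))
doubled = layer.get_output_for([[1.0, 2.0, 3.0]])
print(doubled)             # [[2.0, 4.0, 6.0]]
print(layer.output_shape)  # (None, 3)
```

Because doubling is element-wise, the subclass overrides only get_output_for() and inherits the default shape rule.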

Custom Layers That Change the Data Shape

When a layer's operation changes the shape of its input, you need to implement get_output_shape_for() in addition to get_output_for().

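import lasagne

class SumLayer(lasagne.layers.Layer):
    def get_output_for(self, input, **kwargs):
        # Sum over the last axis, which removes that axis
        return input.sum(axis=-1)

    def get_output_shape_for(self, input_shape):
        # Drop the last dimension to match get_output_for()
        return input_shape[:-1]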

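This layer sums over the last axis. Key points:

get_output_shape_for() must return a tuple of integers, with None allowed for dimensions that are unknown at compile time; it must not return a symbolic expression.
The shape computation must be accurate, because later layers rely on it when initializing their parameters.
Lasagne uses this shape propagation to work out the size of every layer in the network automatically.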
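
Shape propagation can be sketched in plain Python as a chain of shape functions, each fed the previous layer's output shape. The helper names below (sum_last_axis_shape, make_dense_shape, propagate_shape) are hypothetical and exist only for this illustration; they are not part of Lasagne.

```python
def sum_last_axis_shape(input_shape):
    # Mirrors SumLayer.get_output_shape_for: drop the last axis.
    return tuple(input_shape[:-1])


def make_dense_shape(num_units):
    # Mirrors a dense-style layer: keep the batch axis, replace the feature axis.
    def shape_fn(input_shape):
        return (input_shape[0], num_units)
    return shape_fn


def propagate_shape(input_shape, shape_fns):
    # Feed each layer's output shape into the next layer's shape function.
    shapes = [tuple(input_shape)]
    for fn in shape_fns:
        shapes.append(fn(shapes[-1]))
    return shapes


# None marks a batch size that is unknown until data is supplied.
shapes = propagate_shape((None, 10, 20),
                         [sum_last_axis_shape, make_dense_shape(5)])
print(shapes)  # [(None, 10, 20), (None, 10), (None, 5)]
```

If sum_last_axis_shape reported the wrong shape, the dense-style step after it would size its output from a wrong input, which is why the shape method has to agree with the actual computation.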

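Custom Layers with Parameters

Many layers need trainable parameters, such as a weight matrix. In Lasagne, parameters should be created and registered through the add_param() method.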

import lasagne
import theano.tensor as T

class DotLayer(lasagne.layers.Layer):
    def __init__(self, incoming, num_units, W=lasagne.init.Normal(0.01), **kwargs):
        super(DotLayer, self).__init__(incoming, **kwargs)
        # self.input_shape is set by the parent constructor
        num_inputs = self.input_shape[1]
        self.num_units = num_units
        # Create and register the trainable weight matrix
        self.W = self.add_param(W, (num_inputs, num_units), name='W')

    def get_output_for(self, input, **kwargs):
        return T.dot(input, self.W)

    def get_output_shape_for(self, input_shape):
        # Batch size is preserved; the feature axis becomes num_units
        return (input_shape[0], self.num_units)

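Implementation notes:

The constructor must call the parent constructor first.
Use add_param() to create and register parameters.
The initialization strategy can be specified through a constructor argument.
self.input_shape is available once the parent constructor has been called.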
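
To make the shape bookkeeping concrete, here is a rough framework-free sketch of how the weight shape follows from the input shape and num_units. The helpers init_normal and dot are hypothetical stand-ins for an initializer and a matrix product, written with the standard library only; they are not Lasagne or Theano functions.

```python
import random


def init_normal(shape, std=0.01, seed=0):
    # Hypothetical stand-in for a normal initializer; returns a nested list.
    rng = random.Random(seed)
    rows, cols = shape
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]


def dot(x, w):
    # (batch, num_inputs) times (num_inputs, num_units) -> (batch, num_units)
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)]
            for row in x]


input_shape = (None, 4)        # batch size unknown, 4 input features
num_units = 3
num_inputs = input_shape[1]    # same lookup as in DotLayer.__init__
W = init_normal((num_inputs, num_units))

x = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0]]
y = dot(x, W)
print(len(W), len(W[0]))  # 4 3
print(len(y), len(y[0]))  # 2 3
```

The one-hot rows in x simply pick out rows of W, so the output has one row per sample and num_units columns, matching what get_output_shape_for() reports.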

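Custom Layers with Multiple Behaviors

Some layers need to behave differently in different modes; a Dropout layer, for example, acts differently during training and testing.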

import theano
from lasagne.layers import Layer
from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams

_srng = RandomStreams()

class DropoutLayer(Layer):
    def __init__(self, incoming, p=0.5, **kwargs):
        super(DropoutLayer, self).__init__(incoming, **kwargs)
        self.p = p

    def get_output_for(self, input, deterministic=False, **kwargs):
        if deterministic:
            # Test mode: pass the input through unchanged
            return input
        else:
            # Training mode: drop units with probability p and rescale the rest
            retain_prob = 1 - self.p
            mask = _srng.binomial(input.shape, p=retain_prob,
                                  dtype=theano.config.floatX)
            return input / retain_prob * mask
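
The effect of dividing by retain_prob can be checked numerically without Theano. The sketch below is only an illustration under simplifying assumptions: the dropout function is hypothetical, it works on a flat Python list, and it uses the standard library random module instead of a symbolic random stream.

```python
import random


def dropout(values, p=0.5, deterministic=False, rng=None):
    # Hypothetical list-based analogue of DropoutLayer.get_output_for.
    if deterministic:
        # Test mode: return the values unchanged.
        return list(values)
    rng = rng or random.Random()
    retain_prob = 1.0 - p
    # Training mode: keep each value with probability retain_prob and
    # rescale the kept values so the expected output matches the input.
    return [v / retain_prob if rng.random() < retain_prob else 0.0
            for v in values]


data = [1.0] * 100000
test_out = dropout(data, p=0.5, deterministic=True)
train_out = dropout(data, p=0.5, rng=random.Random(0))
train_mean = sum(train_out) / len(train_out)

print(test_out == data)             # deterministic mode leaves data untouched
print(sorted(set(train_out)))       # kept values are rescaled, dropped ones are 0
print(abs(train_mean - 1.0) < 0.05) # the mean stays close to the input mean
```

Rescaling during training is what lets the deterministic branch return the input as-is: the expected activation is already the same in both modes.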

Key features: