Improving Neural Machine Translation with Conditional Sequence Generative Adversarial Nets

https://arxiv.org/pdf/1703.04887

This paper proposes a new route for applying generative adversarial nets (GANs) to NLP tasks, taking neural machine translation (NMT) as an instance, and the widespread view that GANs cannot work well in NLP turns out to be unfounded. In this work, we build a conditional sequence generative adversarial net comprising two adversarial sub-models: a generative model (generator), which translates the source sentence into the target sentence as traditional NMT models do, and a discriminative model (discriminator), which distinguishes machine-translated target sentences from human-translated ones. From the perspective of the Turing test, the proposed model aims to generate translations that are indistinguishable from human translations. Experiments show that the proposed model achieves significant improvements over the traditional NMT model. On Chinese-English translation tasks, we obtain improvements of up to +2.0 BLEU points. To the best of our knowledge, this is the first time that quantitative results on the application of GANs to a traditional NLP task have been reported. We also present detailed strategies for GAN training. In addition, we find that the discriminator of the proposed model shows great capability in data cleaning.
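
The generator/discriminator game described in the abstract can be sketched in the standard conditional GAN form. The symbols $G$ (generator), $D$ (discriminator), $x$ (source sentence), $y$ (human translation) and $y'$ (machine translation) are notation assumed here for illustration, not taken from the abstract, and the paper's exact training objective may differ:

$$
\min_G \max_D \; \mathbb{E}_{(x, y) \sim P_{\text{data}}}\big[\log D(x, y)\big] + \mathbb{E}_{x \sim P_{\text{data}},\, y' \sim G(\cdot \mid x)}\big[\log\big(1 - D(x, y')\big)\big]
$$

Here $D(x, y)$ is read as the discriminator's estimated probability that $y$ is a human translation of $x$. The discriminator tries to raise it on human translations and lower it on generated ones, while the generator tries to produce $y'$ that the discriminator scores as human.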

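### Linear Deformable Convolution for Improving CNN Performance

Linear deformable convolution extends the standard convolution operation by letting the model adaptively shift the positions it samples within its receptive field. This flexibility helps the model capture features of objects at different scales and shapes.

#### Basic Principle

In a conventional convolution, the filter slides over the input with a fixed spatial layout to extract features. In complex scenes, however, objects can appear in different poses, at different scales, or partially occluded. To handle this, linear deformable convolution adds an extra branch that predicts an offset for every position; these offsets are applied to the original sampling points[^4]:

```python
import torch.nn as nn
# Assumption: torchvision's deform_conv2d stands in for the deformable
# convolution operator, which the original snippet left undefined.
from torchvision.ops import deform_conv2d as deform_conv_function


class LinearDeformConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):
        super().__init__()
        # Sub-network that learns the offsets
        self.offset_conv = nn.Conv2d(in_channels=in_channels,
                                     out_channels=kernel_size * kernel_size * 2,  # one x and one y offset per kernel tap
                                     kernel_size=kernel_size,
                                     stride=stride,
                                     padding=padding)
        # Main convolution layer (supplies the weight and bias)
        self.conv = nn.Conv2d(in_channels=in_channels,
                              out_channels=out_channels,
                              kernel_size=kernel_size,
                              stride=stride,
                              padding=padding)

    def forward(self, x):
        offsets = self.offset_conv(x)  # predict the offset field
        output = deform_conv_function(input=x,
                                      offset=offsets,
                                      weight=self.conv.weight,
                                      bias=self.conv.bias,
                                      stride=self.conv.stride,
                                      padding=self.conv.padding)
        return output
```

Here `deform_conv_function()` stands for the function that implements the actual deformable convolution operation. In this way, even for object instances with widely varying shapes, the receptive field can be localized more precisely.

#### How It Improves Performance

When applied to object detection, the method helps improve both bounding-box regression precision and classification accuracy. Its robustness under multi-scale transformations is particularly notable[^2]. Because the spatial support region is adjusted dynamically in a data-driven way, the approach is more expressive than the conventional static, template-like one.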
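
To make the offset mechanism concrete, here is a minimal pure-Python sketch of what a deformable convolution computes at a single output position: each kernel tap reads the feature map at its regular grid position plus a fractional offset, using bilinear interpolation. The function names `bilinear_sample` and `deform_conv_at` are hypothetical helpers written for this illustration, not part of any library, and samples outside the feature map are assumed to contribute zero.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D feature map at fractional (y, x).

    Neighbours that fall outside the map contribute zero (an assumption).
    """
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Weight falls off linearly with distance to the neighbour
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feature[yy][xx]
    return value


def deform_conv_at(feature, kernel, offsets, cy, cx):
    """Deformable convolution output at the single centre position (cy, cx).

    offsets[k] = (dy, dx) is the learned shift of the k-th kernel tap,
    listed in row-major order over the kernel.
    """
    k = len(kernel)
    r = k // 2
    out = 0.0
    idx = 0
    for i in range(k):
        for j in range(k):
            dy, dx = offsets[idx]
            # Regular grid position plus the predicted offset
            out += kernel[i][j] * bilinear_sample(feature,
                                                  cy + i - r + dy,
                                                  cx + j - r + dx)
            idx += 1
    return out


# Usage: a 4x4 ramp feature map and an all-ones 3x3 kernel
feature = [[float(row * 4 + col) for col in range(4)] for row in range(4)]
kernel = [[1.0] * 3 for _ in range(3)]
print(deform_conv_at(feature, kernel, [(0.0, 0.0)] * 9, 1, 1))  # zero offsets: plain convolution
print(deform_conv_at(feature, kernel, [(0.0, 0.5)] * 9, 1, 1))  # every tap shifted half a pixel right
```

With all offsets at zero the result equals an ordinary convolution. Non-zero offsets move each tap independently, which is how the sampling pattern can adapt to an object's shape.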