Abstract
1. Previous approaches are often restricted to specific domains and require handcrafted rules.
2. The model uses the sequence-to-sequence (seq2seq) framework.
3. The model's strength is that it can be trained end-to-end and thus requires far fewer hand-crafted rules.
4. The model's weakness is a lack of consistency, which is a common failure mode.
Introduction
- Neural networks can be used to map complicated structures to other complicated structures.
- Mapping one sequence to another sequence has direct applications in natural language understanding (Sutskever et al., 2014); a minimal sketch of this framework follows the lists below.
Advantages of mapping a sequence to another sequence
- Requires little feature engineering and domain specificity, while matching or approaching state-of-the-art results.
- Allows researchers to work on tasks for which domain knowledge is not readily available, or for which rules are too hard to design manually.
- This approach can do surprisingly well at generating fluent and accurate replies in conversation.
Disadvantages of seq2seq
- Due to the complexity of this mapping, conversational modeling has traditionally been designed to be very narrow in domain.
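As a rough illustration of the framework these notes describe, here is a minimal PyTorch encoder-decoder sketch. It is an assumption-laden toy, not the paper's implementation (the paper uses a deeper LSTM and much larger vocabularies); names such as `Seq2Seq`, `vocab_size`, and `hidden_size` are illustrative only.

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    """Minimal encoder-decoder (Sutskever et al., 2014 style):
    encode the input utterance into a fixed state, then decode the
    reply token by token. All sizes here are illustrative."""

    def __init__(self, vocab_size: int, hidden_size: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        self.encoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.decoder = nn.LSTM(hidden_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, vocab_size)

    def forward(self, src, tgt):
        # The encoder's final (hidden, cell) state summarizes the input.
        _, state = self.encoder(self.embed(src))
        # Teacher forcing: feed the gold reply (shifted right) and
        # predict each next token from the decoder's outputs.
        dec_out, _ = self.decoder(self.embed(tgt), state)
        return self.out(dec_out)  # logits over the vocabulary

# End-to-end training is plain cross-entropy on the next token,
# which is why so few hand-crafted rules are needed.
vocab = 10_000
model = Seq2Seq(vocab)
src = torch.randint(0, vocab, (2, 7))  # two toy input utterances
tgt = torch.randint(0, vocab, (2, 5))  # their replies
logits = model(src, tgt[:, :-1])
loss = nn.functional.cross_entropy(
    logits.reshape(-1, vocab), tgt[:, 1:].reshape(-1))
```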
Datasets
- A closed-domain IT helpdesk dataset of troubleshooting conversations
- A noisy, open-domain dataset of movie subtitles (OpenSubtitles)
Experiments
- The model can hold a natural conversation and sometimes performs simple forms of common-sense reasoning.
- The recurrent nets obtain better performance (lower perplexity) than an n-gram model (see the sketch after this list).
- They capture important long-range correlations.
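Since the comparison above is stated in terms of perplexity, here is a small sketch of how per-token perplexity is computed from a model's next-token log-probabilities. The helper name and the toy numbers are illustrative assumptions, not figures from the paper.

```python
import math

def perplexity(token_log_probs: list[float]) -> float:
    """Perplexity = exp of the average negative log-likelihood per
    token; lower means the model finds the text less surprising."""
    return math.exp(-sum(token_log_probs) / len(token_log_probs))

# Toy numbers: a model that assigns higher probability to each
# observed token (log p closer to 0) gets lower perplexity.
recurrent_lps = [math.log(p) for p in (0.5, 0.4, 0.6, 0.5)]
ngram_lps     = [math.log(p) for p in (0.2, 0.1, 0.3, 0.2)]
print(perplexity(recurrent_lps))  # ~2.0, better
print(perplexity(ngram_lps))      # ~5.4, worse
```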
Related Work
- Applications of seq2seq:
(1) Neural machine translation, achieving improvements on the English-French and English-German translation tasks from the WMT'14 dataset (Luong et al., 2014; Jean et al., 2014)
(2) Parsing (Vinyals et al., 2014a)
(3) Image captioning (Vinyals et al., 2014b)
Differences from conventional systems
- The model has no built-in domain knowledge, relying on end-to-end learning instead of hand-coded rules.
Summary: this post covers the advantages of applying the seq2seq framework to natural language understanding (NLU), including its end-to-end training, its reduced need for hand-crafted rules, and its successes on tasks such as neural machine translation, parsing, and image captioning. It also discusses the model's consistency problem and the limitations of narrow-domain conversational modeling.