Reading notes on "Learning to Ask: Neural Question Generation for Reading Comprehension"

This paper introduces an attention-based neural question generation model for reading comprehension that does not rely on a complex NLP pipeline. The study finds that the generated questions are more natural and require reasoning to answer. The model consists of an encoder and a decoder: the encoder is a bidirectional LSTM over the input sentence, and the decoder attends over the encoder's hidden states. Experiments on Stanford's SQuAD dataset show that the model effectively generates questions closely tied to the input sentence.

@(NLP)[Natural Language Generation|LSTM|QA|Attention]

Abstract

To address automatic question generation, the authors propose an attention-based sequence learning model and study the effect of encoding sentence-level versus paragraph-level information. Unlike previous work, their model does not rely on hand-crafted rules or a sophisticated NLP pipeline. In human evaluation, the generated questions are judged more natural and harder to answer; they differ from the source text in syntax and wording, and answering them requires reasoning.

Introduction

Applications of question generation

In addition to the above applications, question generation systems can aid in the development of annotated datasets for natural language processing (NLP) research in reading comprehension and question answering. Indeed, the creation of such datasets typically requires costly manual annotation, which automatic question generation could help reduce.

Example: natural questions and their answers

(Figure 1 in the paper: example sentences with the natural questions asked about them and their answers.)

Natural question features

Vanderwende points out that learning to ask questions is an important problem in NLP research, and that a good question is more than a syntactic transformation of a declarative sentence.
Natural questions often have the following characteristics:
- A natural-sounding question often compresses the sentence on which it is based (e.g., question 3 in Figure 1).
- It uses synonyms for terms in the passage (e.g., "form" for "produce" in question 2 and "get" for "produce" in question 3).
- It refers to entities from preceding sentences or clauses (e.g., the use of "photosynthesis" in question 2).
- Other times, world knowledge is employed to produce a good question (e.g., identifying "photosynthesis" as a "life process" in question 1).

Unlike previous approaches, the model proposed in this paper is fully data-driven and uses no hand-crafted rules.

Task Definition

Goal: to generate a natural question y related to information in the input sentence x.

y can be a sequence of arbitrary length $[y_1, \ldots, y_{|y|}]$. Suppose the length of the input sentence is $M$; x can then be represented as a sequence of tokens $[x_1, \ldots, x_M]$. The QG task is defined as finding $\bar{y}$ such that:

$$\bar{y} = \arg\max_{y} P(y \mid x) \tag{1}$$
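
The argmax in Equation (1) ranges over all possible token sequences and cannot be computed exactly; in practice it is approximated greedily or with beam search. Below is a rough, hypothetical sketch of the greedy variant, not the authors' code; `step_prob`, `bos_id`, and `eos_id` are assumed names, where `step_prob(x, prefix)` returns the next-token distribution $P(y_t \mid x, y_{<t})$ as a list indexed by token id.

```python
# Minimal greedy approximation of Eq. (1); a sketch only, not the paper's decoder.
def greedy_decode(step_prob, x, bos_id, eos_id, max_len=50):
    y = [bos_id]
    for _ in range(max_len):
        probs = step_prob(x, y)                                  # P(. | x, y_<t)
        next_id = max(range(len(probs)), key=probs.__getitem__)  # most likely next token
        y.append(next_id)
        if next_id == eos_id:                                    # stop at end-of-sequence
            break
    return y[1:]  # drop the BOS marker
```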

Model

Decoder

The probability in Equation (1) is factored at the word level:

$$P(y \mid x) = \prod_{t=1}^{|y|} P(y_t \mid x, y_{<t})$$

where the probability of each $y_t$ is predicted based on all the words generated previously (i.e., $y_{<t}$) and the input sentence $x$.

Looking at the formula, the probability of the final question y is simply the product of the probabilities of its individual words, which is easy to understand.
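
As a small illustration of this factorization (not the authors' implementation), scoring a candidate question amounts to summing per-token log-probabilities; `step_prob` follows the same hypothetical interface as in the earlier sketch.

```python
import math

# Sketch only: log P(y | x) = sum_t log P(y_t | x, y_<t).
# `step_prob(x, prefix)` is a hypothetical callable returning the next-token
# probability distribution given the sentence x and the prefix y_<t.
def sequence_log_prob(step_prob, x, y):
    total = 0.0
    for t, token in enumerate(y):
        probs = step_prob(x, y[:t])      # distribution over the vocabulary
        total += math.log(probs[token])  # log P(y_t | x, y_<t)
    return total
```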

$$P(y_t \mid x, y_{<t}) = \mathrm{softmax}\big(W_s \tanh(W_t [h_t; c_t])\big)$$

where $h_t$ is the decoder LSTM hidden state at time step $t$ and $c_t$ is the attention-based context vector computed over the encoder hidden states.
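
The following PyTorch sketch shows one plausible implementation of this readout, assuming a simple bilinear attention over the encoder hidden states; the module name `AttnReadout` and the tensor shapes are my own assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttnReadout(nn.Module):
    """Sketch of the attention-based output layer: softmax(W_s tanh(W_t [h_t; c_t]))."""
    def __init__(self, d, vocab_size):
        super().__init__()
        self.W_b = nn.Linear(d, d, bias=False)       # bilinear attention weights
        self.W_t = nn.Linear(2 * d, d, bias=False)   # combines [h_t; c_t]
        self.W_s = nn.Linear(d, vocab_size, bias=False)

    def forward(self, h_t, enc):
        # h_t: decoder state (batch, d); enc: encoder hidden states (batch, src_len, d)
        scores = torch.bmm(enc, self.W_b(h_t).unsqueeze(2)).squeeze(2)  # (batch, src_len)
        alpha = F.softmax(scores, dim=-1)                               # attention weights
        c_t = torch.bmm(alpha.unsqueeze(1), enc).squeeze(1)             # context vector (batch, d)
        logits = self.W_s(torch.tanh(self.W_t(torch.cat([h_t, c_t], dim=-1))))
        return F.softmax(logits, dim=-1)                                # P(y_t | x, y_<t)
```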