NLP Project 5: Reading Comprehension
Reading comprehension
The machine is given a document to read; after reading it, we ask a question about that document and the machine has to answer. The answer must appear somewhere in the document itself (extractive question answering).
The steps below walk through the original SQuAD data format and the format after encoding.
1. Load the tokenizer
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')
print(tokenizer)
tokenizer('What is your name?', 'My name is Sylvain.')
DistilBertTokenizerFast(name_or_path='distilbert-base-uncased', vocab_size=30522, model_max_length=512, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'})
{'input_ids': [101, 2054, 2003, 2115, 2171, 1029, 102, 2026, 2171, 2003, 25353, 22144, 2378, 1012, 102], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]}
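To make the encoded format above easier to read, the ids can be decoded back into text. This quick check is not part of the original notebook, but it only uses the tokenizer just loaded:
encoded = tokenizer('What is your name?', 'My name is Sylvain.')
# shows the special-token layout, roughly: [CLS] what is your name? [SEP] my name is sylvain. [SEP]
print(tokenizer.decode(encoded['input_ids']))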
2. Load the dataset
from datasets import load_dataset
dataset = load_dataset('squad')
dataset
DatasetDict({
train: Dataset({
features: ['id', 'title', 'context', 'question', 'answers'],
num_rows: 87599
})
validation: Dataset({
features: ['id', 'title', 'context', 'question', 'answers'],
num_rows: 10570
})
})
dataset['train'][0]
{'id': '5733be284776f41900661182',
'title': 'University_of_Notre_Dame',
'context': 'Architecturally, the school has a Catholic character. Atop the Main Building\'s gold dome is a golden statue of the Virgin Mary. Immediately in front of the Main Building and facing it, is a copper statue of Christ with arms upraised with the legend "Venite Ad Me Omnes". Next to the Main Building is the Basilica of the Sacred Heart. Immediately behind the basilica is the Grotto, a Marian place of prayer and reflection. It is a replica of the grotto at Lourdes, France where the Virgin Mary reputedly appeared to Saint Bernadette Soubirous in 1858. At the end of the main drive (and in a direct line that connects through 3 statues and the Gold Dome), is a simple, modern stone statue of Mary.',
'question': 'To whom did the Virgin Mary allegedly appear in 1858 in Lourdes France?',
'answers': {'text': ['Saint Bernadette Soubirous'], 'answer_start': [515]}}
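The answers field stores character offsets into context. As a small sanity check (not part of the original notebook), slicing the context with answer_start recovers the answer text:
sample = dataset['train'][0]
start = sample['answers']['answer_start'][0]        # 515
text = sample['answers']['text'][0]                 # 'Saint Bernadette Soubirous'
print(sample['context'][start:start + len(text)])   # prints the answer text again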
3. Sample the data and build training features
dataset['train'] = dataset['train'].shuffle().select(range(10000))
dataset['validation'] = dataset['validation'].shuffle().select(range(200))
def prepare_train_features(examples):
    # Some questions have leading whitespace, which is useless and can make truncation
    # of the context fail (the tokenized question would take up extra space), so strip it.
    examples["question"] = [q.lstrip() for q in examples["question"]]

    # Tokenize with truncation and padding, but keep the overflowing tokens using a stride.
    # When a context is long, one example can therefore produce several features, each with
    # a context that overlaps a little with the previous one.
    tokenized_examples = tokenizer(
        examples['question'],
        examples['context'],
        truncation='only_second',
        max_length=384,
        stride=128,
        return_overflowing_tokens=True,
        return_offsets_mapping=True,
        padding='max_length')

    # Because one example with a long context may yield several features, we need a mapping
    # from each feature back to its originating example. This key gives us exactly that.
    sample_mapping = tokenized_examples.pop("overflow_to_sample_mapping")

    # The offset mapping maps each token to its character span in the original context;
    # it is what lets us compute start_positions and end_positions.
    offset_mapping = tokenized_examples.pop("offset_mapping")

    tokenized_examples["start_positions"] = []
    tokenized_examples["end_positions"] = []

    for i, offsets in enumerate(offset_mapping):
        # Impossible answers are labelled with the index of the CLS token.
        input_ids = tokenized_examples["input_ids"][i]
        cls_index = input_ids.index(tokenizer.cls_token_id)

        # Sequence ids tell us which tokens belong to the question (0) and which to the context (1).
        sequence_ids = tokenized_examples.sequence_ids(i)

        # One example can give several spans; this is the index of the example containing this span.
        sample_index = sample_mapping[i]
        answers = examples["answers"][sample_index]

        # If no answer is given, use the CLS index as the answer position.
        if len(answers["answer_start"]) == 0:
            tokenized_examples["start_positions"].append(cls_index)
            tokenized_examples["end_positions"].append(cls_index)
        else:
            # Start/end character indices of the answer in the context.
            start_char = answers["answer_start"][0]
            end_char = start_char + len(answers["text"][0])

            # Token index where the context starts in this feature.
            token_start_index = 0
            while sequence_ids[token_start_index] != 1:
                token_start_index += 1

            # Token index where the context ends in this feature.
            token_end_index = len(input_ids) - 1
            while sequence_ids[token_end_index] != 1:
                token_end_index -= 1

            # If the answer falls outside this span, label the feature with the CLS index.
            if not (offsets[token_start_index][0] <= start_char
                    and offsets[token_end_index][1] >= end_char):
                tokenized_examples["start_positions"].append(cls_index)
                tokenized_examples["end_positions"].append(cls_index)
            else:
                # Otherwise move token_start_index and token_end_index to the two ends of
                # the answer. Note: we could go one past the last offset if the answer is
                # the last word (edge case).
                while token_start_index < len(offsets) and offsets[
                        token_start_index][0] <= start_char:
                    token_start_index += 1
                tokenized_examples["start_positions"].append(token_start_index - 1)
                while offsets[token_end_index][1] >= end_char:
                    token_end_index -= 1
                tokenized_examples["end_positions"].append(token_end_index + 1)

    return tokenized_examples
examples = prepare_train_features(dataset['train'][:10])
print(examples)
{'input_ids': [[101, 2054, 2828, 1997, 5069, 2001, 14071, 2033, 2063, 1029, 102, 2004, 14165, 2211, 2000, 2709, 1999, 1996, 3865, 1010, 1996, 2103, 2001, 9860, 2011, 1996, 22894, 2033, 2063, 9288, 1999, 3172, 1010, 2043, 2410, 2111, 2020, 2730, 1999, 2019, 6206, 12219, 2252, 1999, 1996, 2248, 2212, 1010, 5862, 1005, 1055, 22321, 1012, 2927, 2007, 7513, 1005, ...
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], 'start_positions': [41, 177, 100, 188, 109, 56, 60, 61, 18, 98], 'end_positions': [42, 177, 101, 188, 110, 56, 61, 87, 19, 109]}
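Much of the work in prepare_train_features hinges on the offset mapping. A minimal standalone sketch (not from the original notebook) of what return_offsets_mapping produces for a short question/context pair: each token comes with its (start, end) character span, and special tokens get (0, 0).
enc = tokenizer('Who wrote it?', 'The book was written by Jane.',
                return_offsets_mapping=True)
tokens = tokenizer.convert_ids_to_tokens(enc['input_ids'])
for token, span in zip(tokens, enc['offset_mapping']):
    # question tokens index into the question string, context tokens into the context string
    print(token, span)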
4. Decode back to text for inspection
for i in range(len(examples['input_ids'])):
    input_ids = examples['input_ids'][i]
    start_positions = examples['start_positions'][i]
    end_positions = examples['end_positions'][i]

    print('question and context')
    question_and_context = tokenizer.decode(input_ids)
    print(question_and_context)

    print('answer')
    # end_positions is an inclusive token index, so this slice drops the last answer token
    # (which is why 'gambling' rather than 'gambling club' is printed below);
    # slice to end_positions + 1 to include it
    answer = tokenizer.decode(input_ids[start_positions:end_positions])
    print('start_positions: ', start_positions)
    print('end_positions: ', end_positions)
    print(answer)

    print('original answer')
    original_answer = dataset['train'][i]['answers']['text'][0]
    print(original_answer)
    print()
question and context
[CLS] what type of establishment was wan mee? [SEP] as prosperity began to return in the 1980s, the city was stunned by the wah mee massacre in 1983, when 13 people were killed in an illegal gambling club in the international district, seattle's chinatown. beginning with microsoft's 1979 move from albuquerque, new mexico to nearby bellevue, washington, seattle and its suburbs became home to a number of technology companies including amazon. com, realnetworks, nintendo of america, mccaw cellular ( now part of at & t mobility ), voicestream ( now t - mobile ), and biomedical corporations such as heartstream ( later purchased by philips ), heart technologies ( later purchased by boston scientific ), physio - control ( later purchased by medtronic ), zymogenetics, icos ( later purchased by eli lilly and company ) and immunex ( later purchased by amgen ). this success brought an influx of new residents with a population increase within city limits of almost 50, 000 between 1990 and 2000, and saw seattle's real estate become some of the most expensive in the country. in 1993, the movie sleepless in seattle brought the city further national attention. many of the seattle area's tech companies remained relatively strong, but the frenzied dot - com boom years ended in early 2001. [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
answer
start_positions: 41
end_positions: 42
gambling
original answer
gambling club
5. Encode the dataset
dataset = dataset.map(
    function=prepare_train_features,
    batched=True,
    remove_columns=['id', 'title', 'context', 'question', 'answers'])
print(dataset['train'][0])
{'input_ids': [101, 2054, 2828, 1997, 5069, 2001, 14071, 2033, 2063, 1029, 102, 2004, 14165, 2211, 2000, 2709, 1999, 1996, 3865, 1010, 1996, 2103, 2001, 9860, 2011, 1996, 22894, 2033, 2063, 9288, 1999, 3172, 1010, 2043, 2410, 2111, 2020, 2730, ...
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'start_positions': 41, 'end_positions': 42}
dataset
DatasetDict({
train: Dataset({
features: ['input_ids', 'attention_mask', 'start_positions', 'end_positions'],
num_rows: 10113
})
validation: Dataset({
features: ['input_ids', 'attention_mask', 'start_positions', 'end_positions'],
num_rows: 202
})
})
Note that the train split now has 10113 features rather than the 10000 sampled examples, because long contexts overflow into several overlapping features.
6. DataLoader
import torch
from transformers.data.data_collator import default_data_collator
loader = torch.utils.data.DataLoader(
    dataset=dataset['train'],
    batch_size=16,
    collate_fn=default_data_collator,
    shuffle=True,
    drop_last=True)

for data in loader:
    break
len(loader), data
(632,
{'input_ids': tensor([[ 101, 2073, 2003, ..., 0, 0, 0],
[ 101, 25479, 3605, ..., 0, 0, 0],
[ 101, 2054, 3609, ..., 0, 0, 0],
...,
[ 101, 2054, 2001, ..., 0, 0, 0],
[ 101, 2043, 2001, ..., 0, 0, 0],
[ 101, 2247, 2007, ..., 0, 0, 0]]),
'attention_mask': tensor([[1, 1, 1, ..., 0, 0, 0],
[1, 1, 1, ..., 0, 0, 0],
[1, 1, 1, ..., 0, 0, 0],
...,
[1, 1, 1, ..., 0, 0, 0],
[1, 1, 1, ..., 0, 0, 0],
[1, 1, 1, ..., 0, 0, 0]]),
'start_positions': tensor([ 22, 46, 137, 86, 13, 178, 47, 11, 88, 55, 56, 70, 88, 79,
31, 19]),
'end_positions': tensor([ 35, 46, 137, 92, 16, 180, 56, 14, 91, 60, 56, 72, 109, 83,
31, 19])})
7. Define the downstream model
from transformers import AutoModelForQuestionAnswering, DistilBertModel
class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # pretrained backbone
        self.pretrained = DistilBertModel.from_pretrained('distilbert-base-uncased')
        self.fc = torch.nn.Sequential(torch.nn.Dropout(0.1), torch.nn.Linear(768, 2))
        # reference QA model, used only to initialise the fully connected layer
        parameters = AutoModelForQuestionAnswering.from_pretrained('distilbert-base-uncased')
        self.fc[1].load_state_dict(parameters.qa_outputs.state_dict())

    def forward(self, input_ids, attention_mask, start_positions, end_positions):
        # [b, lens] -> embedding -> [b, lens, embed_size] -> pretrained -> [b, lens, 768]
        logits = self.pretrained(input_ids=input_ids, attention_mask=attention_mask)
        logits = logits.last_hidden_state
        # [b, lens, 768] -> fc -> [b, lens, 2]
        logits = self.fc(logits)
        # [b, lens, 2] -> [b, lens, 1], [b, lens, 1]
        start_logits, end_logits = logits.split(1, dim=2)
        # [b, lens, 1] -> [b, lens]
        start_logits = start_logits.squeeze(2)
        end_logits = end_logits.squeeze(2)
        # neither the start nor the end position may exceed the sequence length
        lens = start_logits.shape[1]
        start_positions = start_positions.clamp(0, lens)
        end_positions = end_positions.clamp(0, lens)
        criterion = torch.nn.CrossEntropyLoss(ignore_index=lens)
        start_loss = criterion(start_logits, start_positions)
        end_loss = criterion(end_logits, end_positions)
        # average the start and end losses
        loss = (start_loss + end_loss) / 2
        return {'loss': loss, 'start_logits': start_logits, 'end_logits': end_logits}
model = Model()
out = model(**data)
out['loss'], out['start_logits'].shape, out['end_logits'].shape
(tensor(5.9629, grad_fn=<DivBackward0>),
torch.Size([16, 384]),
torch.Size([16, 384]))
out['start_logits']
tensor([[ 0.0228, 0.1262, 0.1578, ..., 0.1435, 0.0201, 0.2220],
[ 0.2419, -0.1595, 0.1950, ..., 0.0106, 0.1423, 0.1011],
[ 0.1407, 0.2449, 0.4859, ..., 0.1164, 0.1793, 0.2311],
...,
[ 0.0916, 0.3904, 0.4410, ..., 0.0520, 0.1593, 0.1145],
[-0.0409, -0.0676, 0.0823, ..., 0.0825, 0.1228, 0.0065],
[ 0.1237, 0.0285, 0.0831, ..., 0.0390, 0.1089, -0.0170]],
grad_fn=<SqueezeBackward1>)
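For reference, this custom Model mirrors the library's own question-answering head. A minimal sketch (not part of the original notebook) runs the same batch through AutoModelForQuestionAnswering; since its qa_outputs layer is freshly initialised here, the loss value will differ:
hf_model = AutoModelForQuestionAnswering.from_pretrained('distilbert-base-uncased')
hf_out = hf_model(**data)
# same interface: a scalar loss plus start/end logits of shape [16, 384]
print(hf_out.loss, hf_out.start_logits.shape, hf_out.end_logits.shape)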
8. Test 1
def test(model):
    model.eval()
    loader_val = torch.utils.data.DataLoader(
        dataset=dataset['validation'],
        batch_size=32,
        collate_fn=default_data_collator,
        shuffle=True,
        drop_last=True)

    start_offset = 0
    end_offset = 0
    total = 0
    for i, data in enumerate(loader_val):
        with torch.no_grad():
            out = model(**data)

        # accumulate the absolute position offsets between predictions and labels
        start_offset += (out['start_logits'].argmax(dim=1) - data['start_positions']).abs().sum().item()
        end_offset += (out['end_logits'].argmax(dim=1) - data['end_positions']).abs().sum().item()
        total += 32

        if i % 10 == 0:
            print(i)
        if i == 50:
            break

    print(start_offset / total, end_offset / total)

    start_logits = out['start_logits'].argmax(dim=1)
    end_logits = out['end_logits'].argmax(dim=1)
    for i in range(4):
        input_ids = data['input_ids'][i]
        pred_answer = input_ids[start_logits[i]:end_logits[i]]
        label_answer = input_ids[data['start_positions'][i]:data['end_positions'][i]]
        print('input_ids=', tokenizer.decode(input_ids))
        print('pred_answer=', tokenizer.decode(pred_answer))
        print('label_answer=', tokenizer.decode(label_answer))
        print()
# test(model)
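The test function scores predictions by their average position offset. The standard SQuAD metrics are exact match and F1; as a standalone sketch (not part of this notebook's pipeline), they can be computed with the squad metric from the datasets library:
from datasets import load_metric
squad_metric = load_metric('squad')
predictions = [{'id': '1', 'prediction_text': 'Saint Bernadette Soubirous'}]
references = [{'id': '1',
               'answers': {'text': ['Saint Bernadette Soubirous'], 'answer_start': [515]}}]
# a perfect match scores exact_match = 100.0 and f1 = 100.0
print(squad_metric.compute(predictions=predictions, references=references))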
9. Training
from transformers import AdamW
from transformers.optimization import get_scheduler
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
device
device(type='cuda', index=0)
def train():
    optimizer = AdamW(model.parameters(), lr=2e-5)
    scheduler = get_scheduler(name='linear',
                              num_warmup_steps=0,
                              num_training_steps=len(loader),
                              optimizer=optimizer)
    model.to(device)
    model.train()
    for i, data in enumerate(loader):
        input_ids, attention_mask = data['input_ids'], data['attention_mask']
        start_positions, end_positions = data['start_positions'], data['end_positions']
        input_ids = input_ids.to(device)
        attention_mask = attention_mask.to(device)
        start_positions = start_positions.to(device)
        end_positions = end_positions.to(device)

        out = model(input_ids=input_ids,
                    attention_mask=attention_mask,
                    start_positions=start_positions,
                    end_positions=end_positions)

        loss = out['loss']
        loss.backward()
        torch.nn.utils.clip_grad_norm_(model.parameters(), 1.0)
        optimizer.step()
        scheduler.step()
        optimizer.zero_grad()
        model.zero_grad()

        if i % 50 == 0:
            lr = optimizer.state_dict()['param_groups'][0]['lr']
            # average absolute position offset within this batch of 16
            start_offset = (out['start_logits'].argmax(dim=1) - start_positions).abs().sum().item() / 16
            end_offset = (out['end_logits'].argmax(dim=1) - end_positions).abs().sum().item() / 16
            print(i, loss.item(), lr, start_offset, end_offset)
train()
0 5.906922340393066 1.996835443037975e-05 79.5625 83.6875
50 4.296964645385742 1.838607594936709e-05 51.1875 42.625
...
550 2.367091178894043 2.5632911392405064e-06 13.4375 17.875
600 2.4237546920776367 9.810126582278482e-07 20.9375 29.5625
10. Save the model
torch.save(model, '../data/阅读理解.model')
11. Load the model
model2 = torch.load('../data/阅读理解.model', map_location='cpu')
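torch.save(model, ...) pickles the entire Module, so loading it again requires the same Model class definition to be importable. A common alternative (a sketch, not from the original notebook, using a hypothetical file name) is to save only the weights and rebuild the architecture before loading:
torch.save(model.state_dict(), '../data/阅读理解.state_dict')  # weights only
model3 = Model()                                               # rebuild the architecture first
model3.load_state_dict(torch.load('../data/阅读理解.state_dict', map_location='cpu'))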
12. Test 2
test(model2)
0
13.229166666666666 11.234375
input_ids= [CLS] what nobel memorial prize in economic sciences winner is also a university alumni member? [SEP] in economics, notable nobel memorial prize in economic sciences winners milton friedman, a major advisor to republican u. s. president ronald reagan and conservative british prime minister margaret thatcher, george stigler, nobel laureate and proponent of regulatory capture theory, gary becker, an important contributor to the family economics branch of economics, herbert a. simon, responsible for the modern interpretation of the concept of organizational decision - making, paul samuelson, the first american to win the nobel memorial prize in economic sciences, and eugene fama, known for his work on portfolio theory, asset pricing and stock market behaviour, are all graduates. american economist, social theorist, political philosopher, and author thomas sowell is also an alumnus. [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
pred_answer= milton
label_answer= milton
input_ids= [CLS] where was media day for super bowl 50 held? [SEP] the game's media day, which was typically held on the tuesday afternoon prior to the game, was moved to the monday evening and re - branded as super bowl opening night. the event was held on february 1, 2016 at sap center in san jose. alongside the traditional media availabilities, the event featured an opening ceremony with player introductions on a replica of the golden gate bridge. [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
pred_answer= tuesday afternoon prior to the game, was moved to the monday evening and re - branded as super bowl opening night. the event was held on february 1, 2016 at sap center in san
label_answer= sap center in san jose
input_ids= [CLS] in what unit is the size of the input measured? [SEP] to measure the difficulty of solving a computational problem, one may wish to see how much time the best algorithm requires to solve the problem. however, the running time may, in general, depend on the instance. in particular, larger instances will require more time to solve. thus the time required to solve a problem ( or the space required, or any measure of complexity ) is calculated as a function of the size of the instance. this is usually taken to be the size of the input in bits. complexity theory is interested in how algorithms scale with an increase in the input size. for instance, in the problem of finding whether a graph is connected, how much more time does it take to solve a problem for a graph with 2n vertices compared to the time taken for a graph with n vertices? [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
pred_answer=
label_answer=
input_ids= [CLS] who is the producer of doctor who? [SEP] doctor who is a british science - fiction television programme produced by the bbc since 1963. the programme depicts the adventures of the doctor, a time lord — a space and time - travelling humanoid alien. he explores the universe in his tardis, a sentient time - travelling space ship. its exterior appears as a blue british police box, which was a common sight in britain in 1963 when the series first aired. accompanied by companions, the doctor combats a variety of foes, while working to save civilisations and help people in need. [SEP] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD] [PAD]
pred_answer=
label_answer=