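from transformers import GenerationConfig

generation_config = GenerationConfig(
    # Higher temperature: more diverse but less predictable output.
    # Lower temperature: more deterministic but more monotonous output.
    temperature=temperature,
    top_p=top_p,
    # At each step keep only the K most probable tokens, then sample one of them
    # according to their renormalized probabilities. Only takes effect when do_sample=True.
    top_k=top_k,
    num_beams=num_beams,
    **kwargs,
)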
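generation_output = model.generate(
    **inputs,
    generation_config=generation_config,
    return_dict_in_generate=True,
    output_scores=True,
    max_new_tokens=max_new_tokens,
    # No sampling: greedy decoding (always take the most probable token) when num_beams=1,
    # beam search when num_beams>1. temperature, top_p, and top_k are ignored in this mode.
    do_sample=False,
)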
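As a rough illustration of what these settings control, the sketch below reimplements temperature scaling, top-k filtering, and greedy selection in plain Python on a made-up four-token vocabulary. The helper names (`softmax`, `top_k_filter`) and the logits are assumptions for illustration only; this is not how transformers implements them internally.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    """Keep the k most probable entries, zero out the rest, and renormalize."""
    keep = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in keep)
    return [probs[i] / total if i in keep else 0.0 for i in range(len(probs))]

logits = [2.0, 1.0, 0.5, -1.0]           # hypothetical scores for 4 tokens
sharp = softmax(logits, temperature=0.5)  # low temperature: mass concentrates on token 0
flat = softmax(logits, temperature=2.0)   # high temperature: mass spreads out
top2 = top_k_filter(softmax(logits), k=2) # only tokens 0 and 1 stay eligible
greedy_token = max(range(len(logits)), key=lambda i: logits[i])  # do_sample=False, num_beams=1
```

With sampling enabled, the next token would be drawn from a filtered distribution such as `top2`; with `do_sample=False` and a single beam, only `greedy_token` is ever chosen.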
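Meaning of the following model.generate parameters:
max_new_tokens=40 # maximum number of newly generated tokens (the prompt is not counted)
no_repeat_ngram_size=2 # no 2-gram may appear twice in the output
early_stopping=True # beam search only: stop as soon as num_beams finished candidates exist
num_return_sequences # number of sequences to return; with beam search, num_return_sequences <= num_beams
set_seed(42) # fix the random seed so that sampling results are reproducible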
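The effect of no_repeat_ngram_size can be pictured as a ban list that is recomputed at every step. The following is a minimal sketch under that assumption; `banned_next_tokens` is a hypothetical helper written for these notes, not a transformers function.

```python
def banned_next_tokens(tokens, ngram_size):
    """Return the tokens that would complete an n-gram already present in `tokens`."""
    if len(tokens) < ngram_size:
        return set()
    # The last n-1 generated tokens form the prefix of the next n-gram.
    prefix = tuple(tokens[len(tokens) - ngram_size + 1:])
    banned = set()
    for start in range(len(tokens) - ngram_size + 1):
        ngram = tokens[start:start + ngram_size]
        if tuple(ngram[:-1]) == prefix:
            banned.add(ngram[-1])  # emitting this token would repeat `ngram`
    return banned

generated = [5, 7, 9, 5]  # hypothetical token ids produced so far
banned = banned_next_tokens(generated, ngram_size=2)
```

Here the sequence ends in 5 and the 2-gram (5, 7) has already occurred, so 7 is the only token excluded at the next step.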
Top-p (nucleus) sampling picks the smallest set of tokens whose cumulative probability exceeds p (for example p=0.92, i.e. 92% of the probability mass) and samples only from that set.
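A minimal sketch of that selection rule, assuming the probabilities are already normalized. The function name `top_p_filter` and the example distribution are made up for illustration, and the exact boundary handling in transformers may differ.

```python
def top_p_filter(probs, p):
    """Return indices of the smallest set of tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, cumulative = [], 0.0
    for i in order:
        nucleus.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break  # enough probability mass collected
    return nucleus

probs = [0.5, 0.3, 0.15, 0.05]          # hypothetical next-token distribution
nucleus = top_p_filter(probs, p=0.92)   # tokens 0, 1, 2 cover 95% of the mass
```

Unlike top-k, the size of the candidate set adapts to the distribution: a peaked distribution yields a small nucleus, a flat one yields a large nucleus.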
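Compute the transition scores (the score of each generated token at its step):
outputs = model.generate(**inputs, max_new_tokens=5, return_dict_in_generate=True, output_scores=True)
# normalize_logits=True applies a log-softmax per step, so each score is a log-probability
transition_scores = model.compute_transition_scores(
    outputs.sequences, outputs.scores, normalize_logits=True
)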
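For greedy decoding, the idea behind these scores can be sketched in plain Python: normalize each step's scores with a log-softmax and read off the entry of the token that was actually chosen. The function names and numbers below are assumptions for illustration; the real method operates on tensors and also handles beam search.

```python
import math

def log_softmax(scores):
    """Normalize raw scores into log-probabilities."""
    peak = max(scores)
    log_total = peak + math.log(sum(math.exp(s - peak) for s in scores))
    return [s - log_total for s in scores]

def transition_scores_sketch(step_scores, chosen_tokens):
    """Log-probability of the chosen token at each generation step."""
    return [log_softmax(scores)[tok] for scores, tok in zip(step_scores, chosen_tokens)]

step_scores = [[2.0, 0.0, -1.0], [0.5, 1.5, 0.0]]  # hypothetical per-step logits, vocabulary of 3
chosen = [0, 1]                                    # hypothetical generated token ids
scores = transition_scores_sketch(step_scores, chosen)
token_probs = [math.exp(s) for s in scores]        # per-token probabilities
sequence_log_prob = sum(scores)                    # log-probability of the whole continuation
```

Exponentiating a score gives the probability of that token, and summing the scores gives the log-probability of the generated continuation.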