OpenAI's GPT-2 model has been in the spotlight recently. The paper "Language Models are Unsupervised Multitask Learners" is already out, but since the training process has not been open-sourced, this post only runs tests against the released 117M-parameter pretrained model.
1. Contributions of the paper
In this paper, we connect these two lines of work and continue the trend of more general methods of transfer. We demonstrate language models can perform down-stream tasks in a zero-shot setting – without any parameter or architecture modification. We demonstrate this approach shows potential by highlighting the ability of language models to perform a wide range of tasks in a zero-shot setting. We achieve promising, competitive, and state of the art results depending on the task.
The idea is to use a much larger amount of unsupervised training data to perform multitask learning, so that the model generalizes better. The experiments in the paper also confirm that the model works impressively well.
The model largely follows GPT-1, with two differences:
(1) the training dataset is far larger;
(2) in the second stage, diverse downstream tasks are performed without supervision, i.e. in a zero-shot setting (formalized just below).
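To make point (2) concrete, the paper frames standard language modeling and the zero-shot multitask setting roughly as follows (restated from the paper; the notation here is mine). Language modeling estimates

$$p(x) = \prod_{i=1}^{n} p(s_i \mid s_1, \ldots, s_{i-1})$$

A single-task system estimates $p(\text{output} \mid \text{input})$, whereas a general multitask system should instead model

$$p(\text{output} \mid \text{input}, \text{task})$$

with the task itself expressed in natural language as part of the input sequence, e.g. a translation example can be written as the sequence (translate to french, english text, french text). This is why downstream tasks can be attempted with no parameter or architecture modification.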
2. Testing the 117M model
Running the test script, the results are as follows (a minimal sketch of how such a run can be reproduced is included below):
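For reference, here is a minimal sketch of querying the 117M checkpoint for text generation. This is my own illustration using the Hugging Face transformers library, under which the 117M model is published as "gpt2"; the original test presumably used the sample-generation scripts in OpenAI's gpt-2 repository instead, and the prompt string below is just a placeholder.

```python
# Illustrative GPT-2 (117M) sampling sketch using the Hugging Face "transformers"
# library ("gpt2" is that library's name for the 117M checkpoint).
# This is a reproduction sketch, not the script used in the original test.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

# Placeholder prompt; replace with any conditioning text.
prompt = "The quick brown fox"
inputs = tokenizer(prompt, return_tensors="pt")

# Sample a continuation with top-k sampling (k=40 chosen here for illustration).
outputs = model.generate(
    **inputs,
    do_sample=True,
    top_k=40,
    max_length=100,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```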
Picking any one of the examples, you can see the automatically generated dialogue, and its readability is quite good.