Postgraduate Interview – Programming Challenge
nanoGPT is a lightweight framework for decoder-only autoregressive models. Understanding both decoder-only models and this 
repository will be very helpful for this postgraduate position. 
Objective:
Train a model on the Tiny Shakespeare dataset. Use the model to sample what Hamlet might have 
said, using the following prompt: “To be, or not to be, that is the”. Report the five most likely 
next words to follow this famous snippet.
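One way to turn the objective into something measurable is a Monte Carlo estimate: sample many short continuations of the prompt, take the first complete word of each, and report the relative frequencies. The sketch below assumes a hypothetical `sample_fn(prompt)` that returns the generated text after the prompt (for example, a thin wrapper around the trained model's generation routine); that name, the sample count, and the rule for skipping continuations that extend the word “the” are illustrative choices, not part of the assignment.

```python
import re
from collections import Counter
from typing import Callable, List, Optional, Tuple

PROMPT = "To be, or not to be, that is the"


def first_word(continuation: str) -> Optional[str]:
    """Return the first new word of a continuation, lowercased.

    Returns None when the continuation starts with a letter or apostrophe,
    because it then extends the prompt's last word ("the" -> "there")
    instead of starting a new one. This can happen with character-level
    tokenization.
    """
    if not continuation or continuation[0].isalpha() or continuation[0] == "'":
        return None
    match = re.search(r"[A-Za-z']+", continuation)
    return match.group(0).lower() if match else None


def top_next_words(
    sample_fn: Callable[[str], str],  # hypothetical: prompt -> generated text
    prompt: str,
    n_samples: int = 1000,
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Estimate the k most likely next words from repeated sampling."""
    counts: Counter = Counter()
    for _ in range(n_samples):
        word = first_word(sample_fn(prompt))
        if word is not None:
            counts[word] += 1
    total = sum(counts.values())
    if total == 0:
        return []
    # Probabilities are relative frequencies among samples that started a new word.
    return [(word, count / total) for word, count in counts.most_common(k)]
```

The estimates are conditional on the model starting a new word after “the”, and their precision depends on `n_samples`. With tiktoken GPT-2 tokenization a word may still span several tokens, so an exact alternative would sum the probabilities of all token sequences that spell each word; sampling sidesteps that bookkeeping at the cost of some noise.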
Guidelines:
• Code:
o Use as much of the existing code in the nanoGPT repository as you need – you do 
not need to re-program something that already exists.
o Be prepared to discuss the code structure and any modifications you have made.
• Tokenization:
o You may use either character-level tokenization or tiktoken GPT-2 (BPE) tokenization.
• Model Size:
o You may train whatever size model fits on your CPU. The evaluation will not be 
based on model complexity or performance.
• Training:
o Create training/validation loss curves.
• Sampling:
o Return the 5 words that your model predicts are most likely to follow the seed 
prompt and include their probabilities.
• Submission: 
o Submit your code (and figures) as a Jupyter Notebook (either standalone .ipynb or 
hosted on Google Colab) with any scripts you have modified from the nanoGPT 
repository.
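
For the training/validation loss curves requested under Training, one low-effort approach is to capture the training script's console output and parse the periodic evaluation lines. The sketch below assumes the log contains lines of the form `step N: train loss X, val loss Y`; verify this against the output of the training script in your checkout. The function names and the output file name are hypothetical.

```python
import re
from typing import Dict, List

# Assumed format of evaluation lines in the captured training log.
LOG_PATTERN = re.compile(r"step (\d+): train loss ([\d.]+), val loss ([\d.]+)")


def parse_loss_log(text: str) -> Dict[str, List[float]]:
    """Extract evaluation steps and train/val losses from captured log text."""
    history: Dict[str, List[float]] = {"step": [], "train": [], "val": []}
    for match in LOG_PATTERN.finditer(text):
        history["step"].append(int(match.group(1)))
        history["train"].append(float(match.group(2)))
        history["val"].append(float(match.group(3)))
    return history


def plot_loss_curves(history: Dict[str, List[float]], out_path: str = "loss_curves.png") -> None:
    """Plot train and validation loss against iteration and save the figure."""
    # Imported lazily so that parsing works even where matplotlib is not installed.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(history["step"], history["train"], label="train")
    ax.plot(history["step"], history["val"], label="validation")
    ax.set_xlabel("iteration")
    ax.set_ylabel("cross-entropy loss")
    ax.legend()
    fig.savefig(out_path, dpi=150)
```

In a notebook, the log text can come from a file the training output was redirected to, for example `plot_loss_curves(parse_loss_log(open("train.log").read()))`, where `train.log` is a hypothetical file name.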
