When I was training a Transformer on 12M+ source sentences and an equal number of target sentences (batch size 4096, platform: 4 × TITAN Xp
Tensor2Tensor GPU Memory Error During Training
