The warning message is as follows:
Warning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
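The warning is caused by data (tensor layout) transformations.
This article looks at the 'Grad strides do not match bucket view strides' warning encountered during distributed training and notes that it is usually caused by data transformations. It gives two examples: eg1 shows that when a reshape follows transpose or permute, you should call .contiguous() on the result; eg2 stresses that the output of a rearrange operation likewise needs .contiguous() appended to avoid the warning.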
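The full examples from the article are not reproduced here, so the following is only a minimal sketch of the eg1 pattern, assuming a 3-D input tensor. The helper name merge_last_dims and the tensor shapes are hypothetical and chosen for illustration. It shows that transpose returns a view with permuted strides, and that .contiguous() copies the data back into the standard row-major layout before the reshape.

```python
import torch


def merge_last_dims(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: swap dims 1 and 2, then flatten them into one."""
    y = x.transpose(1, 2)   # a view: same storage, permuted strides
    y = y.contiguous()      # copy into the standard row-major layout
    return y.reshape(x.shape[0], -1)


x = torch.randn(2, 3, 4)
view = x.transpose(1, 2)

print(x.stride())                  # (12, 4, 1)
print(view.stride())               # (12, 1, 4)
print(view.is_contiguous())        # False
print(view.contiguous().stride())  # (12, 3, 1)

out = merge_last_dims(x)
print(out.shape, out.is_contiguous())
```

The same idea applies to permute, and to eg2: the tensor returned by a rearrange call can be followed by .contiguous() in the same way. Whether this removes the warning in a given model depends on where the non-contiguous tensor enters the gradient computation, so treat it as a first thing to try rather than a guaranteed fix.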



