returned non-zero exit status 1.

After one epoch, the following error is reported:
RuntimeError: Expected to have finished reduction in the prior iteration before
starting a new one. This error indicates that your module has parameters that were
not used in producing loss. You can enable unused parameter detection by (1)
passing the keyword argument `find_unused_parameters=True` to
`torch.nn.parallel.DistributedDataParallel`; (2) making sure all `forward` function
outputs participate in calculating loss. If you already have done the above two
steps, then the distributed data parallel module wasn't able to locate the output
tensors in the return value of your module's `forward` function.
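A minimal sketch of the fix the error message itself suggests: passing `find_unused_parameters=True` to `DistributedDataParallel`. The `Net` class below is a hypothetical module whose `unused` branch never contributes to the loss, which is exactly the situation that triggers this error; the single-process `gloo` setup is just for illustration.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Single-process process group (gloo, CPU) purely for demonstration.
os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
os.environ.setdefault("MASTER_PORT", "29500")
dist.init_process_group("gloo", rank=0, world_size=1)

class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.used = torch.nn.Linear(4, 4)
        # This layer's output never reaches the loss -- the kind of
        # "parameter not used in producing loss" the error describes.
        self.unused = torch.nn.Linear(4, 4)

    def forward(self, x):
        return self.used(x)

# Without find_unused_parameters=True, DDP would raise the
# "Expected to have finished reduction" error on a later iteration.
model = DDP(Net(), find_unused_parameters=True)

out = model(torch.randn(2, 4))
loss = out.sum()
loss.backward()  # succeeds despite the unused branch

dist.destroy_process_group()
```

The alternative, usually preferable for performance, is to make every `forward` output participate in the loss (or remove the dead branch), since `find_unused_parameters=True` adds an extra graph traversal each iteration.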