1. Method 1
ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable RANK expected, but not set
Source:
>>> import torch
>>> import torch.distributed as dist
>>> # Initialize the distributed environment; the backend needs to be set to 'nccl'
>>> dist.init_process_group(backend='nccl')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/data/lib/python3.10/site-packages/torch/distributed/c10d_logger.py", line 81, in wrapper
return func(*args, **kwargs)
File "/data/lib/python3.10/site-packages/torch/distributed/c10d_logger.py", line 95, in wrapper
func_return = func(*args, **kwargs)
File "/data/lib/python3.10/site-packages/torch/distributed/distributed_c10d.py", line