FutureWarning: The module torch.distributed.launch is deprecated and will be removed in future.

This article shows how to deal with this PyTorch warning on a server: replace torch.distributed.launch with the newer torchrun tool, which involves upgrading PyTorch, updating the launch command, and resolving the FutureWarning.


When running code on a server you sometimes get a string of warnings. They are not errors, but they are unpleasant to look at, so let's fix this one.
To resolve it, use torchrun in place of torch.distributed.launch. torchrun is the newer launcher for running PyTorch training scripts in distributed environments, and it enables the behavior of the old --use_env flag by default: each worker receives its local rank through the LOCAL_RANK environment variable rather than through a --local-rank command-line argument.
1. First, make sure your PyTorch version is recent enough (torchrun ships with PyTorch 1.10 and later). You can upgrade PyTorch with the command below, but keep in mind that the upgraded torch version must still match what your code expects, so do not upgrade blindly.

pip install --upgrade torch
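
After upgrading, you can confirm which version was actually installed with:

python -c "import torch; print(torch.__version__)"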

2. Next, replace torch.distributed.launch with torchrun in the launch command. For example, if the original command was:

python -m torch.distributed.launch --nproc_per_node=2 train.py

change it to:

python -m torch.distributed.run --use-env --nproc_per_node=2 train.py

If this command fails with an error saying that --use-env is not recognized (screenshot of the error omitted; newer PyTorch versions removed the flag because its behavior is now always on), drop --use-env and use the following command instead:

python -m torch.distributed.run --nproc_per_node=2 train.py
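
Equivalently, PyTorch 1.10+ installs a torchrun console script that wraps torch.distributed.run, so the same launch can be written as:

torchrun --nproc_per_node=2 train.py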

3. Save the change and run the modified command.
With these steps, the FutureWarning is resolved and torchrun replaces torch.distributed.launch.
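
One migration detail to watch for: scripts written for torch.distributed.launch often parse a --local-rank (or --local_rank) argument, whereas torchrun delivers the rank through environment variables. Below is a minimal sketch of a torchrun-compatible entry point; the tiny Linear model and the training-loop placeholder are illustrative assumptions, not part of the original post.

import os

import torch
import torch.distributed as dist

def main():
    # torchrun exports LOCAL_RANK, RANK and WORLD_SIZE as environment
    # variables instead of passing --local-rank on the command line.
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # With torchrun, init_process_group picks up the rendezvous info
    # (MASTER_ADDR, MASTER_PORT, RANK, WORLD_SIZE) from the environment.
    dist.init_process_group(backend="nccl")

    model = torch.nn.Linear(10, 10).to(local_rank)  # placeholder model
    model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[local_rank])

    # ... training loop goes here ...

    # A clean shutdown avoids the "process group has NOT been destroyed"
    # warning that newer PyTorch versions print on exit.
    dist.destroy_process_group()

if __name__ == "__main__":
    main()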

For a more involved example, here is the output of a seed-vc train_v2.py run launched through accelerate (excerpt):

/home/ustc/anaconda3/lib/python3.12/site-packages/transformers/utils/hub.py:105: FutureWarning: Using `TRANSFORMERS_CACHE` is deprecated and will be removed in v5 of Transformers. Use `HF_HOME` instead.
The following values were not passed to `accelerate launch` and had defaults used instead:
    More than one GPU was found, enabling multi-GPU training. If this was unintended please pass in `--num_processes=1`.
    `--num_machines` was set to a value of `1`
    `--mixed_precision` was set to a value of `'no'`
    `--dynamo_backend` was set to a value of `'no'`
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.
/home/ustc/anaconda3/lib/python3.12/site-packages/pydub/utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
/home/ustc/anaconda3/lib/python3.12/site-packages/torch/nn/utils/weight_norm.py:134: FutureWarning: `torch.nn.utils.weight_norm` is deprecated in favor of `torch.nn.utils.parametrizations.weight_norm`.

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 92, in _call_target
[rank0]:     return _target_(*args, **kwargs)
[rank0]:   File "/home/ustc/桌面/seed-vc-main-v2/modules/astral_quantization/default_model.py", line 22, in __init__
[rank0]:     self.tokenizer = WhisperProcessor.from_pretrained(
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/transformers/processing_utils.py", line 1079, in from_pretrained
[rank0]:     args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/transformers/processing_utils.py", line 1143, in _get_arguments_from_pretrained
[rank0]:     args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/transformers/feature_extraction_utils.py", line 384, in from_pretrained
[rank0]:     feature_extractor_dict, kwargs = cls.get_feature_extractor_dict(pretrained_model_name_or_path, **kwargs)
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/transformers/feature_extraction_utils.py", line 510, in get_feature_extractor_dict
[rank0]:     resolved_feature_extractor_file = cached_file(
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/transformers/utils/hub.py", line 266, in cached_file
[rank0]:     file = cached_files(path_or_repo_id=path_or_repo_id, filenames=[filename], **kwargs)
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/transformers/utils/hub.py", line 381, in cached_files
[rank0]:     raise OSError(
[rank0]: OSError: ./checkpoints/hf_cache does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/./checkpoints/hf_cache/tree/main' for available files.

[rank0]: The above exception was the direct cause of the following exception:

[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/ustc/桌面/seed-vc-main-v2/train_v2.py", line 345, in <module>
[rank0]:     main(args)
[rank0]:   File "/home/ustc/桌面/seed-vc-main-v2/train_v2.py", line 314, in main
[rank0]:     trainer = Trainer(
[rank0]:   File "/home/ustc/桌面/seed-vc-main-v2/train_v2.py", line 75, in __init__
[rank0]:     self._init_models(train_cfm=train_cfm, train_ar=train_ar)
[rank0]:   File "/home/ustc/桌面/seed-vc-main-v2/train_v2.py", line 106, in _init_models
[rank0]:     self._init_main_model(train_cfm=train_cfm, train_ar=train_ar)
[rank0]:   File "/home/ustc/桌面/seed-vc-main-v2/train_v2.py", line 116, in _init_main_model
[rank0]:     self.model = hydra.utils.instantiate(cfg).to(self.device)
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 226, in instantiate
[rank0]:     return instantiate_node(
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 342, in instantiate_node
[rank0]:     value = instantiate_node(
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 347, in instantiate_node
[rank0]:     return _call_target(_target_, partial, args, kwargs, full_key)
[rank0]:   File "/home/ustc/anaconda3/lib/python3.12/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 97, in _call_target
[rank0]:     raise InstantiationException(msg) from e
[rank0]: hydra.errors.InstantiationException: Error in call to target 'modules.astral_quantization.default_model.AstralQuantizer':
[rank0]: OSError("./checkpoints/hf_cache does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/./checkpoints/hf_cache/tree/main' for available files.")
[rank0]: full_key: content_extractor_narrow

[rank0]:[W514 21:22:23.005946238 ProcessGroupNCCL.cpp:1168] Warning: WARNING: process group has NOT been destroyed before we destruct ProcessGroupNCCL. On normal program exit, the application should call destroy_process_group to ensure that any pending NCCL operations have finished in this process. In rare cases this process can exit before this point and block the progress of another member of the process group. This constraint has always been present, but this warning has only been added since PyTorch 2.4 (function operator())

W0514 21:22:24.285000 132883023812096 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 3125405 closing signal SIGTERM
W0514 21:22:24.285000 132883023812096 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 3125406 closing signal SIGTERM
W0514 21:22:24.285000 132883023812096 torch/distributed/elastic/multiprocessing/api.py:858] Sending process 3125407 closing signal SIGTERM
E0514 21:22:24.500000 132883023812096 torch/distributed/elastic/multiprocessing/api.py:833] failed (exitcode: 1) local_rank: 0 (pid: 3125404) of binary: /home/ustc/anaconda3/bin/python

Traceback (most recent call last):
  File "/home/ustc/anaconda3/bin/accelerate", line 8, in <module>
    sys.exit(main())
  File "/home/ustc/anaconda3/lib/python3.12/site-packages/accelerate/commands/accelerate_cli.py", line 50, in main
    args.func(args)
  File "/home/ustc/anaconda3/lib/python3.12/site-packages/accelerate/commands/launch.py", line 1204, in launch_command
    multi_gpu_launcher(args)
  File "/home/ustc/anaconda3/lib/python3.12/site-packages/accelerate/commands/launch.py", line 825, in multi_gpu_launcher
    distrib_run.run(args)
  File "/home/ustc/anaconda3/lib/python3.12/site-packages/torch/distributed/run.py", line 892, in run
    elastic_launch(
  File "/home/ustc/anaconda3/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 133, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/home/ustc/anaconda3/lib/python3.12/site-packages/torch/distributed/launcher/api.py", line 264, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
train_v2.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2025-05-14_21:22:24
  host      : ustc-SYS-740GP-TNRT
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 3125404)
  error_file: <N/A>
  traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================

Cause analysis: the launcher output at the end (the SIGTERMs and the ChildFailedError) is only cleanup noise; the root cause is in the first traceback. WhisperProcessor.from_pretrained was given ./checkpoints/hf_cache, a local directory that does not contain preprocessor_config.json, so transformers could not load the Whisper feature extractor, and hydra's instantiation of AstralQuantizer (full_key: content_extractor_narrow) failed. The fix is to place the expected processor files in that directory, or point the config at a valid model directory or hub repo id.
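
A minimal sketch of one way to pre-populate the directory, assuming the config expects an OpenAI Whisper checkpoint (the repo id openai/whisper-small is an illustrative guess, not taken from the log; use whatever checkpoint the seed-vc config actually names):

from huggingface_hub import snapshot_download

# Download the full model repo (including preprocessor_config.json)
# into the directory the training config points at.
# NOTE: "openai/whisper-small" is an assumed repo id for illustration.
snapshot_download(
    repo_id="openai/whisper-small",
    local_dir="./checkpoints/hf_cache",
)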