/home/wiseatc/.local/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
W0703 16:13:22.516433 3913223 torch/distributed/run.py:766]
W0703 16:13:22.516433 3913223 torch/distributed/run.py:766] *****************************************
W0703 16:13:22.516433 3913223 torch/distributed/run.py:766] Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
W0703 16:13:22.516433 3913223 torch/distributed/run.py:766] *****************************************
/home/wiseatc/.local/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
/home/wiseatc/.local/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
/home/wiseatc/.local/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
/home/wiseatc/.local/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
[rank0]: Traceback (most recent call last):
[rank0]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1863, in _get_module
[rank0]: return importlib.import_module("." + module_name, self.__name__)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
[rank0]: return _bootstrap._gcd_import(name[level:], package, level)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
[rank0]: File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
[rank0]: File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked
[rank0]: File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
[rank0]: File "<frozen importlib._bootstrap_external>", line 940, in exec_module
[rank0]: File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[rank0]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/llama/tokenization_llama_fast.py", line 29, in <module>
[rank0]: from .tokenization_llama import LlamaTokenizer
[rank0]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/llama/tokenization_llama.py", line 27, in <module>
[rank0]: import sentencepiece as spm
[rank0]: File "/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py", line 10, in <module>
[rank0]: from . import _sentencepiece
[rank0]: ImportError: cannot import name '_sentencepiece' from partially initialized module 'sentencepiece' (most likely due to a circular import) (/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py)
[rank0]: The above exception was the direct cause of the following exception:
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/model/loader.py", line 82, in load_tokenizer
[rank0]: tokenizer = AutoTokenizer.from_pretrained(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 912, in from_pretrained
[rank0]: tokenizer_class_from_name(config_tokenizer_class) is not None
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 611, in tokenizer_class_from_name
[rank0]: return getattr(module, class_name)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1851, in __getattr__
[rank0]: module = self._get_module(self._class_to_module[name])
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1865, in _get_module
[rank0]: raise RuntimeError(
[rank0]: RuntimeError: Failed to import transformers.models.llama.tokenization_llama_fast because of the following error (look up to see its traceback):
[rank0]: cannot import name '_sentencepiece' from partially initialized module 'sentencepiece' (most likely due to a circular import) (/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py)
[rank0]: The above exception was the direct cause of the following exception:
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in <module>
[rank0]: launch()
[rank0]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
[rank0]: run_exp()
[rank0]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/tuner.py", line 110, in run_exp
[rank0]: _training_function(config={"args": args, "callbacks": callbacks})
[rank0]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/tuner.py", line 72, in _training_function
[rank0]: run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
[rank0]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 48, in run_sft
[rank0]: tokenizer_module = load_tokenizer(model_args)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/model/loader.py", line 97, in load_tokenizer
[rank0]: raise OSError("Failed to load tokenizer.") from e
[rank0]: OSError: Failed to load tokenizer.
[rank3]: Traceback (most recent call last):
[rank3]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1863, in _get_module
[rank3]: return importlib.import_module("." + module_name, self.__name__)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
[rank3]: return _bootstrap._gcd_import(name[level:], package, level)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
[rank3]: File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
[rank3]: File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked
[rank3]: File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
[rank3]: File "<frozen importlib._bootstrap_external>", line 940, in exec_module
[rank3]: File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[rank3]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/llama/tokenization_llama_fast.py", line 29, in <module>
[rank3]: from .tokenization_llama import LlamaTokenizer
[rank3]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/llama/tokenization_llama.py", line 27, in <module>
[rank3]: import sentencepiece as spm
[rank3]: File "/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py", line 10, in <module>
[rank3]: from . import _sentencepiece
[rank3]: ImportError: cannot import name '_sentencepiece' from partially initialized module 'sentencepiece' (most likely due to a circular import) (/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py)
[rank3]: The above exception was the direct cause of the following exception:
[rank3]: Traceback (most recent call last):
[rank3]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/model/loader.py", line 82, in load_tokenizer
[rank3]: tokenizer = AutoTokenizer.from_pretrained(
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 912, in from_pretrained
[rank3]: tokenizer_class_from_name(config_tokenizer_class) is not None
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 611, in tokenizer_class_from_name
[rank3]: return getattr(module, class_name)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1851, in __getattr__
[rank3]: module = self._get_module(self._class_to_module[name])
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1865, in _get_module
[rank3]: raise RuntimeError(
[rank3]: RuntimeError: Failed to import transformers.models.llama.tokenization_llama_fast because of the following error (look up to see its traceback):
[rank3]: cannot import name '_sentencepiece' from partially initialized module 'sentencepiece' (most likely due to a circular import) (/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py)
[rank3]: The above exception was the direct cause of the following exception:
[rank3]: Traceback (most recent call last):
[rank3]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in <module>
[rank3]: launch()
[rank3]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
[rank3]: run_exp()
[rank3]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/tuner.py", line 110, in run_exp
[rank3]: _training_function(config={"args": args, "callbacks": callbacks})
[rank3]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/tuner.py", line 72, in _training_function
[rank3]: run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
[rank3]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 48, in run_sft
[rank3]: tokenizer_module = load_tokenizer(model_args)
[rank3]: ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank3]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/model/loader.py", line 97, in load_tokenizer
[rank3]: raise OSError("Failed to load tokenizer.") from e
[rank3]: OSError: Failed to load tokenizer.
[rank1]: Traceback (most recent call last):
[rank1]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1863, in _get_module
[rank1]: return importlib.import_module("." + module_name, self.__name__)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
[rank1]: return _bootstrap._gcd_import(name[level:], package, level)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
[rank1]: File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
[rank1]: File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked
[rank1]: File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
[rank1]: File "<frozen importlib._bootstrap_external>", line 940, in exec_module
[rank1]: File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[rank1]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/llama/tokenization_llama_fast.py", line 29, in <module>
[rank1]: from .tokenization_llama import LlamaTokenizer
[rank1]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/llama/tokenization_llama.py", line 27, in <module>
[rank1]: import sentencepiece as spm
[rank1]: File "/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py", line 10, in <module>
[rank1]: from . import _sentencepiece
[rank1]: ImportError: cannot import name '_sentencepiece' from partially initialized module 'sentencepiece' (most likely due to a circular import) (/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py)
[rank1]: The above exception was the direct cause of the following exception:
[rank1]: Traceback (most recent call last):
[rank1]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/model/loader.py", line 82, in load_tokenizer
[rank1]: tokenizer = AutoTokenizer.from_pretrained(
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 912, in from_pretrained
[rank1]: tokenizer_class_from_name(config_tokenizer_class) is not None
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 611, in tokenizer_class_from_name
[rank1]: return getattr(module, class_name)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1851, in __getattr__
[rank1]: module = self._get_module(self._class_to_module[name])
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1865, in _get_module
[rank1]: raise RuntimeError(
[rank1]: RuntimeError: Failed to import transformers.models.llama.tokenization_llama_fast because of the following error (look up to see its traceback):
[rank1]: cannot import name '_sentencepiece' from partially initialized module 'sentencepiece' (most likely due to a circular import) (/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py)
[rank1]: The above exception was the direct cause of the following exception:
[rank1]: Traceback (most recent call last):
[rank1]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in <module>
[rank1]: launch()
[rank1]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
[rank1]: run_exp()
[rank1]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/tuner.py", line 110, in run_exp
[rank1]: _training_function(config={"args": args, "callbacks": callbacks})
[rank1]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/tuner.py", line 72, in _training_function
[rank1]: run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
[rank1]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 48, in run_sft
[rank1]: tokenizer_module = load_tokenizer(model_args)
[rank1]: ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank1]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/model/loader.py", line 97, in load_tokenizer
[rank1]: raise OSError("Failed to load tokenizer.") from e
[rank1]: OSError: Failed to load tokenizer.
[rank2]: Traceback (most recent call last):
[rank2]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1863, in _get_module
[rank2]: return importlib.import_module("." + module_name, self.__name__)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/usr/lib/python3.11/importlib/__init__.py", line 126, in import_module
[rank2]: return _bootstrap._gcd_import(name[level:], package, level)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
[rank2]: File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
[rank2]: File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked
[rank2]: File "<frozen importlib._bootstrap>", line 690, in _load_unlocked
[rank2]: File "<frozen importlib._bootstrap_external>", line 940, in exec_module
[rank2]: File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
[rank2]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/llama/tokenization_llama_fast.py", line 29, in <module>
[rank2]: from .tokenization_llama import LlamaTokenizer
[rank2]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/llama/tokenization_llama.py", line 27, in <module>
[rank2]: import sentencepiece as spm
[rank2]: File "/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py", line 10, in <module>
[rank2]: from . import _sentencepiece
[rank2]: ImportError: cannot import name '_sentencepiece' from partially initialized module 'sentencepiece' (most likely due to a circular import) (/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py)
[rank2]: The above exception was the direct cause of the following exception:
[rank2]: Traceback (most recent call last):
[rank2]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/model/loader.py", line 82, in load_tokenizer
[rank2]: tokenizer = AutoTokenizer.from_pretrained(
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 912, in from_pretrained
[rank2]: tokenizer_class_from_name(config_tokenizer_class) is not None
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/usr/local/lib/python3.11/dist-packages/transformers/models/auto/tokenization_auto.py", line 611, in tokenizer_class_from_name
[rank2]: return getattr(module, class_name)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1851, in __getattr__
[rank2]: module = self._get_module(self._class_to_module[name])
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/usr/local/lib/python3.11/dist-packages/transformers/utils/import_utils.py", line 1865, in _get_module
[rank2]: raise RuntimeError(
[rank2]: RuntimeError: Failed to import transformers.models.llama.tokenization_llama_fast because of the following error (look up to see its traceback):
[rank2]: cannot import name '_sentencepiece' from partially initialized module 'sentencepiece' (most likely due to a circular import) (/usr/local/lib/python3.11/dist-packages/sentencepiece/__init__.py)
[rank2]: The above exception was the direct cause of the following exception:
[rank2]: Traceback (most recent call last):
[rank2]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py", line 23, in <module>
[rank2]: launch()
[rank2]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py", line 19, in launch
[rank2]: run_exp()
[rank2]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/tuner.py", line 110, in run_exp
[rank2]: _training_function(config={"args": args, "callbacks": callbacks})
[rank2]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/tuner.py", line 72, in _training_function
[rank2]: run_sft(model_args, data_args, training_args, finetuning_args, generating_args, callbacks)
[rank2]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/train/sft/workflow.py", line 48, in run_sft
[rank2]: tokenizer_module = load_tokenizer(model_args)
[rank2]: ^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank2]: File "/home/wiseatc/LLaMA-Factory/src/llamafactory/model/loader.py", line 97, in load_tokenizer
[rank2]: raise OSError("Failed to load tokenizer.") from e
[rank2]: OSError: Failed to load tokenizer.
[rank0]:[W703 16:13:30.861219244 ProcessGroupNCCL.cpp:1479] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see https://pytorch.org/docs/stable/distributed.html#shutdown (function operator())
W0703 16:13:31.449512 3913223 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 3913282 closing signal SIGTERM
W0703 16:13:31.450263 3913223 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 3913283 closing signal SIGTERM
W0703 16:13:31.450724 3913223 torch/distributed/elastic/multiprocessing/api.py:900] Sending process 3913284 closing signal SIGTERM
E0703 16:13:31.765744 3913223 torch/distributed/elastic/multiprocessing/api.py:874] failed (exitcode: 1) local_rank: 0 (pid: 3913281) of binary: /usr/bin/python3.11
Traceback (most recent call last):
File "/usr/local/bin/torchrun", line 8, in <module>
sys.exit(main())
^^^^^^
File "/usr/local/lib/python3.11/dist-packages/torch/distributed/elastic/multiprocessing/errors/__init__.py", line 355, in wrapper
return f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/torch/distributed/run.py", line 892, in main
run(args)
File "/usr/local/lib/python3.11/dist-packages/torch/distributed/run.py", line 883, in run
elastic_launch(
File "/usr/local/lib/python3.11/dist-packages/torch/distributed/launcher/api.py", line 139, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/dist-packages/torch/distributed/launcher/api.py", line 270, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2025-07-03_16:13:31
host : wiseatc-Super-Server
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 3913281)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
Traceback (most recent call last):
File "/home/wiseatc/.local/bin/llamafactory-cli", line 8, in <module>
sys.exit(main())
^^^^^^
File "/home/wiseatc/LLaMA-Factory/src/llamafactory/cli.py", line 130, in main
process = subprocess.run(
^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/subprocess.py", line 569, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['torchrun', '--nnodes', '1', '--node_rank', '0', '--nproc_per_node', '4', '--master_addr', '127.0.0.1', '--master_port', '38589', '/home/wiseatc/LLaMA-Factory/src/llamafactory/launcher.py', 'saves/DeepSeek-R1-1.5B-Distill/lora/train_2025-07-03-16-00-01/training_args.yaml']' returned non-zero exit status 1.
最新发布