1. Error message: cannot import name 'notf' from 'tensorboard.compat'
(dl_base) [root@localhost WiNGPT2]# python test.py
[2023-10-08 02:18:35,071] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
Traceback (most recent call last):
File "/home/software/miniconda3/envs/dl_base/lib/python3.9/site-packages/tensorboard/compat/__init__.py", line 42, in tf
from tensorboard.compat import notf # noqa: F401
ImportError: cannot import name 'notf' from 'tensorboard.compat' (/home/software/miniconda3/envs/dl_base/lib/python3.9/site-packages/tensorboard/compat/__init__.py)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/software/miniconda3/envs/dl_base/lib/python3.9/site-packages/transformers/utils/import_utils.py", line 1184, in _get_module
return importlib.import_module("." + module_name, self.__name__)
File "/home/software/miniconda3/envs/dl_base/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
...
Solution: reinstall protobuf pinned to version 3.20.0:
pip uninstall protobuf
pip install protobuf==3.20.0 -i https://pypi.tuna.tsinghua.edu.cn/simple
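To check that the pin actually took effect in the active environment (here `dl_base`), the installed version can be read back from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is made up for illustration and is not part of any package mentioned above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # 3.20.0 is the version pinned by the pip command above.
    found = installed_version("protobuf")
    print(f"protobuf: {found or 'not installed'}")
    if found != "3.20.0":
        print("protobuf is not at 3.20.0; the pin may not have applied to this environment.")
```

Run it with the same interpreter that runs test.py, since a different conda environment may report a different version.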
2. Error message:
Exception: data did not match any variant of untagged enum ModelWrapper at line 1250992 column 3
Solution: upgrade transformers:
pip install -U transformers
3. A GPU shows 100% GPU-Util even though no process appears to be using it.
Solution: list the processes that hold the NVIDIA device files:
fuser -v /dev/nvidia*
The output is as follows:
Kill the processes listed above:
kill -9 4075 29223 58864
Confirm the result:
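One possible way to do this check is to read per-GPU utilization from `nvidia-smi` and see whether the affected card has dropped back from 100%. The sketch below assumes `nvidia-smi` is on PATH; the function names `parse_gpu_rows` and `_to_int` are hypothetical helpers written for this note.

```python
import shutil
import subprocess
from typing import List, Optional, Tuple

# Query index, GPU utilization (%) and used memory (MiB) as bare CSV values.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,utilization.gpu,memory.used",
    "--format=csv,noheader,nounits",
]


def _to_int(field: str) -> Optional[int]:
    """Convert a CSV field to int; return None for values such as "[N/A]"."""
    field = field.strip()
    return int(field) if field.isdigit() else None


def parse_gpu_rows(csv_text: str) -> List[Tuple[Optional[int], ...]]:
    """Parse nvidia-smi CSV output into (index, utilization, memory_used) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        if line.strip():
            rows.append(tuple(_to_int(f) for f in line.split(",")))
    return rows


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
        for index, util, mem in parse_gpu_rows(out):
            print(f"GPU {index}: util={util}% memory.used={mem} MiB")
```

If the card still reports 100% after the kill, rerunning `fuser -v /dev/nvidia*` shows whether any process is still holding the device.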