1 System Environment
Hardware environment (Ascend/GPU/CPU): Ascend/GPU/CPU
MindSpore version: 2.1
Execution mode (PyNative/Graph): any
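2 Error Message
2.1 Problem Description
After slicing the model with the offline slicing method, loading the sliced model for training fails with the following error: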
Traceback (most recent call last):
  File "wizardcoder/run_wizardcoder.py", line 148, in <module>
    device_id=args.device_id)
  File "wizardcoder/run_wizardcoder.py", line 90, in main
    task.finetune(finetune_checkpoint=config.load_checkpoint, auto_trans_ckpt=config.auto_trans_ckpt, resume=resume)
  File "/home/wizardcoder/1_wizardcoder-mindformers-916/mindformers/trainer/trainer.py", line 522, in finetune
    is_full_config=True, **kwargs)
  File "/home/wizardcoder/1_wizardcoder-mindformers-916/mindformers/trainer/causal_language_modeling/causal_language_modeling.py", line 106, in train
    **kwargs)
  File "/home/wizardcoder/1_wizardcoder-mindformers-916/mindformers/trainer/base_trainer.py", line 616, in training_process
    transform_and_load_checkpoint(config, model, network, dataset)
  File "/home/wizardcoder/1_wizardcoder-mindformers-916/mindformers/trainer/utils.py", line 300, in transform_and_load_checkpoint
    build_model(config, model, dataset, do_eval=do_eval, do_predict=do_predict)
  File "/home/wizardcoder/1_wizardcoder-mindformers-916/mindformers/trainer/utils.py", line 317, in build_model
    raise ValueError("When distributed loads are sliced weights, sink mode must be set True.")
ValueError: When distributed loads are sliced weights, sink mode must be set True.
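3 Root Cause Analysis
Inspecting the configuration file shows that the profile feature was enabled for this run. Profiling and distributed loading of sliced weights are not currently supported together.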
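4 Solution
Do not enable profile and sliced-weight loading at the same time. For example, turn off profile in the configuration file when training from offline-sliced weights.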
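As a rough illustration, a pre-flight check along the following lines could catch the conflict before training starts. This is only a sketch under stated assumptions: the keys `profile` and `runner_config.sink_mode` are assumed to mirror the layout of the training configuration file, and `find_conflicts` and its `weights_are_sliced` flag are hypothetical helpers, not part of MindFormers or MindSpore.

```python
from typing import Any, List, Mapping


def find_conflicts(config: Mapping[str, Any], weights_are_sliced: bool) -> List[str]:
    """Return a list of problems for a run that loads sliced weights.

    Hypothetical helper. The key names `profile` and
    `runner_config.sink_mode` are assumptions about the config layout.
    """
    problems: List[str] = []
    if not weights_are_sliced:
        # Complete (unsliced) weights: nothing to check here.
        return problems

    # Root cause from this post: profile enabled together with sliced weights.
    if config.get("profile", False):
        problems.append("profile is enabled while loading sliced weights")

    # Condition named in the error message: sink mode must be True.
    # A missing key is treated as True here, which is an assumption.
    runner = config.get("runner_config") or {}
    if not runner.get("sink_mode", True):
        problems.append(
            "runner_config.sink_mode must be True when loading sliced weights"
        )
    return problems


if __name__ == "__main__":
    example = {"profile": True, "runner_config": {"sink_mode": True}}
    for problem in find_conflicts(example, weights_are_sliced=True):
        print("config problem:", problem)
```

The check keeps the two conditions separate, since the post identifies profile as the root cause while the error message itself only mentions sink mode.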