I am running the training/pipeline_parallelism example, and my launch.json is configured as follows:
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "pp",
            "type": "python",
            "request": "launch",
            "program": "/home/user/anaconda3/envs/deepspeed/bin/deepspeed",
            "cwd": "/home/user/DeepSpeedExamples/training/pipeline_parallelism",
            "console": "integratedTerminal",
            "justMyCode": false,
            "args": [
                "--master_port", "12346",
                "train.py",
                "--deepspeed_config", "ds_config.json",
                "--p", "2",
                "--steps", "60"
            ]
        }
    ]
}
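Earlier, whenever I tried to debug, I kept getting "no module named train.py". Eventually, out of frustration, I installed the latest version of VS Code, and after that it worked. Thank goodness. If you hit the same error, it may be worth trying this to see whether it fixes it for you too.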
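If you want to check whether the problem is on the VS Code side or in the environment itself, it may help to run the same thing from a plain terminal first. Here is a small sketch that turns the config above into the command I would expect it to correspond to. This assumes the debugger simply runs `program` with `args` from inside `cwd`; the helper name `to_shell_command` is made up for illustration and is not part of VS Code or DeepSpeed.

```python
import shlex

# Values copied from the launch.json in this post.
launch = {
    "program": "/home/user/anaconda3/envs/deepspeed/bin/deepspeed",
    "cwd": "/home/user/DeepSpeedExamples/training/pipeline_parallelism",
    "args": [
        "--master_port", "12346",
        "train.py",
        "--deepspeed_config", "ds_config.json",
        "--p", "2",
        "--steps", "60",
    ],
}


def to_shell_command(config):
    """Hypothetical helper: build a shell command from a launch config.

    Assumption: the launch is equivalent to changing into `cwd` and
    running `program` followed by `args`.
    """
    # shlex.join quotes each argument so the result is safe to paste into a shell.
    run = shlex.join([config["program"], *config["args"]])
    return f"cd {shlex.quote(config['cwd'])} && {run}"


print(to_shell_command(launch))
```

If the printed command runs fine in a terminal but the debugger still reports the error, that would at least suggest the issue is in the editor or its Python extension rather than in the DeepSpeed install.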