GPUStack on Ascend Atlas 300I Duo: Deploying the DeepSeek-R1 Model [GPUStack in Practice, Part 2]


On April 25, 2025, GPUStack released v0.6, which bundles the MindIE inference engine for the Ascend 910B (1-4) and 310P3 chips, adding support for the 310P series. That caught my interest, so I immediately set out to try it.
Official documentation: https://docs.gpustack.ai/latest/installation/ascend-cann/online-installation/
Models currently supported by GPUStack's Ascend MindIE inference engine: https://www.hiascend.com/document/detail/zh/mindie/100/whatismindie/mindie_what_0003.html


Deploying GPUStack

You can refer to my earlier post: Kunpeng + Ascend: deploying the cluster-management software GPUStack, building a two-node cluster across two servers [detailed hands-on pitfalls edition].

Create and start the container:

docker run -d --name gpustack \
    --restart=unless-stopped \
    --device /dev/davinci0 \
    --device /dev/davinci1 \
    --device /dev/davinci_manager \
    --device /dev/devmm_svm \
    --device /dev/hisi_hdc \
    -v /usr/local/dcmi:/usr/local/dcmi \
    -v /usr/local/bin/npu-smi:/usr/local/bin/npu-smi \
    -v /usr/local/Ascend/driver/lib64/:/usr/local/Ascend/driver/lib64/ \
    -v /usr/local/Ascend/driver/version.info:/usr/local/Ascend/driver/version.info \
    -v /etc/ascend_install.info:/etc/ascend_install.info \
    --network=host \
    --ipc=host \
    -v gpustack-data:/var/lib/gpustack \
    gpustack/gpustack:latest-npu-310p
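Once the container is up, the web UI listens on the host's port 80 (a consequence of `--network=host`). Per the GPUStack docs, the default username is `admin` and the initial password can be read out of the container; I also like to confirm the container actually sees the NPUs:

```shell
# Initial admin password is generated at first startup (username: admin)
docker exec -it gpustack cat /var/lib/gpustack/initial_admin_password

# Optional sanity check: the container should list the 310P devices
docker exec -it gpustack npu-smi info
```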

Deploying DeepSeek-R1

(1) After logging in, go to Models, search for "deepseek", select the deepseek-ai/DeepSeek-R1-Distill-Qwen-7B model, choose Ascend MindIE as the backend, then click Save.


After the download finished, the model failed at startup. Some digging showed that the currently bundled Ascend MindIE is version 1.0.0, which has not yet been adapted for DeepSeek-R1!


The error log is below:

2025-04-27 06:42:05,038 [ERROR] model.py:39 - [Model]	>>> Exception:call aclnnInplaceZero failed, detail:EZ9999: Inner Error!
EZ9999: [PID: 43453] 2025-04-27-06:42:05.032.406 Parse dynamic kernel config fail.
        TraceBack (most recent call last):
       AclOpKernelInit failed opType
       ZerosLike ADD_TO_LAUNCHER_LIST_AICORE failed.

[ERROR] 2025-04-27-06:42:05 (PID:43453, Device:0, RankID:-1) ERR01100 OPS call acl api failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/dist-packages/model_wrapper/model.py", line 37, in initialize
    return self.python_model.initialize(config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/model_wrapper/standard_model.py", line 146, in initialize
    self.generator = Generator(
                     ^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/text_generator/generator.py", line 119, in __init__
    self.warm_up(max_prefill_tokens, max_seq_len, max_input_len, max_iter_times, inference_mode)
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/text_generator/generator.py", line 303, in warm_up
    raise e
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/text_generator/generator.py", line 296, in warm_up
    self._generate_inputs_warm_up_backend(input_metadata, inference_mode, dummy=True)
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/text_generator/generator.py", line 378, in _generate_inputs_warm_up_backend
    self.generator_backend.warm_up(model_inputs, inference_mode=inference_mode)
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/text_generator/adapter/generator_torch.py", line 198, in warm_up
    super().warm_up(model_inputs)
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/text_generator/adapter/generator_backend.py", line 170, in warm_up
    _ = self.forward(model_inputs, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/utils/decorators/time_decorator.py", line 38, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/text_generator/adapter/generator_torch.py", line 153, in forward
    logits = self.model_wrapper.forward(model_inputs, self.cache_pool.npu_cache, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/modeling/model_wrapper/atb/atb_model_wrapper.py", line 89, in forward
    logits = self.forward_tensor(
             ^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/dist-packages/mindie_llm/modeling/model_wrapper/atb/atb_model_wrapper.py", line 116, in forward_tensor
    logits = self.model_runner.forward(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Ascend/atb-models/atb_llm/runner/model_runner.py", line 193, in forward
    return self.model.forward(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Ascend/atb-models/atb_llm/models/base/flash_causal_lm.py", line 452, in forward
    self.init_ascend_weight()
  File "/usr/local/Ascend/atb-models/atb_llm/models/qwen2/flash_causal_qwen2.py", line 150, in init_ascend_weight
    weight_wrapper = self.get_weights()
                     ^^^^^^^^^^^^^^^^^^
  File "/usr/local/Ascend/atb-models/atb_llm/models/qwen2/flash_causal_qwen2.py", line 132, in get_weights
    weight_wrapper = WeightWrapper(self.soc_info, self.tp_rank, attn_wrapper, mlp_wrapper)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/Ascend/atb-models/atb_llm/utils/data/weight_wrapper.py", line 49, in __init__
    self.placeholder = torch.zeros(1, dtype=torch.float16, device="npu")
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: call aclnnInplaceZero failed, detail:EZ9999: Inner Error!
EZ9999: [PID: 43453] 2025-04-27-06:42:05.032.406 Parse dynamic kernel config fail.
        TraceBack (most recent call last):
       AclOpKernelInit failed opType
       ZerosLike ADD_TO_LAUNCHER_LIST_AICORE failed.

[ERROR] 2025-04-27-06:42:05 (PID:43453, Device:0, RankID:-1) ERR01100 OPS call acl api failed
2025-04-27 06:42:05,042 [ERROR] model.py:42 - [Model]	>>> return initialize error result: {'status': 'error', 'npuBlockNum': '0', 'cpuBlockNum': '0'}
[2025-04-27 06:42:05.146668+00:00] [43225] [43226] [server] [WARN] [llm_daemon.cpp:64] : [Daemon] received exit signal[17]
[2025-04-27 06:42:05.146771+00:00] [43225] [43226] [server] [INFO] [llm_daemon.cpp:69] : Daemon wait pid with 43453, status 9
[2025-04-27 06:42:05.146776+00:00] [43225] [43226] [server] [ERROR] [llm_daemon.cpp:74] : ERR: Daemon wait pid with 43453 exit, Please check the service log or python log.
[ERROR] TBE(43892,python3):2025-04-27-06:42:05.253.179 [../../../../../../latest/python/site-packages/tbe/common/repository_manager/utils/repository_manager_log.py:30][log] [../../../../../../latest/python/site-packages/tbe/common/repository_manager/route.py:65][repository_manager] Subprocess[task_distribute] raise error[]
[ERROR] TBE(43893,python3):2025-04-27-06:42:05.253.179 [../../../../../../latest/python/site-packages/tbe/common/repository_manager/utils/repository_manager_log.py:30][log] [../../../../../../latest/python/site-packages/tbe/common/repository_manager/route.py:65][repository_manager] Subprocess[task_distribute] raise error[]
[ERROR] TBE(43891,python3):2025-04-27-06:42:05.253.179 [../../../../../../latest/python/site-packages/tbe/common/repository_manager/utils/repository_manager_log.py:30][log] [../../../../../../latest/python/site-packages/tbe/common/repository_manager/route.py:65][repository_manager] Subprocess[task_distribute] raise error[]
[ERROR] TBE(43890,python3):2025-04-27-06:42:05.253.179 [../../../../../../latest/python/site-packages/tbe/common/repository_manager/utils/repository_manager_log.py:30][log] [../../../../../../latest/python/site-packages/tbe/common/repository_manager/route.py:65][repository_manager] Subprocess[task_distribute] raise error[]
[ERROR] TBE(43888,python3):2025-04-27-06:42:05.253.207 [../../../../../../latest/python/site-packages/tbe/common/repository_manager/utils/repository_manager_log.py:30][log] [../../../../../../latest/python/site-packages/tbe/common/repository_manager/route.py:65][repository_manager] Subprocess[task_distribute] raise error[]
[ERROR] TBE(43887,python3):2025-04-27-06:42:05.253.222 [../../../../../../latest/python/site-packages/tbe/common/repository_manager/utils/repository_manager_log.py:30][log] [../../../../../../latest/python/site-packages/tbe/common/repository_manager/route.py:65][repository_manager] Subprocess[task_distribute] raise error[]
[ERROR] TBE(43889,python3):2025-04-27-06:42:05.253.262 [../../../../../../latest/python/site-packages/tbe/common/repository_manager/utils/repository_manager_log.py:30][log] [../../../../../../latest/python/site-packages/tbe/common/repository_manager/route.py:65][repository_manager] Subprocess[task_distribute] raise error[]
[ERROR] TBE(43886,python3):2025-04-27-06:42:05.253.290 [../../../../../../latest/python/site-packages/tbe/common/repository_manager/utils/repository_manager_log.py:30][log] [../../../../../../latest/python/site-packages/tbe/common/repository_manager/route.py:65][repository_manager] Subprocess[task_distribute] raise error[]
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
[ERROR] TBE Subprocess[task_distribute] raise error[], main process disappeared!
/usr/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 30 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '
Daemon is killing...
[2025-04-27 06:42:10.147021][43225][localhost.localdomain][system][stop][endpoint][success]
[2025-04-27 06:42:10.147044][43225][localhost.localdomain][system][stop][mindie server][success]

Deploying a Qwen2.5 Model

The Qwen2.5 series appears to be on the supported list, so I gave it a try.

Still no luck, and honestly I was at my wits' end: the error log was exactly the same as above.


Then I found the problem: the model weight precision has to be changed as well. BF16 is not supported; the weights must be converted to FP16.
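Besides recasting the weight shards themselves to FP16 (e.g. by reloading the model with `torch_dtype=torch.float16` and re-saving it), the model's `config.json` must advertise `float16`, since that is the dtype field MindIE reads. A minimal stdlib-only sketch of the config edit, with hypothetical paths:

```python
import json
from pathlib import Path

def force_fp16(config_path: str) -> dict:
    """Rewrite a Hugging Face config.json so torch_dtype is float16.

    This only fixes the declared dtype; the safetensors shards still need
    to be recast separately (for example via transformers'
    from_pretrained(..., torch_dtype=torch.float16) + save_pretrained()).
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["torch_dtype"] = "float16"  # MindIE on the 310P rejects bfloat16
    path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return config
```

Point it at the downloaded model directory, e.g. `force_fp16("/var/lib/gpustack/cache/.../config.json")` (path hypothetical), then redeploy the model.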
It ran successfully!


Testing Chat

Small-parameter models almost all show this kind of garbled output, so I switched back to testing larger ones: Qwen2.5-7B-Instruct responds normally.

DeepSeek-R1-Distill-Qwen-7B's answers are a bit scrambled; the larger variants should be fine.
DeepSeek-R1-Distill-Qwen-14B tests normally.
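Once a model shows as running, GPUStack serves it through an OpenAI-compatible endpoint, so the chat test in the UI can also be scripted. A stdlib-only sketch; the server address, API key, and model name below are placeholders for my setup (a real key is generated under API Keys in the GPUStack UI):

```python
import json
import urllib.request

def build_chat_request(server: str, api_key: str,
                       model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style /v1/chat/completions request for GPUStack."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 256,
    }
    return urllib.request.Request(
        f"{server}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

if __name__ == "__main__":
    # Placeholder server/key/model names; requires a running GPUStack server.
    req = build_chat_request("http://localhost", "gpustack_xxx",
                             "deepseek-r1-distill-qwen-14b",
                             "What is the capital of China?")
    with urllib.request.urlopen(req) as resp:
        print(json.load(resp)["choices"][0]["message"]["content"])
```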
