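Using third-party libraries natively supported by Ascend, and only publicly available resources, to quickly try out the recently popular o1-style reasoning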

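Introduction

This is less a deep technical post than an introductory walkthrough.
Many people have asked recently: with GPU resources getting harder and harder to obtain, are there Ascend resources they can use?
Below is a relatively simple path to resources that can run a model.

The fastest way to bring up a model is to use an already integrated third-party library. This article uses the Huggingface-Transformers suite, which Ascend supports natively.

The third-party libraries natively adapted for Ascend are listed in the community documentation. If you need more libraries adapted, raise the request in the community forum; feedback is welcome.

OpenAI-o1 is the hot topic right now, so let's use these resources to bring up an o1-style math model and try its reasoning.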

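Getting the resources

Resource | Provider | How to obtain
Device: NPU | OpenI (启智) community | A points-based community. Registration grants 50 points, and the machine used in this article costs 4 points per hour, so that is plenty. Access it through the OpenI community website.
Requirements: torch=2.1, torch_npu=2.1 | Ascend community | torch_npu is available from the community documentation.
Third-party library: Transformers=4.43.2 | HuggingFace community | Included in the list of Ascend-adapted third-party libraries in the community documentation. Further adaptation requests can be raised in the community forum.
Model | Skywork (天工), uploaded to the HF community | Open-source o1-style models have appeared in a burst over the past week. Skywork released Skywork-o1-Open-Llama-3.1-8B, which can be downloaded directly from the HF community.
Download tool | HF-Mirror | Downloads from HF are hampered in mainland China. Set the environment variable as described by HF-Mirror to speed them up.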

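Quick start

1. Create an NPU environment

For how to use the OpenI community, refer to the community documentation.

OpenI community --> Personal Center --> Cloud Brain Tasks --> New Cloud Brain Task --> Debug Task --> Ascend NPU

For the resource specification, choose NPU: 1*Ascend-D910 (device memory: 32GB), CPU: 20, memory: 60GB

For the image, choose torch-npu-cann8-debug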

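2. Install dependencies

The torch shipped in the image is version 2.1.0, and the Transformers version paired with it is 4.41.2.

However, the parameter validation of Skywork-o1-Open-Llama-3.1-8B, the model we want to run, requires Transformers 4.43.2 or later, so an upgrade is needed.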

pip install transformers==4.43.2
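
Before or after upgrading, it can help to confirm which Transformers version the environment actually has. The following is a minimal sketch, not part of the original walkthrough: `parse_version` and `meets_minimum` are hypothetical helper names, and the comparison only looks at the first three numeric components of the version string.

```python
import re
from importlib import metadata


def parse_version(text):
    """Turn a string such as '4.43.2' into (4, 43, 2), using the first three numbers."""
    return tuple(int(number) for number in re.findall(r"\d+", text)[:3])


def meets_minimum(installed, required="4.43.2"):
    """Return True when the installed version is at least the required one."""
    return parse_version(installed) >= parse_version(required)


try:
    installed = metadata.version("transformers")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("transformers is not installed")
elif meets_minimum(installed):
    print(f"transformers {installed} is new enough")
else:
    print(f"transformers {installed} is too old, upgrade to 4.43.2 or later")
```

With the image's stock 4.41.2 this reports that an upgrade is needed; after the `pip install` above it should report that the version is new enough.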

To speed up the model download, set the following environment variable

export HF_ENDPOINT=https://hf-mirror.com
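
If exporting the variable in the shell is inconvenient (for example in a notebook), the same setting can be made from Python. This is a sketch under one assumption: the assignment has to run before `transformers` or `huggingface_hub` is imported, since the endpoint is read when those packages are loaded.

```python
import os

# Point Hugging Face downloads at the mirror used in this article.
# Run this before importing transformers / huggingface_hub.
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"

print(os.environ["HF_ENDPOINT"])
```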

3. Launch script

Import the dependencies

import torch
import torch_npu
from transformers import AutoModelForCausalLM, AutoTokenizer
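
Before loading an 8B model, a quick check that PyTorch can actually see the NPU saves time. The sketch below is an optional addition; `describe_npu` is a hypothetical helper name, and it assumes that importing `torch_npu` registers the `torch.npu` namespace, as it does in the torch_npu releases paired with torch 2.1.

```python
def describe_npu():
    """Report whether an Ascend NPU is visible to PyTorch."""
    try:
        import torch
        import torch_npu  # noqa: F401  (importing it registers the NPU backend)
    except ImportError as exc:
        return f"torch / torch_npu could not be imported: {exc}"
    if not torch.npu.is_available():
        return "torch_npu imported, but no NPU device is available"
    return f"{torch.npu.device_count()} NPU device(s) available"


print(describe_npu())
```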

Set up the prompt template

The system prompt casts the model as Skywork-o1, a thinking model developed by Skywork AI, and asks it to think at length before giving a detailed explanation of its solution.

The example problem is a short arithmetic word problem about how many apples Jane and her two siblings each end up with.

system_prompt = """You are Skywork-o1, a thinking model developed by Skywork AI, specializing in solving complex problems involving mathematics, coding, and logical reasoning through deep thought. When faced with a user's request, you first engage in a lengthy and in-depth thinking process to explore possible solutions to the problem. After completing your thoughts, you then provide a detailed explanation of the solution process in your response."""

# An Example Case
problem = "Jane has 12 apples. She gives 4 apples to her friend Mark, then buys 1 more apple, and finally splits all her apples equally among herself and her 2 siblings. How many apples does each person get?"

user_message = problem

conversation = [
    {
        "role": "system",
        "content": system_prompt
    },
    {
        "role": "user", 
        "content": user_message
    }
]

Set the model to "Skywork/Skywork-o1-Open-Llama-3.1-8B"

model_name = "Skywork/Skywork-o1-Open-Llama-3.1-8B"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

tokenizer = AutoTokenizer.from_pretrained(model_name)

input_ids = tokenizer.apply_chat_template(
    conversation, 
    tokenize=True, 
    add_generation_prompt=True,
    return_tensors="pt").to(model.device)

Run inference

generation = model.generate(
    input_ids=input_ids,
    max_new_tokens=2048,
    do_sample=False,
    pad_token_id=128009,
    temperature=0)

completion = tokenizer.decode(
    generation[0][len(input_ids[0]):], 
    skip_special_tokens=True, 
    clean_up_tokenization_spaces=True)

print(completion)

4. Output log
Loading checkpoint shards: 100%|██████████| 4/4 [00:03<00:00,  1.01it/s]
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:567: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
  warnings.warn(
/home/ma-user/anaconda3/envs/MindSpore/lib/python3.9/site-packages/transformers/generation/configuration_utils.py:572: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.9` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
  warnings.warn(
The attention mask is not set and cannot be inferred from input because pad token is same as eos token.As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
[W VariableFallbackKernel.cpp:51] Warning: CAUTION: The operator 'aten::isin.Tensor_Tensor_out' is not currently supported on the NPU backend and will fall back to run on the CPU. This may have performance implications. (function npu_cpu_fallback)
[W NeKernelNpu.cpp:28] Warning: The oprator of ne is executed, Currently High Accuracy but Low Performance OP with 64-bit has been used, Please Do Some Cast at Python Functions with 32-bit for Better Performance! (function operator())
[W AddKernelNpu.cpp:82] Warning: The oprator of add is executed, Currently High Accuracy but Low Performance OP with 64-bit has been used, Please Do Some Cast at Python Functions with 32-bit for Better Performance! (function operator())
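
The three Transformers warnings in this log come from the generation call itself: `temperature=0` is passed together with `do_sample=False`, the model's default `top_p` is still set, and no attention mask is supplied. The sketch below shows one way the call could be arranged to avoid them. It is an assumption-laden variant, not the script that produced the log: `greedy_generate` is a hypothetical helper name, and passing `temperature=None` and `top_p=None` assumes this Transformers version treats `None` as "unset". The torch_npu warnings about CPU fallback and 64-bit operators are unaffected.

```python
def greedy_generate(model, tokenizer, conversation, max_new_tokens=2048):
    """Greedy decoding with an explicit attention mask and no sampling flags."""
    # return_dict=True returns both input_ids and attention_mask.
    inputs = tokenizer.apply_chat_template(
        conversation,
        tokenize=True,
        add_generation_prompt=True,
        return_tensors="pt",
        return_dict=True,
    ).to(model.device)

    output_ids = model.generate(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],
        max_new_tokens=max_new_tokens,
        do_sample=False,
        temperature=None,  # assumption: None unsets the sampling-only flag
        top_p=None,        # assumption: same for the model's default top_p
        pad_token_id=128009,
    )

    # Drop the prompt tokens and decode only the newly generated part.
    prompt_length = len(inputs["input_ids"][0])
    return tokenizer.decode(output_ids[0][prompt_length:], skip_special_tokens=True)
```

It would be called as `print(greedy_generate(model, tokenizer, conversation))` once the model and tokenizer from the script are loaded.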
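
5. Inference result

The output shows the complete chain of thought: the model works through the three operations in the problem (give away, buy, split) one after another.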

To solve the problem, let's break it down into a series of logical steps:
1. **Initial Number of Apples**: Jane starts with 12 apples.
2. **Apples Given Away**: Jane gives 4 apples to her friend Mark. So, the number of apples she has now is:
   \[
   12 - 4 = 8
   \]
3. **Apples Bought**: Jane then buys 1 more apple. So, the number of apples she has now is:
   \[
   8 + 1 = 9
   \]
4. **Apples Split Equally**: Jane splits all her apples equally among herself and her 2 siblings. This means the apples are divided among 3 people. So, the number of apples each person gets is:
   \[
   \frac{9}{3} = 3
   \]

Therefore, each person gets \(\boxed{3}\) apples.
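
The model's answer is easy to check by hand. The few lines below simply replay the arithmetic from the problem statement; the variable names are illustrative only.

```python
apples = 12          # Jane starts with 12 apples
apples -= 4          # gives 4 to Mark
apples += 1          # buys 1 more
people = 1 + 2       # Jane plus her 2 siblings

per_person, leftover = divmod(apples, people)
print(per_person, leftover)  # prints: 3 0
```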

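Excluding account registration and compilation time, this quick-start exercise takes about ten minutes. If you are interested, follow along and try it yourself.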
