Tongyi Qianwen (Qwen) QwQ-32B inference (transformers framework)

Local environment: Linux

python 3.10.16

torch 2.4.0

cuda 12.2

transformers 4.43.2
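Before loading a 32B model it can save time to confirm that the installed packages match the versions listed above. The following is a minimal sketch using only the standard library; the helper names `installed_version` and `compare_versions` are made up for illustration, and the `EXPECTED` table simply repeats the versions from this post.

```python
from importlib import metadata

# Versions listed in the environment section of this post.
EXPECTED = {"torch": "2.4.0", "transformers": "4.43.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected):
    """Map each package name to (expected, installed, matches)."""
    report = {}
    for name, wanted in expected.items():
        found = installed_version(name)
        # torch wheels may carry a local suffix such as "+cu121";
        # compare only the part before the "+".
        base = found.split("+")[0] if found else None
        report[name] = (wanted, found, base == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, found, ok) in compare_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, installed {found}, match={ok}")
```

A mismatch does not necessarily mean inference will fail; it only flags that the setup differs from the one used here.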

Model page: QwQ-32B collection details, from Qwen · ModelScope community

Model download

modelscope download --model Qwen/QwQ-32B --local_dir ./QwQ-32B
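The download is large, so an interrupted transfer can leave the directory incomplete. Below is a rough sketch of a sanity check for `./QwQ-32B`. The file names in `REQUIRED` and the `weight_map` layout of `model.safetensors.index.json` are assumptions based on the usual sharded safetensors layout, not something stated in this post, and `missing_files` is a hypothetical helper.

```python
import json
from pathlib import Path

# Assumed file names for a sharded safetensors checkpoint (not taken from this post).
REQUIRED = ("config.json", "tokenizer_config.json", "model.safetensors.index.json")


def missing_files(model_dir):
    """Return the expected files that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED if not (root / name).is_file()]

    # If the shard index exists, also check every shard it refers to.
    index_path = root / "model.safetensors.index.json"
    if index_path.is_file():
        index = json.loads(index_path.read_text(encoding="utf-8"))
        shards = sorted(set(index.get("weight_map", {}).values()))
        missing += [shard for shard in shards if not (root / shard).is_file()]
    return missing


if __name__ == "__main__":
    gaps = missing_files("./QwQ-32B")
    print("download looks complete" if not gaps else f"missing: {gaps}")
```

If anything is reported missing, rerunning the download command above is the simplest thing to try.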

Model inference

from transformers import AutoModelForCausalLM, AutoTokenizer


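def Qwen_QwQ():
    # Local directory created by the download step (relative to the working directory)
    weight_path = 'QwQ-32B'

    # torch_dtype="auto" uses the dtype stored in the checkpoint config;
    # device_map="balanced_low_0" spreads the weights across the GPUs
    # while keeping GPU 0 lightly loaded for generation
    model = AutoModelForCausalLM.from_pretrained(
        weight_path,
        torch_dtype="auto",
        device_map="balanced_low_0"
    )

    tokenizer = AutoTokenizer.from_pretrained(weight_path)
    prompt = "What day is it today?"
    messages = [
        {"role": "user", "content": prompt}
    ]
    # Render the conversation with the model's chat template and
    # append the assistant prefix so the model starts its reply
    text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True
    )

    model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

    # Generous token budget: the model writes out its reasoning before the final answer
    generated_ids = model.generate(
        **model_inputs,
        max_new_tokens=32768
    )
    # generate() returns prompt + completion; drop the prompt tokens
    # so that only the newly generated tokens are decoded
    generated_ids = [
        output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
    ]

    response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]

    print(response)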


if __name__ == '__main__':
    Qwen_QwQ()
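Two post-processing steps from the script can be tried without loading the model. The sketch below uses plain Python lists and strings: `strip_prompt` mirrors the slicing that removes the prompt tokens from the output of `generate`, and `split_reasoning` separates the reasoning from the final answer. The second helper assumes that the decoded response contains a closing `</think>` tag between the reasoning and the answer. That is how QwQ-32B output usually looks, but it is an assumption here rather than something this post shows. Both function names are hypothetical.

```python
def strip_prompt(input_ids, output_ids):
    """Drop the echoed prompt tokens, keeping only the newly generated ones."""
    return output_ids[len(input_ids):]


def split_reasoning(response, end_tag="</think>"):
    """Split a decoded response into (reasoning, answer).

    Assumes the reasoning ends with end_tag; if the tag is absent,
    the whole response is treated as the answer.
    """
    head, sep, tail = response.partition(end_tag)
    if not sep:
        return "", response.strip()
    # The opening tag may or may not be present in the decoded text.
    reasoning = head.replace("<think>", "", 1).strip()
    return reasoning, tail.strip()


if __name__ == "__main__":
    # Toy token ids standing in for model_inputs.input_ids[0] and generated_ids[0].
    print(strip_prompt([101, 102, 103], [101, 102, 103, 7, 8, 9]))

    demo = "I cannot see a calendar.\n</think>\n\nI do not have access to today's date."
    reasoning, answer = split_reasoning(demo)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

In the script above, `split_reasoning(response)` could be called just before `print(response)` when only the final answer is of interest.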
