Issues when running Janus-Pro locally

Problem 1: Installing the GPU build of torch

Installing torch directly with pip is too slow, so download the torch wheel matching your CUDA version from the Alibaba Cloud mirror (阿里云镜像站), then install the downloaded package locally:

pip install C:\Users\xxx\Downloads\torch-2.2.2+cu118-cp310-cp310-win_amd64.whl
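
After installing the wheel, a quick check (a minimal sketch of my own, not from the original post) confirms the GPU build is actually in use:

import torch

# Should print the CUDA build tag (e.g. "2.2.2+cu118") and True;
# False means the CPU-only wheel is still installed.
print(torch.__version__)
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the detected GPU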

Problem 2: RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'

The version pinned in the official requirements:

torch==2.0.1

Replace it with:

torch==2.2.2

Then download and install the matching GPU torch wheel as in Problem 1, and the error goes away.
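
The error comes from torch.triu being called on a BFloat16 CUDA tensor while the causal attention mask is built, and torch 2.0.1 has no BFloat16 CUDA kernel for triu/tril. A minimal check (my own sketch, not code from the Janus repo) to confirm the upgraded build handles it:

import torch

# On torch 2.0.1 this should raise:
#   RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'
# With torch 2.2.2 it returns the upper-triangular mask as expected.
mask = torch.ones(4, 4, dtype=torch.bfloat16, device="cuda")
print(torch.triu(mask, diagonal=1))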

Running example

import torch
from transformers import AutoModelForCausalLM
from janus.models import MultiModalityCausalLM, VLChatProcessor
from janus.utils.io import load_pil_images


if __name__ == '__main__':
    # path to the downloaded Janus-Pro-1B model weights
    model_path = "../Janus-Pro-1B"
    # load the VLChatProcessor (handles prompt formatting and image preprocessing)
    vl_chat_processor: VLChatProcessor = VLChatProcessor.from_pretrained(model_path)
    # get the tokenizer from the processor
    tokenizer = vl_chat_processor.tokenizer
    # load the multimodal language model (vl_gpt)
    vl_gpt: MultiModalityCausalLM = AutoModelForCausalLM.from_pretrained(
        model_path, trust_remote_code=True
    )
    # cast to bfloat16, move to GPU, switch to eval mode
    vl_gpt = vl_gpt.to(torch.bfloat16).cuda().eval()
    image = "./pic.png"
    # question = "explain this meme"
    question = "这张图片有什么?"
    conversation = [
        {
            "role": "<|User|>",
            "content": f"<image_placeholder>\n{question}",
            "images": [image],
        },
        {"role": "<|Assistant|>", "content": ""},
    ]
    pil_images = load_pil_images(conversation)
    prepare_inputs = vl_chat_processor(
        conversations=conversation, images=pil_images, force_batchify=True
    ).to(vl_gpt.device)
    # run the image encoder to get the image embeddings
    inputs_embeds = vl_gpt.prepare_inputs_embeds(**prepare_inputs)
    print(inputs_embeds)
    # run the language model to generate the response
    outputs = vl_gpt.language_model.generate(
        inputs_embeds=inputs_embeds,
        attention_mask=prepare_inputs.attention_mask,
        pad_token_id=tokenizer.eos_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id,
        max_new_tokens=512,
        do_sample=False,
        use_cache=True,
    )
    answer = tokenizer.decode(outputs[0].cpu().tolist(), skip_special_tokens=True)
    print(f"{prepare_inputs['sft_format'][0]}", answer)

