[Deployment] Hand-rolling a rerank model API service that dify can use
The local knowledge base holds confidential data, so we cannot send documents to an internet-hosted model for reranking. However, deploying a rerank model with vLLM has steep hardware requirements, and Ollama does not support rerank models at all. The remaining option is to write a small rerank API service ourselves, following the OpenAI-style interface conventions that dify can consume.
1. Install the required Python environment
$ pip install uv -i https://pypi.tuna.tsinghua.edu.cn/simple
$ uv pip install torch -i https://pypi.tuna.tsinghua.edu.cn/simple
The download is fairly large, so this step takes a while.
$ uv pip install modelscope -i https://pypi.tuna.tsinghua.edu.cn/simple
$ uv pip install "modelscope[multi-modal]" -i https://pypi.tuna.tsinghua.edu.cn/simple
$ uv pip install fastapi uvicorn -i https://pypi.tuna.tsinghua.edu.cn/simple
2. Verify that the basic environment works
$ uv run python -c "from modelscope.pipelines import pipeline; print(pipeline('word-segmentation')('今天天气不错,适合出去游玩'))"
3. Test relevance scoring with the example from the model card
# test_modelscope.py
import torch
from modelscope import AutoModelForSequenceClassification, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-large')
model = AutoModelForSequenceClassification.from_pretrained('BAAI/bge-reranker-large')
model.eval()
pairs = [['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]
with torch.no_grad():
    # Tokenize each (query, passage) pair and score it; higher logits mean
    # higher relevance. The relevant passage should clearly outscore 'hi'.
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1).float()
    print(scores)
