[Deployment] Hand-rolling a rerank model API service that dify can use
The local knowledge base holds confidential data, so we cannot send documents to an internet-hosted model for reranking. However, deploying a rerank model with vLLM has steep hardware requirements, and Ollama does not support rerank models at all. The remaining option is to write a small rerank API service ourselves, following the OpenAI-style interface conventions that dify can consume.
1. Install the required Python environment
$ pip install uv -i https://pypi.tuna.tsinghua.edu.cn/simple
$ uv pip install torch -i https://pypi.tuna.tsinghua.edu.cn/simple
The download is fairly large, so this step takes a while.
$ uv pip install modelscope -i https://pypi.tuna.tsinghua.edu.cn/simple
$ uv pip install "modelscope[multi-modal]" -i https://pypi.tuna.tsinghua.edu.cn/simple
$ uv pip install fastapi uvicorn -i https://pypi.tuna.tsinghua.edu.cn/simple
2. Verify that the basic environment works
$ uv run python -c "from modelscope.pipelines import pipeline; print(pipeline('word-segmentation')('今天天气不错,适合出去游玩'))"
3. Test relevance scoring with the example from the model card
# test_modelscope.py
import torch
from modelscope import AutoModelForSequenceClassification, AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained('BAAI/bge-reranker-large')
model = AutoModelForSequenceClassification.from_pretrained('BAAI/bge-reranker-large')
model.eval()
pairs = [['what is panda?', 'hi'], ['what is panda?', 'The giant panda (Ailuropoda melanoleuca), sometimes called a panda bear or simply panda, is a bear species endemic to China.']]
with torch.no_grad():
    # Tokenize each (query, passage) pair and score it; higher logits mean
    # higher relevance. The relevant passage should clearly outscore 'hi'.
    inputs = tokenizer(pairs, padding=True, truncation=True, return_tensors='pt', max_length=512)
    scores = model(**inputs, return_dict=True).logits.view(-1).float()
    print(scores)
