Background
While installing the embedding model instructor by following the instructions at https://github.com/xlang-ai/instructor-embedding, I ran into several errors. For example:
INSTRUCTOR._load_sbert_model() got an unexpected keyword argument 'local_files_only'
or
instructor embedding cannot import name 'cached_download' from 'huggingface_hub'
and
No such file or directory: 'hkunlp/instructor-large/modules.json'
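The cause is that the transformers and huggingface libraries are updated very frequently, roughly once a month, so the instructor package keeps breaking against newer releases and each version mismatch would need its own patch.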
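Since the errors come from version mismatches, it can help to first see which versions are actually installed. The following is a minimal sketch using only the standard library; the list of package names is my assumption about which distributions are relevant here (InstructorEmbedding is the PyPI name of the instructor package), and the helper name installed_versions is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumed list of distributions involved in the errors above.
PACKAGES = [
    "InstructorEmbedding",
    "sentence-transformers",
    "transformers",
    "huggingface-hub",
]


def installed_versions(packages=PACKAGES):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Comparing this output against the versions pinned in the instructor-embedding repository shows how far the local environment has drifted.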
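Solution
Skip the instructor package and run inference directly with the sentence-transformers framework, which reads the weight files itself.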
pip install sentence-transformers -q
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("hkunlp/instructor-large", trust_remote_code=True)
sentence = "3D ActionSLAM: wearable person tracking in multi-floor environments"
instruction = "Represent the Science title:"
embeddings = model.encode([[instruction, sentence]])
print(embeddings)
Note: by default, the embeddings are not normalized.
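If unit-length vectors are needed (for example, so that a dot product equals cosine similarity), the embeddings can be L2-normalized afterwards. Below is a minimal sketch with NumPy; the helper name l2_normalize and the small demo array are illustrative stand-ins, not part of the original code. The demo array takes the place of the output of model.encode so that the snippet runs without downloading the model.

```python
import numpy as np


def l2_normalize(embeddings, eps=1e-12):
    """Scale each row of a 2-D array to unit L2 norm."""
    embeddings = np.asarray(embeddings, dtype=np.float32)
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    # eps guards against division by zero for an all-zero row.
    return embeddings / np.maximum(norms, eps)


# Stand-in for the array returned by model.encode(...).
demo = np.array([[3.0, 4.0], [0.0, 2.0]])
unit = l2_normalize(demo)
print(unit)
# After normalization, the matrix of dot products is the cosine-similarity matrix.
print(unit @ unit.T)
```

Alternatively, SentenceTransformer.encode accepts normalize_embeddings=True, which should return normalized vectors directly.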