How to use a local large language model (LLM) with LangChain

License: CC 4.0 BY-SA

Tags: #langchain #deep-learning #artificial-intelligence

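This article explains how to load a local pretrained Transformer model and use it for text generation in LangChain through HuggingFacePipeline. It stresses that the device_map setting matters for avoiding runtime errors and device mismatches.

Many LangChain examples call OpenAI models by default, but sometimes we want to use our own local model instead. The code is as follows: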

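from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
# Import paths for older LangChain releases; newer releases provide
# HuggingFacePipeline from langchain_community.llms or langchain_huggingface.
from langchain.chains import LLMChain
from langchain.llms import HuggingFacePipeline
from langchain.prompts import PromptTemplate
import torch

model_path = "path/to/your/local/model"  # directory that holds the model files

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
# device_map="auto" places the weights on the available device(s);
# torch_dtype=torch.float16 loads them directly in half precision.
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    device_map="auto",
    torch_dtype=torch.float16,
)

# Wrap the model in a transformers text-generation pipeline.
pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=512,
    top_p=1,
    repetition_penalty=1.15
)
# Expose the pipeline to LangChain as an LLM.
llama_model = HuggingFacePipeline(pipeline=pipe)

template = '''
#Background#
You are a knowledgeable navigation assistant who knows the historic sites and tourist attractions of every place in China.
#Question#
Tourist: I want to travel to {place}. Can you recommend some places worth visiting?
'''
prompt = PromptTemplate(
    input_variables=["place"],
    template=template
)
chain = LLMChain(llm=llama_model, prompt=prompt)
print(chain.run("Tianjin"))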
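
To make the role of the chain clearer, here is a minimal conceptual sketch in plain Python of what happens when a prompt template is combined with a model: the template is filled in, and the finished prompt string is passed to the model. This is not LangChain's implementation. `run_chain` and `echo_llm` are hypothetical names introduced only for illustration, the shortened prompt is made up for the example, and the stand-in model simply echoes its input so the sketch runs without a GPU or any model files.

```python
from typing import Callable


def run_chain(llm: Callable[[str], str], template: str, **variables: str) -> str:
    """Fill the template with the given variables, then call the model once."""
    prompt = template.format(**variables)
    return llm(prompt)


def echo_llm(prompt: str) -> str:
    # Hypothetical stand-in for the local model: it only echoes the prompt.
    return f"[model received] {prompt}"


# Shortened illustrative prompt with a single placeholder.
template = "Tourist: I want to travel to {place}. What is worth visiting?"

result = run_chain(echo_llm, template, place="Tianjin")
print(result)
```

In the real code, the wrapped Hugging Face pipeline takes the place of `echo_llm`, so the only thing that changes between an OpenAI-backed chain and a local one is which LLM object is passed in.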

Note:

In the code above, this line must be written as device_