Python-Based Natural Language Processing Series (71): LangChain Memory and Multi-Retrieval

Article link: https://blog.youkuaiyun.com/ljd939952281/article/details/146998466

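In natural language processing (NLP) applications, a large language model (LLM) often has to work alongside other components to improve performance and user experience. This article continues our exploration of LangChain, focusing on its **memory mechanism** and the **multi-retrieval QA chain (MultiRetrievalQAChain)**.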

1. Memory

In a dialogue system, memory is essential for maintaining context. LangChain provides several types of memory to suit different use cases.

1.1 ChatMessageHistory

ChatMessageHistory is a lightweight storage class that manages the message history between the AI and the user.

from langchain.memory import ChatMessageHistory

history = ChatMessageHistory()
history.add_user_message("hi!")
history.add_ai_message("whats up?")
history.add_user_message("How are you?")
history.add_ai_message("I'm good. How about you?")

print(history.messages)
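
To make the idea concrete, here is a minimal pure-Python sketch of what a message history store does. `Message` and `SimpleHistory` are hypothetical names used only for illustration; they are not LangChain classes, and the real ChatMessageHistory stores LangChain message objects rather than this dataclass.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Message:
    role: str      # "human" or "ai"
    content: str


@dataclass
class SimpleHistory:
    """Toy stand-in for a chat history store (not the LangChain class)."""
    messages: List[Message] = field(default_factory=list)

    def add_user_message(self, text: str) -> None:
        self.messages.append(Message("human", text))

    def add_ai_message(self, text: str) -> None:
        self.messages.append(Message("ai", text))


toy_history = SimpleHistory()
toy_history.add_user_message("hi!")
toy_history.add_ai_message("whats up?")
for m in toy_history.messages:
    print(f"{m.role}: {m.content}")
```

The point is that the history itself is only an ordered list of role-tagged messages; the memory classes in the following sections decide how much of that list reaches the prompt.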

1.2 ConversationBufferMemory

ConversationBufferMemory stores the entire conversation history and can be plugged into a conversation chain.

from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
memory.save_context({"input": "hi"}, {"output": "whats up"})
print(memory.load_memory_variables({}))
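
As a rough mental model, a buffer memory appends every turn and renders the whole list as one string that can be dropped into a prompt. The sketch below is a simplified assumption of that behavior; `ToyBufferMemory` is a hypothetical class, not LangChain's implementation.

```python
from typing import Dict, List, Tuple


class ToyBufferMemory:
    """Keeps every turn and renders it as one prompt-ready string."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def save_context(self, inputs: Dict[str, str], outputs: Dict[str, str]) -> None:
        # Store the human input together with the AI reply as one turn
        self.turns.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self) -> Dict[str, str]:
        lines = []
        for human, ai in self.turns:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return {"history": "\n".join(lines)}


toy_buffer = ToyBufferMemory()
toy_buffer.save_context({"input": "hi"}, {"output": "whats up"})
print(toy_buffer.load_memory_variables())
```

Because nothing is ever discarded, the rendered history grows with every turn, which is the motivation for the windowed variant in the next section.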

1.3 ConversationBufferWindowMemory

This type of memory keeps only the most recent k conversation turns, which prevents the context from growing too long.

from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=2)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
print(memory.load_memory_variables({}))
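
The two-turn example above never exceeds k=2, so nothing is dropped yet. The sketch below adds a third turn to show the truncation. It is an illustrative approximation built on `collections.deque`; `ToyWindowMemory` is a hypothetical name and this is not how LangChain implements the class internally.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class ToyWindowMemory:
    def __init__(self, k: int) -> None:
        # deque(maxlen=k) silently drops the oldest turn once k is exceeded
        self.turns: Deque[Tuple[str, str]] = deque(maxlen=k)

    def save_context(self, inputs: Dict[str, str], outputs: Dict[str, str]) -> None:
        self.turns.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self) -> Dict[str, str]:
        return {"history": "\n".join(f"Human: {h}\nAI: {a}" for h, a in self.turns)}


toy_window = ToyWindowMemory(k=2)
toy_window.save_context({"input": "turn 1"}, {"output": "reply 1"})
toy_window.save_context({"input": "turn 2"}, {"output": "reply 2"})
toy_window.save_context({"input": "turn 3"}, {"output": "reply 3"})
print(toy_window.load_memory_variables()["history"])
```

After the third call, only turns 2 and 3 remain in the rendered history.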

1.4 ConversationEntityMemory

ConversationEntityMemory remembers information about the entities mentioned in a conversation.

from langchain.memory import ConversationEntityMemory

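# ConversationEntityMemory needs an LLM to extract and summarize entities;
# `llm` is assumed to be the model instance created earlier in this series.
memory = ConversationEntityMemory(llm=llm)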
_input = {"input": "Chaky & Gun are working on a NLP course"}
memory.save_context(_input, {"output": "That sounds like a great project!"})
print(memory.load_memory_variables({"input": "Who is Chaky?"}))
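
Conceptually, an entity memory keeps a small store keyed by entity name and, for a new input, returns only the entries for entities that input mentions. The sketch below imitates that with a deliberately crude rule (capitalized words count as entities) in place of the LLM-based extraction. `extract_entities` and `ToyEntityMemory` are hypothetical helpers for illustration only.

```python
import re
from collections import defaultdict
from typing import Dict, List


def extract_entities(text: str) -> List[str]:
    """Toy extractor: treat capitalized words as entity names."""
    return re.findall(r"\b[A-Z][a-z]+\b", text)


class ToyEntityMemory:
    def __init__(self) -> None:
        # Maps an entity name to the sentences it appeared in
        self.store: Dict[str, List[str]] = defaultdict(list)

    def save_context(self, inputs: Dict[str, str], outputs: Dict[str, str]) -> None:
        text = inputs["input"]
        for name in extract_entities(text):
            self.store[name].append(text)

    def load_memory_variables(self, inputs: Dict[str, str]) -> Dict[str, Dict[str, List[str]]]:
        # Return stored facts only for entities mentioned in the new input
        names = extract_entities(inputs["input"])
        return {"entities": {n: self.store[n] for n in names if n in self.store}}


toy_entities = ToyEntityMemory()
toy_entities.save_context(
    {"input": "Chaky & Gun are working on a NLP course"},
    {"output": "That sounds like a great project!"},
)
print(toy_entities.load_memory_variables({"input": "Who is Chaky?"}))
```

The capitalization rule misfires on words like "Who", which is exactly why a language model is a better fit for the extraction step.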

2. Multi-Retrieval QA Chain (MultiRetrievalQAChain)

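In a complex question-answering system, we may need to retrieve information from several data sources. LangChain provides MultiRetrievalQAChain, which routes each question to the most suitable retriever.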
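
The routing idea can be sketched without any LangChain code. In the real chain, an LLM picks a destination from each retriever's name and description. The toy version below swaps that for simple word overlap, purely to show the control flow, including the fallback to a default handler. `route`, `toy_routes`, and the handler lambdas are hypothetical.

```python
from typing import Callable, Dict, List


def route(query: str, routes: List[Dict], default: Callable[[str], str]) -> str:
    """Pick the route whose description shares the most words with the query."""
    query_words = set(query.lower().split())
    best, best_score = None, 0
    for info in routes:
        score = len(query_words & set(info["description"].lower().split()))
        if score > best_score:
            best, best_score = info, score
    # No overlap at all: fall back to the default handler
    return best["handler"](query) if best else default(query)


toy_routes = [
    {"name": "speech", "description": "state of the union address",
     "handler": lambda q: "speech retriever"},
    {"name": "essay", "description": "paul graham essay",
     "handler": lambda q: "essay retriever"},
]

print(route("what does paul graham regret", toy_routes, default=lambda q: "default chain"))
print(route("hello", toy_routes, default=lambda q: "default chain"))
```

The descriptions do the real work here, so the more specific they are, the easier the routing decision becomes.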

2.1 Building Multiple Retrievers

from langchain.document_loaders import TextLoader
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceInstructEmbeddings
import torch

embedding_model = HuggingFaceInstructEmbeddings(
    model_name='hkunlp/instructor-base',
    model_kwargs={'device': torch.device('cuda' if torch.cuda.is_available() else 'cpu')}
)

sou_docs = TextLoader('./docs/txt/state_of_the_union.txt').load_and_split()
sou_retriever = FAISS.from_documents(sou_docs, embedding_model).as_retriever()

pg_docs = TextLoader('./docs/txt/paul_graham_essay.txt').load_and_split()
pg_retriever = FAISS.from_documents(pg_docs, embedding_model).as_retriever()

personal_texts = [
    "I love apple pie",
    "My favorite color is fuchsia",
    "My dream is to become a professional dancer",
]
personal_retriever = FAISS.from_texts(personal_texts, embedding_model).as_retriever()
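
For readers new to vector retrieval, the following sketch shows the core of what a retriever does: embed the texts, embed the query, and return the closest matches. It uses word counts and cosine similarity as a stand-in for a neural embedding model and a FAISS index. `embed`, `cosine`, and `retrieve` are hypothetical helper names.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy embedding: word counts (a real setup uses a neural embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, texts: List[str], k: int = 1) -> List[str]:
    """Return the k texts most similar to the query."""
    q = embed(query)
    ranked = sorted(texts, key=lambda t: cosine(q, embed(t)), reverse=True)
    return ranked[:k]


toy_texts = [
    "I love apple pie",
    "My favorite color is fuchsia",
    "My dream is to become a professional dancer",
]
print(retrieve("what is my favorite color", toy_texts))
```

A learned embedding model can also match paraphrases that share no words with the stored text, which this word-count version cannot do.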

2.2 Assembling the MultiRetrievalQAChain

from langchain.chains.router import MultiRetrievalQAChain
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

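# Fallback chain for questions that match none of the retrievers.
# `llm` is assumed to be the model instance created earlier in this series.
# The router passes the question under the key "query" and expects the answer
# under "result", so the prompt variable and output_key are set to match.
prompt_template = "You are a chatbot having a conversation with a human.\n\nHuman: {query}\nAI:"

default_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template(prompt_template),
    output_key="result"
)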

retriever_infos = [
    {"name": "state of the union", "description": "Good for answering questions about the 2023 State of the Union address", "retriever": sou_retriever},
    {"name": "pg essay", "description": "Good for answering questions about Paul Graham's essay", "retriever": pg_retriever},
    {"name": "personal", "description": "Good for answering questions about me", "retriever": personal_retriever},
]

chain = MultiRetrievalQAChain.from_retrievers(
    llm=llm,
    retriever_infos=retriever_infos,
    default_chain=default_chain,
    verbose=True
)

2.3 Running QA Queries

print(chain.run("What did the president say about the economy?"))
print(chain.run("What is something Paul Graham regrets about his work?"))
print(chain.run("What is my background?"))

3. Summary

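In this article, we looked at the memory mechanisms LangChain provides, namely ChatMessageHistory, ConversationBufferMemory, ConversationBufferWindowMemory, and ConversationEntityMemory, and showed how each one is used. We also introduced MultiRetrievalQAChain and showed how combining several retrievers can improve a question-answering system.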

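These LangChain features give NLP applications more capability for handling complex conversations and question-answering tasks. In the next article, we will continue with LangChain's more advanced features. Stay tuned!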
