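In natural language processing (NLP) applications, a large language model (LLM) often has to work together with other components to improve performance and user experience. This article continues our exploration of LangChain, focusing on its **Memory mechanism and the multi-retrieval QA chain (MultiRetrievalQAChain)**.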
1. Memory
In a conversational system, memory is essential for maintaining context. LangChain provides several types of memory to suit different scenarios.
1.1 ChatMessageHistory
ChatMessageHistory is a lightweight storage class used mainly to manage the message history between the AI and the user.
from langchain.memory import ChatMessageHistory
history = ChatMessageHistory()
history.add_user_message("hi!")
history.add_ai_message("whats up?")
history.add_user_message("How are you?")
history.add_ai_message("I'm good. How about you?")
print(history.messages)
1.2 ConversationBufferMemory
ConversationBufferMemory stores the entire conversation history and can be plugged into a conversation chain.
from langchain.memory import ConversationBufferMemory
memory = ConversationBufferMemory()
memory.save_context({"input": "hi"}, {"output": "whats up"})
print(memory.load_memory_variables({}))
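To make the stored format concrete, here is a minimal plain-Python sketch of a buffer-style memory. It is not LangChain's implementation: the class name `SimpleBufferMemory` and its attributes are hypothetical, and it only mirrors the `save_context` / `load_memory_variables` calls shown above.

```python
class SimpleBufferMemory:
    """Hypothetical stand-in that keeps every turn of a conversation."""

    def __init__(self, human_prefix="Human", ai_prefix="AI"):
        self.human_prefix = human_prefix
        self.ai_prefix = ai_prefix
        self.turns = []  # list of (user_text, ai_text) pairs, oldest first

    def save_context(self, inputs, outputs):
        # Store one full exchange: the user's input and the model's reply.
        self.turns.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self, _inputs=None):
        # Flatten the whole history into a single prompt-ready string.
        lines = []
        for user_text, ai_text in self.turns:
            lines.append(f"{self.human_prefix}: {user_text}")
            lines.append(f"{self.ai_prefix}: {ai_text}")
        return {"history": "\n".join(lines)}


memory = SimpleBufferMemory()
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
print(memory.load_memory_variables({}))
```

Because nothing is ever discarded, the `history` string grows with every exchange, which is the cost the windowed variant in the next section is meant to limit.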
1.3 ConversationBufferWindowMemory
This type of memory keeps only the most recent k rounds of dialogue, so the context does not grow too long.
from langchain.memory import ConversationBufferWindowMemory
memory = ConversationBufferWindowMemory(k=2)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
print(memory.load_memory_variables({}))
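The sliding-window idea can be sketched in plain Python with a bounded `deque`. This is an illustration under assumptions, not LangChain's code: `SimpleWindowMemory` is a hypothetical class, and the `Human:` / `AI:` prefixes are chosen for readability.

```python
from collections import deque


class SimpleWindowMemory:
    """Hypothetical stand-in that remembers only the last k exchanges."""

    def __init__(self, k=2):
        # A deque with maxlen=k silently drops the oldest item when full.
        self.turns = deque(maxlen=k)

    def save_context(self, inputs, outputs):
        self.turns.append((inputs["input"], outputs["output"]))

    def load_memory_variables(self, _inputs=None):
        lines = []
        for user_text, ai_text in self.turns:
            lines.append(f"Human: {user_text}")
            lines.append(f"AI: {ai_text}")
        return {"history": "\n".join(lines)}


window = SimpleWindowMemory(k=2)
window.save_context({"input": "hi"}, {"output": "whats up"})
window.save_context({"input": "not much you"}, {"output": "not much"})
window.save_context({"input": "any plans?"}, {"output": "just reading"})
print(window.load_memory_variables({})["history"])
```

With `k=2`, the third call pushes the first exchange ("hi" / "whats up") out of the window, so only the last two exchanges are printed.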
1.4 ConversationEntityMemory
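ConversationEntityMemory remembers information about the entities mentioned in a conversation. It relies on an LLM to extract those entities.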
from langchain.memory import ConversationEntityMemory
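# `llm` is assumed to be an already-initialized LangChain LLM; entity extraction needs one.
memory = ConversationEntityMemory(llm=llm)
_input = {"input": "Chaky & Gun are working on a NLP course"}
# Loading first lets the memory detect the entities in this input before saving.
memory.load_memory_variables(_input)
memory.save_context(_input, {"output": "That sounds like a great project!"})
print(memory.load_memory_variables({"input": "Who is Chaky?"}))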
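The underlying idea, a store keyed by entity name that is consulted when a new input mentions that entity, can be sketched without an LLM. Everything here is hypothetical: `SimpleEntityMemory` is not a LangChain class, and `naive_extract_entities` swaps in a crude capitalized-word regex for the LLM-based extraction a real entity memory performs.

```python
import re


def naive_extract_entities(text):
    """Hypothetical extractor: treat capitalized words as entity names.

    A real entity memory asks an LLM to do this step.
    """
    return re.findall(r"\b[A-Z][a-z]+\b", text)


class SimpleEntityMemory:
    def __init__(self, extractor=naive_extract_entities):
        self.extractor = extractor
        self.entity_store = {}  # entity name -> list of sentences mentioning it

    def save_context(self, inputs, outputs):
        # Only the user's input is indexed in this sketch; the reply is ignored.
        text = inputs["input"]
        for name in self.extractor(text):
            self.entity_store.setdefault(name, []).append(text)

    def load_memory_variables(self, inputs):
        # Return stored facts only for known entities named in the new input.
        names = self.extractor(inputs["input"])
        known = {n: self.entity_store[n] for n in names if n in self.entity_store}
        return {"entities": known}


memory = SimpleEntityMemory()
memory.save_context(
    {"input": "Chaky & Gun are working on a NLP course"},
    {"output": "That sounds like a great project!"},
)
print(memory.load_memory_variables({"input": "Who is Chaky?"}))
```

Asking about "Chaky" returns the sentence saved earlier, while a name that was never mentioned returns an empty dictionary. The regex also picks up sentence-initial words such as "Who", which is one reason an LLM is the more realistic choice for extraction.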
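2. Multi-retrieval QA chain (MultiRetrievalQAChain)
In a complex question-answering system, we may need to retrieve information from several data sources. LangChain provides MultiRetrievalQAChain, which automatically selects the most suitable retriever for each question.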
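The routing step can be pictured with a small plain-Python sketch. In the real chain an LLM reads each retriever's name and description and picks one (or falls back to a default chain); the sketch below swaps in simple keyword overlap instead, so it illustrates the control flow only. The `route` function, the `STOPWORDS` set, and the `"DEFAULT"` label are assumptions made for this example.

```python
import re

# Hypothetical stopword list so that boilerplate words do not affect routing.
STOPWORDS = {"a", "about", "answering", "for", "good", "is", "of",
             "questions", "s", "the", "what"}


def tokenize(text):
    # Lowercased alphabetic words minus stopwords; deliberately crude.
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def route(question, retriever_infos, default_name="DEFAULT"):
    """Pick the entry whose description shares the most words with the question."""
    question_words = tokenize(question)
    best_name, best_score = default_name, 0
    for info in retriever_infos:
        score = len(question_words & tokenize(info["description"]))
        if score > best_score:
            best_name, best_score = info["name"], score
    return best_name


# Only names and descriptions are needed to route; real entries also carry a retriever.
infos = [
    {"name": "state of the union",
     "description": "Good for answering questions about the State of the Union address"},
    {"name": "pg essay",
     "description": "Good for answering questions about Paul Graham's essay"},
    {"name": "personal",
     "description": "Good for answering questions about me"},
]

print(route("What does Paul Graham regret?", infos))
print(route("Summarize the State of the Union address", infos))
print(route("How do airplanes fly?", infos))
```

The three calls print `pg essay`, `state of the union`, and `DEFAULT` respectively. Keyword overlap fails as soon as a question paraphrases its topic, which is why the descriptions are handed to an LLM in practice and why they should be written clearly.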
2.1 Building multiple retrievers
from langchain.document_loaders import TextLoader
from langchain.vectorstores import FAISS
from langchain.embeddings import HuggingFaceInstructEmbeddings
import torch
embedding_model = HuggingFaceInstructEmbeddings(
    model_name='hkunlp/instructor-base',
    model_kwargs={'device': torch.device('cuda' if torch.cuda.is_available() else 'cpu')}
)
sou_docs = TextLoader('./docs/txt/state_of_the_union.txt').load_and_split()
sou_retriever = FAISS.from_documents(sou_docs, embedding_model).as_retriever()
pg_docs = TextLoader('./docs/txt/paul_graham_essay.txt').load_and_split()
pg_retriever = FAISS.from_documents(pg_docs, embedding_model).as_retriever()
personal_texts = [
    "I love apple pie",
    "My favorite color is fuchsia",
    "My dream is to become a professional dancer",
]
personal_retriever = FAISS.from_texts(personal_texts, embedding_model).as_retriever()
2.2 Assembling the MultiRetrievalQAChain
from langchain.chains.router import MultiRetrievalQAChain
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain
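# `llm` is assumed to be an already-initialized LangChain LLM; its setup is not shown here.
# The router passes the question as "query" and the chain expects the answer under "result".
prompt_template = "You are a chatbot having a conversation with a human.\n\nHuman: {query}\nAI:"
default_chain = LLMChain(
    llm=llm,
    prompt=PromptTemplate.from_template(prompt_template),
    output_key="result",
)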
retriever_infos = [
    {"name": "state of the union", "description": "Good for answering questions about the 2023 State of the Union address", "retriever": sou_retriever},
    {"name": "pg essay", "description": "Good for answering questions about Paul Graham's essay", "retriever": pg_retriever},
    {"name": "personal", "description": "Good for answering questions about me", "retriever": personal_retriever},
]
chain = MultiRetrievalQAChain.from_retrievers(
    llm=llm,
    retriever_infos=retriever_infos,
    default_chain=default_chain,
    verbose=True
)
2.3 Running QA queries
print(chain.run("What did the president say about the economy?"))
print(chain.run("What is something Paul Graham regrets about his work?"))
print(chain.run("What is my background?"))
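3. Summary
In this article we explored the memory mechanisms LangChain provides, including ChatMessageHistory, ConversationBufferMemory, ConversationBufferWindowMemory, and ConversationEntityMemory, and demonstrated how each is used. We also introduced MultiRetrievalQAChain and showed how combining multiple retrievers can improve a question-answering system.
These features let NLP applications handle more complex conversation and question-answering tasks. In the next article we will continue to dig into LangChain's advanced features. Stay tuned!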