How to Solve the Multi-Turn Conversation Memory Problem in AI Agents: 7 Key Optimization Methods Explained

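In LLM-based agents, maintaining long-term memory state is essential. Lilian Weng, head of applied AI research at OpenAI, treats memory as one of the key components in her blog post on the building blocks of LLM-based agents. Below, using code from LangChain, I walk through seven ways to maintain agent memory and the scenarios each one suits.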

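Keeping the full conversation history

Take a telecom company's customer-service chatbot. If a user first asks about a bill and then moves on to a network connection problem, ConversationBufferMemory can retain the entire conversation, so that the AI still remembers the billing details while answering the network question and can provide more coherent service.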

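from langchain.memory import ConversationBufferMemory

memory = ConversationBufferMemory()
# Store one exchange: the user's input and the AI's reply
memory.save_context({"input": "Hello"}, {"output": "What's up?"})
# The full history comes back as one string under the "history" key
print(memory.load_memory_variables({}))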

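Sliding window over the most recent turns

On an e-commerce platform, a user might ask about a specific product (for example, a phone's battery life) and then ask about shipping options. ConversationBufferWindowMemory lets the AI focus only on the last one or two questions (such as shipping) rather than the whole conversation history, giving faster and more focused answers.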

from langchain.memory import ConversationBufferWindowMemory

# k=1: only the most recent exchange is returned when reading
memory = ConversationBufferWindowMemory(k=1)
memory.save_context({"input": "iPhone 15 battery life"}, {"output": "Average"})
memory.save_context({"input": "Shipping"}, {"output": "Very fast"})
# {'history': 'Human: Shipping\nAI: Very fast'}
print(memory.load_memory_variables({}))

Note that ConversationBufferWindowMemory still stores every message in full; the window is applied only at read time, when just the last k exchanges are returned.
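To make the "store everything, read a window" behaviour concrete, here is a minimal pure-Python sketch. It is not LangChain's implementation: the class WindowMemorySketch and its attribute names are hypothetical and only mirror the idea described above.

```python
from typing import Dict, List, Tuple


class WindowMemorySketch:
    """Toy window memory: keep every turn, expose only the last k on read."""

    def __init__(self, k: int = 1) -> None:
        self.k = k
        self.turns: List[Tuple[str, str]] = []  # full history, never truncated

    def save_context(self, user_input: str, ai_output: str) -> None:
        # Writing always appends; nothing is dropped here.
        self.turns.append((user_input, ai_output))

    def load_memory_variables(self) -> Dict[str, str]:
        # The window is applied only when reading.
        recent = self.turns[-self.k:] if self.k > 0 else []
        lines: List[str] = []
        for human, ai in recent:
            lines.append(f"Human: {human}")
            lines.append(f"AI: {ai}")
        return {"history": "\n".join(lines)}


window_memory = WindowMemorySketch(k=1)
window_memory.save_context("iPhone 15 battery life", "Average")
window_memory.save_context("Shipping", "Very fast")
window_view = window_memory.load_memory_variables()
print(window_view)
print(len(window_memory.turns))  # both turns are still stored
```

Because the full list is kept, raising k later would bring older turns back into view without losing anything.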

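Extracting entity information from the conversation history

In a legal consultation, a client may mention a specific case name, relevant statutes, or personal details (for example, "I was injured in a traffic accident last year and would like legal advice about compensation"). ConversationEntityMemory helps the AI remember these key entities and the details of how they relate, so that it can give more accurate and more personalized legal advice throughout the conversation.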

from langchain.memory import ConversationEntityMemory
from langchain_openai import ChatOpenAI  # assumes the langchain-openai package is installed

llm = ChatOpenAI(temperature=0, model="gpt-4o")

memory = ConversationEntityMemory(
    llm=llm,
    return_messages=True,
)
# Load before each turn (extracts entities from the new input), then save after the turn
print(memory.load_memory_variables(inputs={"input": "good!  busy working on Langchain.  lots to do."}))
memory.save_context({"input": "good!  busy working on Langchain.  lots to do."}, {"output": "That sounds like a lot of work!  What kind of things are you doing to make Langchain better?"})
print(memory.load_memory_variables(inputs={"input": "i'm trying to improve Langchain's interfaces, the UX, its integrations with various products the user might want ...  a lot of stuff"}))
memory.save_context(inputs={"input": "i'm trying to improve Langchain's interfaces, the UX, its integrations with various products the user might want ...  a lot of stuff"}, outputs={"output": "That sounds like a great job"})
print(memory.load_memory_variables(inputs={"input": "what is langchain"}))

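During a conversation, when variables are loaded from memory:

Entities are extracted from the history and the user's latest message. Note that the entities extracted are those in the user's latest query.
The entity_store dictionary is checked for an existing description of each extracted entity. If one exists, the entity and its description are returned in the entities field.
If an entity was extracted in an earlier turn but the latest message does not mention it, its description is not returned.

After a turn finishes, save_context is called:

The human message and the AI message are appended to the messages list.
Because the AI message may add information about an entity the human mentioned, the LLM is used to update the descriptions of the entities mentioned in the current query.
If entities were extracted in earlier turns but the current turn is only a simple greeting, no entity descriptions are updated, because entity information is tied to the current query.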
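The load and save steps above can be sketched in plain Python. This is an illustration under stated assumptions, not LangChain's code: EntityMemorySketch, extract_entities_stub, and KNOWN_ENTITIES are hypothetical names, keyword matching stands in for the LLM extraction step, and appending the raw query stands in for the LLM-written entity description.

```python
from typing import Callable, Dict, List

KNOWN_ENTITIES = ("Langchain",)  # hypothetical vocabulary for the stub extractor


def extract_entities_stub(text: str) -> List[str]:
    """Stand-in for the LLM extraction call: case-insensitive keyword match."""
    lowered = text.lower()
    return [name for name in KNOWN_ENTITIES if name.lower() in lowered]


class EntityMemorySketch:
    def __init__(self, extract: Callable[[str], List[str]]) -> None:
        self.extract = extract
        self.entity_store: Dict[str, str] = {}  # entity -> description
        self.messages: List[str] = []
        self.entity_cache: List[str] = []       # entities of the current query

    def load_memory_variables(self, query: str) -> Dict[str, Dict[str, str]]:
        # Entities come from the latest query only.
        self.entity_cache = self.extract(query)
        return {"entities": {e: self.entity_store.get(e, "") for e in self.entity_cache}}

    def save_context(self, query: str, answer: str) -> None:
        self.messages += [f"Human: {query}", f"AI: {answer}"]
        # Only entities tied to the current query get their description updated.
        for entity in self.entity_cache:
            previous = self.entity_store.get(entity, "")
            self.entity_store[entity] = (previous + " " + query).strip()


entity_memory = EntityMemorySketch(extract_entities_stub)
first = entity_memory.load_memory_variables("busy working on Langchain")
entity_memory.save_context("busy working on Langchain", "Sounds like a lot of work!")
greeting = entity_memory.load_memory_variables("hello again")
entity_memory.save_context("hello again", "hi")
later = entity_memory.load_memory_variables("what is langchain")
print(first, greeting, later)
```

The greeting turn returns no entities and leaves entity_store untouched, which is the behaviour the last item in each list describes.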

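Using a knowledge graph to capture entities and their relationships

In a medical consultation, a patient may describe several symptoms along with past medical history (for example, "I have a history of diabetes and have recently been feeling thirsty and tired a lot"). ConversationKGMemory can build a knowledge graph of the patient's symptoms, medical history, and possible health associations, helping the AI give more comprehensive and in-depth medical advice.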

from langchain_community.memory.kg import ConversationKGMemory
from langchain_openai import ChatOpenAI  # assumes the langchain-openai package is installed

llm = ChatOpenAI(temperature=0, model="gpt-4o")

memory = ConversationKGMemory(llm=llm)
memory.save_context({"input": "say hi to sam"}, {"output": "who is sam"})
memory.save_context({"input": "sam is a friend"}, {"output": "okay"})
print(memory.load_memory_variables({"input": "who is sam"}))  # {'history': 'On Sam: Sam is a friend.'}
print(memory.get_current_entities("what's Sams favorite color?"))  # ['Sam']

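At the end of each turn, the LLM extracts knowledge triples from the history and stores them in a NetworkxEntityGraph object.

When a new turn starts and data is loaded from memory, the LLM extracts entities from the current query, the knowledge about each entity is looked up in the NetworkxEntityGraph object, and the knowledge for all extracted entities is returned.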
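As a rough illustration of the triple store behind this flow, the sketch below keeps (subject, predicate, object) triples in a dictionary and formats the knowledge for the requested entities. It assumes the triples and the query entities have already been extracted (in the real class an LLM does both), and KGMemorySketch is a hypothetical name rather than LangChain's NetworkxEntityGraph.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)


class KGMemorySketch:
    def __init__(self) -> None:
        # subject -> list of (predicate, object) edges
        self.graph: DefaultDict[str, List[Tuple[str, str]]] = defaultdict(list)

    def add_triple(self, triple: Triple) -> None:
        subject, predicate, obj = triple
        self.graph[subject].append((predicate, obj))

    def get_entity_knowledge(self, entity: str) -> List[str]:
        # .get avoids creating empty entries for unknown entities
        return [f"{entity} {predicate} {obj}" for predicate, obj in self.graph.get(entity, [])]

    def load_memory_variables(self, entities: List[str]) -> Dict[str, str]:
        summaries: List[str] = []
        for entity in entities:
            knowledge = self.get_entity_knowledge(entity)
            if knowledge:
                summaries.append(f"On {entity}: {'. '.join(knowledge)}.")
        return {"history": "\n".join(summaries)}


kg_memory = KGMemorySketch()
# A triple an extractor might produce from "sam is a friend" (supplied by hand here)
kg_memory.add_triple(("Sam", "is", "a friend"))
kg_view = kg_memory.load_memory_variables(["Sam"])
unknown_view = kg_memory.load_memory_variables(["Alex"])
print(kg_view)
```

Entities with no stored triples simply contribute nothing to the returned history.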

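Periodically summarizing the conversation history

Over a series of tutoring conversations, a student may raise different math questions or points of confusion (for example, "I don't really understand how to solve quadratic equations"). ConversationSummaryMemory helps the AI summarize earlier tutoring content and the student's sticking points, so that later sessions can offer more targeted explanations and exercises.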

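from langchain.memory import ConversationSummaryMemory
from langchain_openai import ChatOpenAI  # assumes the langchain-openai package is installed

llm = ChatOpenAI(temperature=0, model="gpt-4o")
memory = ConversationSummaryMemory(llm=llm)
memory.save_context({"input": "hi"}, {"output": "whats up"})
print(memory.load_memory_variables({}))  # {'history': 'The human greets the AI with "hi," and the AI responds with "what\'s up."'}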

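ConversationSummaryMemory has a buffer attribute that holds the summary. At the end of each turn, a new summary is generated from the latest exchange and the previous summary, and is stored back in buffer.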
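The rolling update of buffer can be sketched as follows. This is a simplified illustration, not the library's code: SummaryMemorySketch and naive_summarize are hypothetical, and naive_summarize merely concatenates text where the real class would ask the LLM to write a condensed summary.

```python
from typing import Callable, Dict


def naive_summarize(previous_summary: str, new_lines: str) -> str:
    """Stand-in for the LLM call: fold the new lines into the old summary."""
    return (previous_summary + " " + new_lines).strip()


class SummaryMemorySketch:
    def __init__(self, summarize: Callable[[str, str], str]) -> None:
        self.summarize = summarize
        self.buffer = ""  # the running summary is the only history exposed

    def save_context(self, user_input: str, ai_output: str) -> None:
        new_lines = f"Human: {user_input} AI: {ai_output}"
        # new summary = f(previous summary, latest exchange)
        self.buffer = self.summarize(self.buffer, new_lines)

    def load_memory_variables(self) -> Dict[str, str]:
        return {"history": self.buffer}


summary_memory = SummaryMemorySketch(naive_summarize)
summary_memory.save_context("hi", "whats up")
summary_memory.save_context("not much you", "not much")
summary_view = summary_memory.load_memory_variables()
print(summary_view)
```

Swapping naive_summarize for a function that calls a model is the only change needed to turn the concatenation into a real summary.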

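Characteristics of ConversationSummaryMemory:

Keeps only a summary as the history, not the raw dialogue.
The summary is updated after every turn.
Suited to long-running conversations and saves tokens.
May lose details.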

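Keeping recent turns in detail while retaining earlier history

When working through a long-running technical problem (such as troubleshooting a software fault), a user may supply different error messages and feedback over several conversations. ConversationSummaryBufferMemory helps the AI keep the details of the most recent interactions while also providing a summary of how earlier issues were handled, which makes it easier to identify and resolve the problem.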

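from langchain.memory import ConversationSummaryBufferMemory
from langchain_openai import ChatOpenAI  # assumes the langchain-openai package is installed

llm = ChatOpenAI(temperature=0, model="gpt-4o")
# A deliberately tiny limit so that older messages are summarized right away
memory = ConversationSummaryBufferMemory(llm=llm, max_token_limit=10)
memory.save_context({"input": "hi"}, {"output": "whats up"})
memory.save_context({"input": "not much you"}, {"output": "not much"})
# {'history': 'System: The human greets with "hi." The AI responds with "what\'s up," and the human replies with "not much, you?"\nAI: not much'}
print(memory.load_memory_variables({}))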

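ConversationSummaryBufferMemory buffers recent conversation history up to max_token_limit tokens. When the history grows beyond that limit, the oldest messages are pruned so that the remaining buffer fits within max_token_limit, and the pruned messages are combined with the previous moving_summary_buffer to produce an updated moving_summary_buffer.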
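The pruning step can be sketched like this. It is an approximation under explicit assumptions: tokens are counted as whitespace-separated words, fold_into_summary concatenates instead of calling an LLM, and SummaryBufferMemorySketch is a hypothetical class, not the LangChain one.

```python
from typing import Callable, Dict, List


def count_tokens(text: str) -> int:
    """Assumption: approximate tokens by whitespace-separated words."""
    return len(text.split())


def fold_into_summary(previous_summary: str, pruned: List[str]) -> str:
    """Stand-in for the LLM summarizer: join the old summary and pruned messages."""
    return " ".join([previous_summary, *pruned]).strip()


class SummaryBufferMemorySketch:
    def __init__(self, max_token_limit: int,
                 summarize: Callable[[str, List[str]], str]) -> None:
        self.max_token_limit = max_token_limit
        self.summarize = summarize
        self.messages: List[str] = []      # recent turns kept verbatim
        self.moving_summary_buffer = ""    # summary of everything pruned so far

    def save_context(self, user_input: str, ai_output: str) -> None:
        self.messages += [f"Human: {user_input}", f"AI: {ai_output}"]
        self._prune()

    def _prune(self) -> None:
        pruned: List[str] = []
        # Drop the oldest messages until the verbatim buffer fits the limit.
        while self.messages and sum(count_tokens(m) for m in self.messages) > self.max_token_limit:
            pruned.append(self.messages.pop(0))
        if pruned:
            self.moving_summary_buffer = self.summarize(self.moving_summary_buffer, pruned)

    def load_memory_variables(self) -> Dict[str, str]:
        parts: List[str] = []
        if self.moving_summary_buffer:
            parts.append(f"System: {self.moving_summary_buffer}")
        parts.extend(self.messages)
        return {"history": "\n".join(parts)}


buffer_memory = SummaryBufferMemorySketch(max_token_limit=5, summarize=fold_into_summary)
buffer_memory.save_context("hi", "whats up")
buffer_memory.save_context("not much you", "not much")
buffer_view = buffer_memory.load_memory_variables()
print(buffer_view)
```

With the limit of 5 used here, only the last AI message stays verbatim and the three older messages end up in the summary line.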

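Characteristics of ConversationSummaryBufferMemory:

Stores the most recent dialogue plus a summary of earlier dialogue.
Combines the advantages of full history and summarization.
Keeps recent turns in detail and compresses earlier ones.
Suited to medium-length conversations.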

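Retrieving conversation information by vector search

A user may ask about a particular news event, such as "What important decisions were made at the recent economic summit?" VectorStoreRetrieverMemory can quickly retrieve the information most relevant to the current question from a large volume of historical news data. Even when that information is not the most recent in the conversation history, it can still supply timely, accurate background and detailed coverage.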

import faiss

from langchain.docstore import InMemoryDocstore
from langchain.memory import VectorStoreRetrieverMemory
from langchain.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings  # assumes the langchain-openai package is installed

embedding_size = 1536  # Dimensions of the OpenAIEmbeddings
index = faiss.IndexFlatL2(embedding_size)
embedding_fn = OpenAIEmbeddings().embed_query
vectorstore = FAISS(embedding_fn, index, InMemoryDocstore({}), {})

# k=1: return only the single most relevant stored snippet
retriever = vectorstore.as_retriever(search_kwargs=dict(k=1))
memory = VectorStoreRetrieverMemory(retriever=retriever)

# When added to an agent, the memory object can save pertinent information from conversations or used tools
memory.save_context({"input": "My favorite food is pizza"}, {"output": "thats good to know"})
memory.save_context({"input": "My favorite sport is soccer"}, {"output": "..."})
memory.save_context({"input": "I don't like the Celtics"}, {"output": "ok"})

# The vector lookup returns the semantically relevant information, not simply the latest turn
print(memory.load_memory_variables({"prompt": "what sport should i watch?"})["history"])
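For readers without an OpenAI key or FAISS installed, the retrieval idea can be tried with the dependency-free sketch below. It makes strong simplifying assumptions: a bag-of-words count vector stands in for a real embedding model, a linear scan stands in for the FAISS index, and VectorMemorySketch, embed, and cosine are hypothetical names.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Assumption: a bag-of-words count vector replaces a real embedding."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorMemorySketch:
    def __init__(self, k: int = 1) -> None:
        self.k = k
        self.documents: List[str] = []

    def save_context(self, user_input: str, ai_output: str) -> None:
        # Each exchange is stored as one retrievable document.
        self.documents.append(f"input: {user_input}\noutput: {ai_output}")

    def load_memory_variables(self, query: str) -> Dict[str, str]:
        query_vec = embed(query)
        # Rank by similarity to the query, not by recency.
        ranked = sorted(self.documents, key=lambda d: cosine(query_vec, embed(d)), reverse=True)
        return {"history": "\n".join(ranked[: self.k])}


vector_memory = VectorMemorySketch(k=1)
vector_memory.save_context("My favorite food is pizza", "thats good to know")
vector_memory.save_context("My favorite sport is soccer", "...")
vector_memory.save_context("I don't like the Celtics", "ok")
vector_view = vector_memory.load_memory_variables("which sport do you recommend?")
print(vector_view["history"])
```

The soccer exchange is returned even though it is not the latest one stored, because it is the only document sharing a word with the query. A real embedding model would also match paraphrases that share no words at all.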