The model used through the API in this project is GPT-3.5.
1 Connecting to the LLM
from langchain_openai import ChatOpenAI
llm = ChatOpenAI(
openai_api_base="xxxxxxxxxxxxxxxx",
openai_api_key="sk-xxxxxxxxxxxxxxxxxxxxxxxxxxx",
)
As shown below, the model responds to a greeting:
from langchain_core.messages import HumanMessage
llm.invoke([HumanMessage(content="Hi! I'm Bob")])
AIMessage(content='Hello, Bob! Nice to meet you. How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 0, 'prompt_tokens': 0, 'total_tokens': 0}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-3f0b67d0-0900-427c-9eda-9e405893be60-0')
But if we now ask for the human's name, the AI doesn't know it, because there is no memory by default:
llm.invoke([HumanMessage(content="What's my name?")])
AIMessage(content="I'm sorry, I don't have access to personal information like your name unless you tell me. How can I assist you today?", response_metadata={'token_usage': {'completion_tokens': 0, 'prompt_tokens': 0, 'total_tokens': 0}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-0a75ae65-4202-4a69-bd67-859aa4be676c-0')
One more thing to note here: the parameter that invoke takes is input, whose accepted type is Union[PromptValue, str, Sequence[MessageLikeRepresentation]]; that is, a PromptValue, a plain string, or a list of messages all work. The call above passed a list of messages; next we try the other two types:
2 Calling the LLM with different input types as the prompt
llm.invoke("Hello? Who are you?") # 此处是字符串类型调用
AIMessage(content="Hello! I'm ChatGPT, an AI language model. I'm here to help answer questions, have conversations, or assist with whatever you need. How can I help you today?", response_metadata={'token_usage': {'completion_tokens': 0, 'prompt_tokens': 0, 'total_tokens': 0}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-035271f6-6633-493e-ba27-ec1707c188b2-0')
from langchain_core.prompts import PromptTemplate
# PromptValue itself is an abstract class; StringPromptValue is its concrete
# implementation for string prompts. Note that format() returns a plain str,
# so this call actually passes a string (see the format_prompt() sketch below).
template_str = "Can you translate {content} to Chinese?"
promptTemplate = PromptTemplate.from_template(template_str)
prompt = promptTemplate.format(content="pierce")
print(promptTemplate, prompt)
llm.invoke(prompt)
input_variables=['content'] template='Can you translate {content} to Chinese?' Can you translate pierce to Chinese?
AIMessage(content='The word "pierce" can be translated to Chinese as 刺穿 (cì chuān) or 穿透 (chuān tòu), depending on the context in which it\'s used.', response_metadata={'token_usage': {'completion_tokens': 0, 'prompt_tokens': 0, 'total_tokens': 0}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-fb10b27a-4491-443f-a233-ea48910ad089-0')
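To pass a genuine PromptValue rather than a string, format_prompt() can be used instead of format(); a minimal sketch:

# format_prompt() returns a StringPromptValue, a concrete PromptValue,
# which invoke() also accepts directly
prompt_value = promptTemplate.format_prompt(content="pierce")
print(type(prompt_value))  # langchain_core.prompt_values.StringPromptValue
llm.invoke(prompt_value)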
Returning to message history: we can pass the earlier turns of a conversation to the model so that it appears to "have memory":
3 Adding memory
We pass the history to the model as a list of messages, which gives it memory:
from langchain_core.messages import AIMessage
llm.invoke(
[
HumanMessage(content="Hi! I'm Pierce"),
AIMessage(content="Hello Pierce! How can I assist you today?"),
HumanMessage(content="What's my name?"),
]
)
AIMessage(content='Your name is Pierce!', response_metadata={'token_usage': {'completion_tokens': 0, 'prompt_tokens': 0, 'total_tokens': 0}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-286ba0d3-a673-4cb1-9522-a03638c97c0c-0')
Explicitly passing the conversation history to the model as a message-list prompt like this is tedious. A better approach is the following:
from langchain_core.chat_history import InMemoryChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory
# store holds every conversation's history and serves as the memory
store = {}
def get_session_history(session_id: str) -> InMemoryChatMessageHistory:
if session_id not in store:
store[session_id] = InMemoryChatMessageHistory()
return store[session_id]
with_message_history = RunnableWithMessageHistory(llm, get_session_history)
Here, BaseChatMessageHistory is an abstract base class: any chat-history store must inherit from it and implement one or more of the following methods:
- add_messages: sync variant for bulk addition of messages
- aadd_messages: async variant for bulk addition of messages
- messages: sync variant for getting messages
- aget_messages: async variant for getting messages
- clear: sync variant for clearing messages
- aclear: async variant for clearing messages
InMemoryChatMessageHistory is a class (not a method) that inherits from this abstract base class and implements the methods above.
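For illustration, here is a minimal sketch of a custom implementation backed by a plain Python list (the class name ListChatMessageHistory is our own, not part of LangChain):

from langchain_core.chat_history import BaseChatMessageHistory
from langchain_core.messages import BaseMessage

class ListChatMessageHistory(BaseChatMessageHistory):
    """Minimal custom history: stores messages in a plain Python list."""
    def __init__(self) -> None:
        self.messages: list[BaseMessage] = []

    def add_messages(self, messages: list[BaseMessage]) -> None:
        # sync variant for bulk addition of messages
        self.messages.extend(messages)

    def clear(self) -> None:
        # sync variant for clearing messages
        self.messages = []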
Here, get_session_history() determines how the required history is fetched, and session_id is what distinguishes one conversation from another. As for RunnableWithMessageHistory, it is itself a Runnable, used to manage the chat message history of another Runnable.
RunnableWithMessageHistory wraps another Runnable and manages its chat message history; it is responsible for reading and updating the stored messages.
When invoked, RunnableWithMessageHistory must always be called with a config containing the parameters expected by the chat-history factory.
By default, the Runnable expects a single configuration parameter named "session_id", a string; it is used to create a new chat message history or look up an existing one matching the given session_id.
config = {"configurable": {"session_id": "test001"}}
response = with_message_history.invoke(
[HumanMessage(content="Hi! I'm Pierce")],
config=config,
)
response.content
'Hello Pierce! How can I assist you today?'
response = with_message_history.invoke(
[HumanMessage(content="What's my name?")],
config=config,
)
response.content
'Your name is Pierce.'
As you can see, the model now has genuine memory. If we change the session_id, i.e., switch to the memory of a different conversation, the memory of this one disappears:
config02 = {"configurable": {"session_id": "test002"}}
response = with_message_history.invoke(
[HumanMessage(content="What's my name?")],
config=config02,
)
response.content
"I'm sorry, but I do not have access to personal information about individuals unless it has been shared with me in the course of our conversation. I am designed to respect user privacy and confidentiality."
As soon as we switch back, the memory is available again:
config = {"configurable": {"session_id": "test001"}}
response = with_message_history.invoke(
[HumanMessage(content="What's my name?")],
config=config,
)
response.content
'Your name is Pierce.'
It follows that persisting this memory to disk is also perfectly feasible (a sketch follows the dump below). Let's look at what store now holds:
store
{'test001': InMemoryChatMessageHistory(messages=[HumanMessage(content="Hi! I'm Pierce"), AIMessage(content='Hello Pierce! How can I assist you today?', response_metadata={'token_usage': {'completion_tokens': 10, 'prompt_tokens': 12, 'total_tokens': 22}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-e67007cd-0748-420b-8b16-44641b397ed8-0'), HumanMessage(content="What's my name?"), AIMessage(content='Your name is Pierce.', response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 35, 'total_tokens': 40}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-bf0dba2c-930d-4166-b22b-5bd068fb8998-0'), HumanMessage(content="What's my name?"), AIMessage(content='Your name is Pierce.', response_metadata={'token_usage': {'completion_tokens': 5, 'prompt_tokens': 53, 'total_tokens': 58}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-59eb38b9-d329-44eb-9a4e-232e7ff8b997-0')]),
 'test002': InMemoryChatMessageHistory(messages=[HumanMessage(content="What's my name?"), AIMessage(content="I'm sorry, but I do not have access to personal information about individuals unless it has been shared with me in the course of our conversation. I am designed to respect user privacy and confidentiality.", response_metadata={'token_usage': {'completion_tokens': 39, 'prompt_tokens': 12, 'total_tokens': 51}, 'model_name': 'gpt-3.5-turbo', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-be939cc3-29f3-4d9a-bdb6-21bed23cbd4d-0')])}
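Since the whole history lives in this dict, persisting it is largely a matter of swapping the storage backend. One possible approach, assuming langchain_community is installed, uses its FileChatMessageHistory (the file-naming scheme below is our own choice):

from langchain_community.chat_message_histories import FileChatMessageHistory

def get_session_history_on_disk(session_id: str) -> FileChatMessageHistory:
    # each session's messages are serialized to a JSON file on disk
    return FileChatMessageHistory(f"history_{session_id}.json")

with_persistent_history = RunnableWithMessageHistory(llm, get_session_history_on_disk)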
Notice that the prompt_tokens shown in each returned AIMessage keeps increasing, which confirms that the conversation history is being passed to the model as part of the prompt.
This raises another problem: if the full conversation history is kept indefinitely, its unchecked growth will soon consume the model's entire token budget. We therefore need ways to keep the history under control, such as truncating it by recency, truncating it by how often something is mentioned, or summarizing it.
4 Taking further control of the conversation history
Let's simulate such a scenario by capping the history at 65 tokens, so that it can no longer be kept in full:
from langchain_core.messages import SystemMessage, trim_messages, AIMessage
trimmer = trim_messages(
    max_tokens=65,        # keep at most 65 tokens of history
    strategy="last",      # keep the most recent messages
    token_counter=llm,    # let the model count tokens
    include_system=True,  # never drop the leading system message
    allow_partial=False,  # never keep only part of a message
    start_on="human",     # the retained history must start with a human message
)
messages = [
SystemMessage(content="you're a good assistant"),
HumanMessage(content="hi! I'm bob"),
AIMessage(content="hi!"),
HumanMessage(content="I like vanilla ice cream"),
AIMessage(content="nice"),
HumanMessage(content="whats 2 + 2"),
AIMessage(content="4"),
HumanMessage(content="thanks"),
AIMessage(content="no problem!"),
HumanMessage(content="having fun?"),
AIMessage(content="yes!"),
]
trimmer.invoke(messages)
[SystemMessage(content="you're a good assistant"),
 HumanMessage(content='whats 2 + 2'),
 AIMessage(content='4'),
 HumanMessage(content='thanks'),
 AIMessage(content='no problem!'),
 HumanMessage(content='having fun?'),
 AIMessage(content='yes!')]
This is the truncation-by-recency we mentioned earlier, configured with strategy "last"; in addition, the leading system message is never dropped, and the retained history always starts with a human message. If we now ask about something from before the 2 + 2 exchange and something from after it, the difference is clear:
from operator import itemgetter
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.runnables import RunnablePassthrough
prompt = ChatPromptTemplate.from_messages(
[
(
"system",
"You are a helpful assistant. Answer all questions to the best of your ability in {language}.",
),
MessagesPlaceholder(variable_name="messages"),
]
)
chain = (
RunnablePassthrough.assign(messages=itemgetter("messages") | trimmer)
| prompt
| llm
)
response = chain.invoke(
{
"messages": messages + [HumanMessage(content="what's my name?")],
"language": "English",
}
)
response.content
"I'm sorry, but I don't have access to personal information, so I don't know your name."
response = chain.invoke(
{
"messages": messages + [HumanMessage(content="what math problem did i ask")],
"language": "English",
}
)
response.content
'You asked what is 2 + 2.'
As you can see, the model can no longer see messages that were trimmed away, while those not yet trimmed remain visible.
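Incidentally, token_counter doesn't have to be a model: any callable over the message list works. For example, to cap the history by message count rather than by tokens (a sketch under that assumption):

# count each message as 1 "token": keep at most the last 6 messages
msg_trimmer = trim_messages(
    max_tokens=6,
    strategy="last",
    token_counter=len,    # len(messages) serves as the "token" count
    include_system=True,
    start_on="human",
)
msg_trimmer.invoke(messages)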
with_message_history = RunnableWithMessageHistory(
chain,
get_session_history,
input_messages_key="messages",
)
config = {"configurable": {"session_id": "newHistory"}}
response = with_message_history.invoke(
{
"messages": messages + [HumanMessage(content="whats my name?")],
"language": "English",
},
config=config,
)
response.content
"I'm sorry, but I don't have access to personal information."
Of course we already knew the model could no longer recover our name, but don't forget that this name-asking turn has now itself been added to the memory, pushing out the oldest messages about 2 + 2!
response = with_message_history.invoke(
{
"messages": [HumanMessage(content="what math problem did i ask?")],
"language": "English",
},
config=config,
)
response.content
"I'm sorry, but I don't have access to that information. Please let me know what math problem you would like assistance with and I'll do my best to help you."
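Trimming is only one of the strategies mentioned at the start of this section; summarization is another. A rough sketch of the idea, where summarize_history is a hypothetical helper of our own (not a LangChain API): older messages are compressed into a single summary message while the most recent ones are kept verbatim.

def summarize_history(history: InMemoryChatMessageHistory) -> None:
    # hypothetical helper: compress everything but the last 4 messages
    if len(history.messages) <= 4:
        return
    old, recent = history.messages[:-4], history.messages[-4:]
    summary = llm.invoke(
        old + [HumanMessage(content="Summarize our conversation so far in one short paragraph.")]
    )
    history.clear()
    history.add_messages(
        [SystemMessage(content=f"Conversation summary: {summary.content}")] + recent
    )

# usage (hypothetical): summarize_history(get_session_history("test001"))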
5 Streaming output
Finally, let's reproduce the streaming style of answers in the ChatGPT web UI:
config = {"configurable": {"session_id": "streaming"}}
for r in with_message_history.stream(
{
"messages": [HumanMessage(content="hi! I'm todd. tell me a joke")],
"language": "English",
},
config=config,
):
print(r.content, end="|")
|Hi| Todd|!| Sure|,| here|'s| a| joke| for| you|:
|Why| don|'t| skeletons| fight| each| other|?
|They| don|'t| have| the| guts|!||
All chains expose a .stream method, and chains that use message history are no exception; simply call it to get a streaming response.
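For async applications, Runnables also expose an astream counterpart; a minimal sketch reusing the chain and config above:

import asyncio

async def main():
    # async iteration over the streamed chunks
    async for chunk in with_message_history.astream(
        {
            "messages": [HumanMessage(content="tell me another one")],
            "language": "English",
        },
        config=config,
    ):
        print(chunk.content, end="|", flush=True)

asyncio.run(main())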