This second half of the tutorial picks up where the first half left off, diving into LangChain v1's advanced capabilities: streaming output, the middleware system, multimodal support, dynamic prompts, and state management, along with production deployment considerations. The goal is to let developers build AI agents that are monitorable, debuggable, extensible, and ready for production.
7. Streaming: Delivering a Real-Time Response Experience
LangChain provides three streaming modes, suited to different granularities of real-time feedback.
1. Agent progress stream (stream_mode="updates")
Pushes a state update each time a graph node completes (e.g., a model call or a tool execution):
```python
from langchain.agents import create_agent

def get_weather(city: str) -> str:
    """Get the weather for a city."""
    return f"It's always sunny in {city}!"

agent = create_agent(model="openai:gpt-5-nano", tools=[get_weather])

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "What is the weather in SF?"}]},
    stream_mode="updates"
):
    for step, data in chunk.items():
        print(f"[{step}] Latest message: {data['messages'][-1].content}")
```
Use case: a front-end UI that shows progress step by step: "Thinking" → "Calling the weather API" → "Generating the final answer".
2. LLM token stream (stream_mode="messages")
Emits the model's output token by token (including tool-call arguments):
```python
for token, metadata in agent.stream(
    {"messages": [{"role": "user", "content": "What's the weather like?"}]},
    stream_mode="messages"
):
    node = metadata["langgraph_node"]
    if node == "model":
        for block in token.content_blocks:
            if block["type"] == "text":
                print(block["text"], end="", flush=True)
```
Use case: a "typewriter" effect that improves the perceived wait time.
3. Custom stream (stream_mode="custom")
Send arbitrary progress updates from inside a tool with get_stream_writer():
```python
import time

from langchain.agents import create_agent
from langchain.tools import tool
from langgraph.config import get_stream_writer

@tool
def fetch_large_dataset(query: str) -> str:
    """Run a long query, streaming progress along the way."""
    writer = get_stream_writer()
    writer("Connecting to the database...")
    time.sleep(1)
    writer("Running the query...")
    time.sleep(2)
    return "Query result: [1000 records]"

# The agent must be created with this tool for the custom stream to fire.
agent = create_agent(model="openai:gpt-5-nano", tools=[fetch_large_dataset])

for chunk in agent.stream(
    {"messages": [{"role": "user", "content": "Query the sales data"}]},
    stream_mode="custom"
):
    print(f"[tool progress] {chunk}")
```
✅ Note:
get_stream_writer() is only valid inside a LangGraph execution context.
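For async servers (e.g., behind FastAPI), the same stream modes are available through the agent's async counterpart, astream. A minimal sketch, reusing the agent defined above:

```python
import asyncio

async def main():
    # astream is the async twin of stream; the stream modes behave identically.
    async for chunk in agent.astream(
        {"messages": [{"role": "user", "content": "Query the sales data"}]},
        stream_mode="custom",
    ):
        print(f"[tool progress] {chunk}")

asyncio.run(main())
```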
4. Mixed streaming (multiple modes in parallel)
```python
for stream_mode, chunk in agent.stream(
    {"messages": [{"role": "user", "content": "Check the weather"}]},
    stream_mode=["updates", "custom"]
):
    if stream_mode == "updates":
        print(f"✅ Step finished: {list(chunk.keys())[0]}")
    elif stream_mode == "custom":
        print(f"🔄 Tool progress: {chunk}")
```
5. Code example (using the SiliconFlow API)
```python
from langchain.agents import create_agent
from langchain.chat_models import init_chat_model

if __name__ == "__main__":
    open_model = init_chat_model(
        model="THUDM/GLM-Z1-9B-0414",
        model_provider="openai",
        api_key="sk-****************************",
        base_url="https://api.siliconflow.cn/v1/",
    )
    agent = create_agent(
        model=open_model,
        system_prompt="You are an AI assistant that helps users complete tasks and handle problems.",
    )
    for token, metadata in agent.stream(
        {"messages": [{"role": "user", "content": "How do I create a txt file?"}]},
        stream_mode="messages"
    ):
        node = metadata["langgraph_node"]
        if node == "model":
            for block in token.content_blocks:
                if block["type"] == "text":
                    print(block["text"], end="", flush=True)
```
8. Middleware: Fine-Grained Control over Agent Execution
Middleware is LangChain v1's core extension mechanism: it inserts hooks at key nodes of the agent's execution graph.
1. Built-in middleware (a combined configuration sketch follows the table)
| Middleware | Purpose | Key configuration |
|---|---|---|
| SummarizationMiddleware | Auto-summarize long conversations | max_tokens_before_summary |
| HumanInTheLoopMiddleware | Human approval for high-risk operations | interrupt_on |
| ModelCallLimitMiddleware | Prevent infinite loops | run_limit, thread_limit |
| ToolCallLimitMiddleware | Cap tool-call frequency | supports per-tool limits |
| ToolRetryMiddleware | Auto-retry failed tools | exponential backoff + jitter |
| PIIMiddleware | Detect and redact sensitive data | strategy=redact/mask/block |
| LLMToolEmulator | Emulate tools during testing | no real API calls |
| ContextEditingMiddleware | Clean up failed tool-call records | ClearToolUsesEdit |
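Built-in middleware composes by listing instances in the middleware argument. Here is a minimal sketch combining conversation summarization with a model-call cap; treat the exact constructor arguments as assumptions to check against your installed langchain version:

```python
from langchain.agents import create_agent
from langchain.agents.middleware import ModelCallLimitMiddleware, SummarizationMiddleware

agent = create_agent(
    model="openai:gpt-4o",
    middleware=[
        # Summarize older turns once the history grows past the token budget.
        SummarizationMiddleware(
            model="openai:gpt-4o-mini",
            max_tokens_before_summary=4000,
        ),
        # Hard stop: at most 10 model calls per run, 50 per thread.
        ModelCallLimitMiddleware(run_limit=10, thread_limit=50),
    ],
)
```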
Example: human-in-the-loop
```python
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command

agent = create_agent(
    model="openai:gpt-4o",
    tools=[send_email, delete_user],
    checkpointer=InMemorySaver(),
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                "send_email": {"allowed_decisions": ["approve", "edit", "reject"]},
                "delete_user": True,   # always requires approval
                "read_data": False     # auto-approved
            }
        )
    ]
)

config = {"configurable": {"thread_id": "1"}}

# After this call the agent pauses and returns an interrupt state.
result = agent.invoke({"messages": [{"role": "user", "content": "Delete user ID 123"}]}, config)
if result.get("__interrupt__"):
    print("Waiting for human approval...")
    # Resume once the human decides (the decision payload shape may vary by version).
    result = agent.invoke(Command(resume={"decisions": [{"type": "approve"}]}), config)
```
⚠️ Requirement:
HumanInTheLoopMiddleware must be used together with a checkpointer.
2. Custom middleware (two approaches)
(1) Decorator approach (simple, single hook)
```python
from langchain.agents.middleware import after_model, before_model
from langchain.messages import AIMessage

@before_model
def log_token_usage(state, runtime):
    print(f"Current message count: {len(state['messages'])}")

@after_model
def block_sensitive_content(state, runtime):
    if "password" in state["messages"][-1].content:
        return {"messages": [AIMessage("This request involves sensitive content and was rejected.")], "jump_to": "end"}

agent = create_agent(model=..., middleware=[log_token_usage, block_sensitive_content])
```
(2) Class approach (complex, multiple hooks; attaching it to an agent is shown after the block)
```python
import time

from langchain.agents.middleware import AgentMiddleware
from langchain.messages import AIMessage

class RateLimitMiddleware(AgentMiddleware):
    def __init__(self, max_calls_per_min=10):
        self.max_calls = max_calls_per_min
        self.call_timestamps = []

    def before_model(self, state, runtime):
        # Sliding window: keep only timestamps from the last 60 seconds.
        now = time.time()
        self.call_timestamps = [t for t in self.call_timestamps if now - t < 60]
        if len(self.call_timestamps) >= self.max_calls:
            return {"messages": [AIMessage("Too many requests; please try again later.")], "jump_to": "end"}
        self.call_timestamps.append(now)
        return None
```
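Attaching the class-based middleware works the same way as the decorator version, except you pass an instance rather than the function itself. A minimal sketch (the model string is a placeholder):

```python
agent = create_agent(
    model="openai:gpt-4o",
    # Class-based middleware is instantiated; decorator middleware is passed as-is.
    middleware=[RateLimitMiddleware(max_calls_per_min=5)],
)
```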
3. Code example (using the SiliconFlow API)
```python
from langchain.agents import create_agent
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langchain.chat_models import init_chat_model
from langchain.tools import tool
from langgraph.checkpoint.memory import InMemorySaver
from langgraph.types import Command

@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Sends an email."""
    return f"Email sent to {to} with subject '{subject}'."

@tool
def delete_user(user_id: str) -> str:
    """Deletes a user by user ID."""
    return f"User with ID {user_id} has been deleted."

if __name__ == "__main__":
    open_model = init_chat_model(
        model="THUDM/GLM-Z1-9B-0414",
        model_provider="openai",
        api_key="sk-******************************",
        base_url="https://api.siliconflow.cn/v1/",
    )
    agent = create_agent(
        model=open_model,
        tools=[send_email, delete_user],
        checkpointer=InMemorySaver(),
        middleware=[
            HumanInTheLoopMiddleware(
                interrupt_on={
                    "send_email": {"allowed_decisions": ["approve", "edit", "reject"]},
                    "delete_user": True,   # always requires approval
                    "read_data": False     # auto-approved
                }
            )
        ]
    )
    config = {"configurable": {"thread_id": "1"}}

    # After this call the agent pauses and returns an interrupt state.
    result = agent.invoke({"messages": [{"role": "user", "content": "Delete user ID 123"}]}, config)
    print(result)
    if result.get("__interrupt__"):
        print("Waiting for human approval...")
        # Resume once the human decides (the decision payload shape may vary by version).
        result = agent.invoke(Command(resume={"decisions": [{"type": "approve"}]}), config)
        print(result)
```
A second example wires in the decorator middleware defined in section 2:

```python
from langchain.agents import create_agent
from langchain.agents.middleware import after_model, before_model
from langchain.chat_models import init_chat_model
from langchain.messages import AIMessage
from langchain.tools import tool
from langgraph.checkpoint.memory import InMemorySaver

@tool
def send_email(to: str, subject: str, body: str) -> str:
    """Sends an email."""
    return f"Email sent to {to} with subject '{subject}'."

@tool
def delete_user(user_id: str) -> str:
    """Deletes a user by user ID."""
    return f"User with ID {user_id} has been deleted."

@before_model
def log_token_usage(state, runtime):
    print(f"Current message count: {len(state['messages'])}")

@after_model
def block_sensitive_content(state, runtime):
    if "password" in state["messages"][-1].content:
        return {"messages": [AIMessage("This request involves sensitive content and was rejected.")], "jump_to": "end"}

if __name__ == "__main__":
    open_model = init_chat_model(
        model="THUDM/GLM-Z1-9B-0414",
        model_provider="openai",
        api_key="sk-*******************************",
        base_url="https://api.siliconflow.cn/v1/",
    )
    agent = create_agent(
        model=open_model,
        tools=[send_email, delete_user],
        checkpointer=InMemorySaver(),
        middleware=[log_token_usage, block_sensitive_content]
    )
    # The after_model hook intercepts the response because it mentions a password.
    result = agent.invoke({"messages": [{"role": "user", "content": "Reset the password for user Zhang San"}]}, {"configurable": {"thread_id": "1"}})
    print(result)
```
9. Multimodal and Structured Content Handling
LangChain v1 unifies the representation of multimodal input via content_blocks.
1. Multimodal input (text + image + PDF)
```python
from langchain.messages import HumanMessage

# Image + text
msg = HumanMessage(content=[
    {"type": "text", "text": "Describe this image"},
    {"type": "image", "url": "https://example.com/chart.png"}
])

# PDF document
msg = HumanMessage(content=[
    {"type": "text", "text": "Summarize this financial report"},
    {"type": "file", "url": "https://example.com/report.pdf", "mime_type": "application/pdf"}
])

response = model.invoke([msg])
```
2. Standardized content blocks (content_blocks)
Whether you use OpenAI or Anthropic, content_blocks returns a unified format:
```python
ai_msg = model.invoke("Analyze the following data...")
for block in ai_msg.content_blocks:
    if block["type"] == "reasoning":
        print("🧠 Reasoning step:", block["reasoning"])
    elif block["type"] == "text":
        print("💬 Answer:", block["text"])
```
3. Code example (using the SiliconFlow API)
```python
from langchain.chat_models import init_chat_model

if __name__ == "__main__":
    open_model = init_chat_model(
        model="THUDM/GLM-4.1V-9B-Thinking",
        model_provider="openai",
        api_key="sk-*******************************",
        base_url="https://api.siliconflow.cn/v1/",
    )
    message = {
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe the content of this image."},
            {"type": "image", "url": "https://tse3-mm.cn.bing.net/th/id/OIP-C.RQjcQ93pP-aq4QlG2oKMvgHaFm?w=89&h=89&c=1&rs=1&qlt=70&r=0&o=7&cb=ucfimg2&dpr=2&pid=InlineBlock&rm=3&ucfimg=1"},
        ]
    }
    response = open_model.invoke([message])
    print(response)
```
✅ Advantage: you avoid being locked into each LLM's native format and gain cross-model compatibility.
10. Dynamic Prompts
1. Dynamic system prompt (based on user role)
```python
from typing import TypedDict

from langchain.agents import create_agent
from langchain.agents.middleware import ModelRequest, dynamic_prompt

class Context(TypedDict):
    user_role: str

@dynamic_prompt
def user_role_prompt(request: ModelRequest) -> str:
    """Generate the system prompt based on the user's role."""
    user_role = request.runtime.context.get("user_role", "user")
    base_prompt = "You are a helpful assistant."
    if user_role == "expert":
        return f"{base_prompt} Provide detailed technical responses."
    elif user_role == "beginner":
        return f"{base_prompt} Explain concepts simply and avoid jargon."
    return base_prompt

agent = create_agent(
    model="openai:gpt-4o",
    tools=[web_search],
    middleware=[user_role_prompt],
    context_schema=Context
)

# The system prompt is set dynamically from the context.
result = agent.invoke(
    {"messages": [{"role": "user", "content": "Explain machine learning"}]},
    context={"user_role": "expert"}
)
```
Code example (using the SiliconFlow API)
```python
from typing import TypedDict

from langchain.agents import create_agent
from langchain.agents.middleware import ModelRequest, dynamic_prompt
from langchain.chat_models import init_chat_model

class Context(TypedDict):
    user_role: str

@dynamic_prompt
def user_role_prompt(request: ModelRequest) -> str:
    """Generate the system prompt based on the user's role."""
    user_role = request.runtime.context.get("user_role", "user")
    base_prompt = "You are a helpful assistant."
    if user_role == "expert":
        return f"{base_prompt} Provide detailed technical responses."
    elif user_role == "beginner":
        return f"{base_prompt} Explain concepts simply and avoid jargon."
    return base_prompt

if __name__ == "__main__":
    open_model = init_chat_model(
        model="THUDM/GLM-Z1-9B-0414",
        model_provider="openai",
        api_key="sk-*******************************",
        base_url="https://api.siliconflow.cn/v1/",
    )
    agent = create_agent(
        model=open_model,
        middleware=[user_role_prompt],
        context_schema=Context
    )
    # The system prompt is set dynamically from the context.
    result = agent.invoke(
        {"messages": [{"role": "user", "content": "Explain machine learning"}]},
        context={"user_role": "expert"}
    )
    print(result)
```
✅ Best practice: when trimming history, don't blindly drop the earliest messages; keep the ones that carry key identity information (a sketch follows).
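A minimal sketch of that practice as a before_model hook: it keeps the first message (which typically carries the system/identity context) plus the most recent turns. The window size is a hypothetical value, and the state-update shape (RemoveMessage with REMOVE_ALL_MESSAGES to rewrite the message list) should be verified against your installed version:

```python
from langchain.agents.middleware import before_model
from langchain.messages import RemoveMessage
from langgraph.graph.message import REMOVE_ALL_MESSAGES

KEEP_RECENT = 20  # hypothetical window size; tune to your model's context limit

@before_model
def trim_history(state, runtime):
    messages = state["messages"]
    if len(messages) <= KEEP_RECENT + 1:
        return None  # history is short, nothing to trim
    # Keep the first message (identity/system context) plus the latest turns;
    # RemoveMessage(id=REMOVE_ALL_MESSAGES) clears the old list before rewriting it.
    return {
        "messages": [
            RemoveMessage(id=REMOVE_ALL_MESSAGES),
            messages[0],
            *messages[-KEEP_RECENT:],
        ]
    }
```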
11. Summary: The LangChain v1 Agent Architecture at a Glance
A LangChain v1 agent is a stateful, interruptible, monitorable, and extensible execution graph:
```
[User input]
    ↓
[System prompt + context] → (generated dynamically)
    ↓
[Before-model middleware] → (trim / inject / validate)
    ↓
[LLM call] → (streaming and structured output supported)
    ↓
[After-model middleware] → (filter / redact / summarize)
    ↓
[Tool calls] → (retry / rate-limit / emulate / human approval)
    ↓
[State update] → (persisted to the checkpointer)
    ↓
[Loop or finish]
```
By composing model + tools + middleware + checkpointer + streaming interfaces, you can build AI agents that meet enterprise-grade requirements.