Building a Self-Reflecting AI Agent with LangGraph

Reposted; published 2025-10-16 11:20:34

Licensed under CC 4.0 BY-SA

Original link: https://mp.weixin.qq.com/s?__biz=MjM5Njc0MjIwMA==&mid=2649838387&idx=1&sn=54fc580e010e538a6a95567414610882&chksm=bfd05bcc55bb59fa39c294b846c958aed62d58d21a4cf236475f2b053e1a00d113af18426c61&scene=126&sessionid=0

Article tags: #ArtificialIntelligence

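As large language models grow more capable, expectations across industries are shifting. People are no longer satisfied with a model that merely returns an answer; they want a well-grounded, verifiable reasoning process behind it.

In professional fields such as engineering, legal analysis, scientific research, and product development, "good enough" no longer cuts it. An LLM-based agent that can review, reconsider, and revise its own output the way a person would is no longer a nice-to-have; it has become a baseline requirement.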

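The problem is:

Most prompt-based outputs are one-shot.
They lack a verification loop.
Without a self-evaluation mechanism, they are prone to hallucinations, logical inconsistencies, or answers that do not fit the context.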

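To address these problems, we will use LangGraph to build a graph-structured agent that reflects on its output, improves it, and finally refines it through a directed reasoning process.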

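What Is a Reflection Agent?

A Reflection Agent is not a new kind of model; it is an architecture that puts the reasoning process into a structured loop. You can think of it as three conversations held one after another:

Plan first: think about how to approach the problem;
Then act: generate an answer;
Finally reflect: critically review the output and improve it where needed.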

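By design, the agent behaves like a finite-state loop built from three main steps:

Plan → Act → Reflect. When needed, a conditional Retry step is triggered as well.

Its biggest advantage is that generating an answer and evaluating it are split into two separate, consecutive steps, which actively reduces reasoning errors.

It is like writing a paper: you write a first draft, then hand it to an editor for revision.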

Environment Setup

We will use LangGraph, together with LangChain, to call OpenAI's GPT models.

First, install the packages:

pip install --upgrade langgraph langchain langchain-openai

Then set your API key, either through an environment variable:

export OPENAI_API_KEY="your-key"

or manually in your script:

import os

os.environ["OPENAI_API_KEY"] = "your-key"
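A missing key usually surfaces only when the first model call fails. A small guard near the top of the script can make that failure explicit. This is a sketch only: `require_api_key` is a hypothetical helper, not part of LangChain or the OpenAI SDK.

```python
import os


def require_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Return the key stored in the environment, or fail with a clear message."""
    key = os.environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the agent.")
    return key


# Typical use, before any model is created:
# require_api_key()
```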

Code Walkthrough: Building from Scratch

Create a new file named reflection_agent.py.

Below, we build the logic step by step.

Defining the State Class

First, define the agent's state structure in LangGraph terms:

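from typing import List, Optional, TypedDict


class AgentState(TypedDict):
    input_question: str        # the user's original question
    history: List              # chat messages accumulated so far
    reflection: Optional[str]  # latest critique from the Reflect node
    answer: Optional[str]      # latest answer from the Act or Retry node
    attempts: int              # number of retries performed so far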

This structure carries the agent's reasoning and decision data between the nodes of the graph.
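Every node in this article returns `{**state, ...}`: a copy of the incoming state with a few keys overridden. The sketch below shows that pattern in isolation. `initial_state` is a hypothetical helper added for illustration, and the state class is repeated so the block runs on its own.

```python
from typing import List, Optional, TypedDict


class AgentState(TypedDict):
    input_question: str
    history: List
    reflection: Optional[str]
    answer: Optional[str]
    attempts: int


def initial_state(question: str) -> AgentState:
    """Build a state with every field set to its empty starting value."""
    return AgentState(
        input_question=question,
        history=[],
        reflection=None,
        answer=None,
        attempts=0,
    )


state = initial_state("What is entropy?")
# Copy the state and override two keys; the original dict is left untouched
updated = {**state, "answer": "draft", "attempts": state["attempts"] + 1}
print(state["answer"], updated["answer"], updated["attempts"])
```

Because a `TypedDict` is an ordinary `dict` at runtime, the type annotations help editors and type checkers but are not enforced when the graph runs.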

Import the model interface:

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage, SystemMessage, AIMessage

Then initialize the model:

llm = ChatOpenAI(model="gpt-4", temperature=0.7)

The Plan Node

The agent drafts a short solution strategy for the input question:

def plan_step(state: AgentState) -> AgentState:
    question = state["input_question"]
    history = state.get("history", [])
    plan_prompt = (
        f"You're assigned to plan an explanation for:\n\n{question}\n\n"
        "Break your thoughts into steps or bullets."
    )
    messages = history + [HumanMessage(content=plan_prompt)]
    response = llm.invoke(messages)
    updated_history = history + [
        HumanMessage(content=plan_prompt),
        AIMessage(content=response.content),
    ]
    return {
        **state,
        "history": updated_history,
    }

At this stage the agent has not started answering the question yet; it is only working out how to answer it.

The Act Node

Following the plan from the previous step, the agent generates an answer:

def act_step(state: AgentState) -> AgentState:
    act_prompt = "Use your planning above to produce the best possible answer."
    messages = state["history"] + [HumanMessage(content=act_prompt)]
    response = llm.invoke(messages)
    updated_history = state["history"] + [
        HumanMessage(content=act_prompt),
        AIMessage(content=response.content),
    ]
    return {
        **state,
        "history": updated_history,
        "answer": response.content,
    }

This step makes no assumption about whether the plan is valid; it simply follows the plan and produces output.

The Reflect Node

This is the most critical step, where the reflection happens. Is the answer reliable? Is anything missing?

def reflect_step(state: AgentState) -> AgentState:
    question = state["input_question"]
    answer = state.get("answer", "")
    reflection_prompt = (
        f"Review this answer to the following question:\n"
        f"Q: {question}\nA: {answer}\n\n"
        f"Are there missing considerations, oversimplifications, false claims, or ambiguous logic? If so, describe them."
    )
    # The reviewer gets a fresh conversation: only the question and the answer,
    # not the planning history
    messages = [
        SystemMessage(content="You're reviewing for clarity, logic, and completeness."),
        HumanMessage(content=reflection_prompt),
    ]
    response = llm.invoke(messages)
    return {
        **state,
        "reflection": response.content,
    }

Reflection itself does not fix the answer; it only surfaces problems systematically.
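Because the Reflect node only reads the state and writes one key, it can be exercised offline with a stand-in model. The following is a sketch under stated assumptions: `FakeLLM` and `reflect_with` are hypothetical names, the model is passed in as an argument instead of being a module-level `llm`, and messages are plain `(role, text)` tuples rather than the article's `SystemMessage` and `HumanMessage` objects so that the block needs no third-party packages.

```python
from types import SimpleNamespace


class FakeLLM:
    """Hypothetical offline stand-in that exposes only invoke(messages)."""

    def __init__(self, replies):
        self._replies = list(replies)
        self.calls = []  # every message list passed to invoke()

    def invoke(self, messages):
        self.calls.append(messages)
        # Mimic a response object that carries its text in `.content`
        return SimpleNamespace(content=self._replies.pop(0))


def reflect_with(llm, state: dict) -> dict:
    """Same shape as reflect_step, with the model supplied by the caller."""
    review = (
        "Review this answer to the following question:\n"
        f"Q: {state['input_question']}\nA: {state.get('answer') or ''}\n\n"
        "Are there missing considerations, oversimplifications, false claims, "
        "or ambiguous logic? If so, describe them."
    )
    messages = [
        ("system", "You're reviewing for clarity, logic, and completeness."),
        ("human", review),
    ]
    response = llm.invoke(messages)
    return {**state, "reflection": response.content}


fake = FakeLLM(["The answer omits entropy."])
before = {"input_question": "What is heat?", "answer": "Energy in transit.", "attempts": 0}
after = reflect_with(fake, before)
print(after["reflection"])
```

The recorded `fake.calls` make it easy to check what the reviewer actually sees: the question and the answer, but none of the planning history. The answer itself is passed through unchanged, which matches the point that reflection finds problems without fixing them.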

Retry Trigger (Routing Logic)

Based on the reflection result, we decide whether to retry or finish.

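def should_react(state: AgentState) -> str:
    # `or ""` guards against a reflection that is still None
    reflection = (state.get("reflection") or "").lower()
    attempts = state.get("attempts", 0)
    # Cap the number of retries so the loop always terminates
    if attempts >= 2:
        return "end"
    # Stop early when the reviewer signals that the answer is acceptable
    if "no flaws" in reflection or "satisfactory" in reflection:
        return "end"
    return "retry"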
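Keyword matching on free text is fragile: a critique that says "not satisfactory" still contains "satisfactory" and would end the loop. One possible alternative, sketched here as an assumption and not as the article's method, is to ask the reviewer to start its reply with an explicit `VERDICT: PASS` or `VERDICT: REVISE` line and route on that. The names `parse_verdict`, `route`, and `MAX_RETRIES` are hypothetical, and the reflection prompt would have to be changed to request the verdict line.

```python
MAX_RETRIES = 2  # same retry cap as the article's router


def parse_verdict(reflection: str) -> str:
    """Read "pass" or "revise" from the first non-empty line; default to "revise"."""
    for line in reflection.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.upper().startswith("VERDICT:"):
            verdict = line.split(":", 1)[1].strip().lower()
            return "pass" if verdict == "pass" else "revise"
        break  # first non-empty line carries no verdict
    return "revise"


def route(state: dict) -> str:
    """Return "end" or "retry", mirroring the keys used by the conditional edge."""
    if state.get("attempts", 0) >= MAX_RETRIES:
        return "end"
    reflection = state.get("reflection") or ""
    return "end" if parse_verdict(reflection) == "pass" else "retry"


print(route({"reflection": "VERDICT: PASS\nLooks fine.", "attempts": 0}))
print(route({"reflection": "This is not satisfactory.", "attempts": 0}))
```

Defaulting to "revise" when no verdict is found errs on the side of another review pass, and the retry cap still guarantees termination.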

The Retry Node

When needed, the agent calls the model again with the reflection result and improves its answer:

def retry_step(state: AgentState) -> AgentState:
    reflection = state["reflection"]
    repair_prompt = f"You reflected that:\n{reflection}\n\nUpdate your answer accordingly."
    # Append the critique to the conversation so the model sees its earlier draft
    updated_history = state["history"] + [HumanMessage(content=repair_prompt)]
    response = llm.invoke(updated_history)
    return {
        **state,
        "history": updated_history + [AIMessage(content=response.content)],
        "answer": response.content,
        "attempts": state.get("attempts", 0) + 1,  # count this retry
    }

Assembling the LangGraph Workflow

from langgraph.graph import StateGraph, END

graph_builder = StateGraph(AgentState)

# Register the four nodes
graph_builder.add_node("plan", plan_step)
graph_builder.add_node("act", act_step)
graph_builder.add_node("reflect", reflect_step)
graph_builder.add_node("retry", retry_step)

# Linear path: plan -> act -> reflect
graph_builder.set_entry_point("plan")
graph_builder.add_edge("plan", "act")
graph_builder.add_edge("act", "reflect")

# After reflecting, either finish or go to the retry node
graph_builder.add_conditional_edges("reflect", should_react, {
    "end": END,
    "retry": "retry",
})

# Every revised answer is reviewed again
graph_builder.add_edge("retry", "reflect")

graph = graph_builder.compile()

Running the Agent

if __name__ == "__main__":
    entry_question = "Explain the second law of thermodynamics in layman's terms."
    state = {
        "input_question": entry_question,
        "history": [],
        "reflection": None,
        "answer": None,
        "attempts": 0,
    }
    result = graph.invoke(state)
    print("Final Answer:\n")
    print(result["answer"])
    print("\nReflection:\n")
    print(result["reflection"])

Sample Output (Annotated)

A typical result looks like this:

Final Answer:

The second law of thermodynamics says that energy naturally spreads out. For example, if you put a hot spoon in cool water, the heat spreads into the water, and the total disorder (entropy) increases.

Reflection:

The explanation is mostly solid and simplified correctly. Consider clarifying entropy’s role as a tendency toward statistical variance or chaos. Quality acceptable but improves with an analogy.

Notice that it provides: