Agentic的模式和实现代码-优快云博客

本文链接：https://blog.youkuaiyun.com/Python_cocola/article/details/148592836

1 Prompt Chaining

这个是一个workflow的Agent 模式，一个 LLM 调用的输出依次进入下一个 LLM 调用的输入。这种模式将任务分解为一系列固定的步骤。每一步都由一个 LLM 调用处理前一步LLM处理的输出。这种模式适用于可清晰分解为可预测的顺序子任务的任务。

可以应用的方面：

生成结构化文档： LLM 1 创建大纲，LLM 2 根据标准验证大纲，LLM 3 根据验证后的大纲编写内容。
多步骤数据处理：提取信息、转换信息，然后汇总信息。
根据策划的输入生成新闻简报。

如下是一个基于Ollama、langchain、qwen3的代码实现例子。

from langchain_ollama import OllamaLLM
from langchain.prompts import PromptTemplate

# 初始化 Ollama LLM
llm = OllamaLLM(model="qwen3:8b")  # 你可以换成其他已拉取的模型名

# --- 步骤1：摘要原文 ---
original_text = "Large language models are powerful AI systems trained on vast amounts of text data. They can generate human-like text, translate languages, write different kinds of creative content, and answer your questions in an informative way."
prompt1 = PromptTemplate(
    input_variables=["text"],
    template="Summarize the following text in one sentence: {text}"
)
summary = (prompt1 | llm).invoke({"text": original_text}).strip()
print(f"摘要: {summary}")

# --- 步骤2：翻译摘要 ---
prompt2 = PromptTemplate(
    input_variables=["summary"],
    template="Translate the following summary into French, only return the translation, no other text: {summary}"
)
translation = (prompt2 | llm).invoke({"summary": summary}).strip()
print(f"法语翻译: {translation}")

2 Routing

这个是一个workflow的Agent 模式，有一个 LLM 充当路由器，对用户输入进行分类，并将其导向最合适的专门任务或 LLM。这种模式实现了关注点的分离，可以单独优化各个下游任务（使用专门的提示、不同的模型或特定的工具）。它通过对较简单的任务使用较小的模型来提高效率，并有可能降低成本。当任务被路由时，选定的代理将 “接管 ”完成任务的责任。

使用案例：

客户支持系统：将查询路由到专门负责计费、技术支持或产品信息的代理。
分层使用 LLM：将简单的查询转给速度更快、成本更低的模型（如 Llama 3.1 8B），将复杂或不寻常的问题转给能力更强的模型（如 Gemini 1.5 Pro）。
内容生成：将博客文章、社交媒体更新或广告文案请求路由到不同的专业提示/模型。

如下是一个基于Ollama、langchain、qwen3的代码实现例子。

import os
# import json\\\\
from super_json import SuperJSON
from pydantic import BaseModel
import enum
from langchain_ollama import OllamaLLM
from langchain.prompts import PromptTemplate

llm=OllamaLLM(model="qwen3:8b")
llm_science = OllamaLLM(model="qwen3:4b")
llm_weather=OllamaLLM(model="qwen3:4b")

 
# Define Routing Schema
class Category(enum.Enum):
    WEATHER = "weather"
    SCIENCE = "science"
    UNKNOWN = "unknown"
 
class RoutingDecision(BaseModel):
    category: Category
    reasoning: str
 
# Step 1: Route the Query


prompt_router = PromptTemplate(
    input_variables=["query"],
    template=(
        "/no_think Analyze the user query below and determine its category.\n"
        "Categories:\n"
        "- weather: For questions about weather conditions.\n"
        "- science: For questions about science.\n"
        "- unknown: If the category is unclear.\n\n"
        "Query: {query}\n"
        "Please respond in the following JSON format:\n"
        "{{\"category\": \"weather|science|unknown\", \"reasoning\": \"...\"}}"
    )
)

# user_query = "What's the weather like in Paris?"
user_query = "Explain quantum physics simply."
# user_query = "What is the capital of France?"

response_txt = (prompt_router|llm).invoke({"query":user_query})

response_json=SuperJSON.loads(response_txt)
routing_decision = RoutingDecision(**response_json)

print(routing_decision)
 
# # Step 2: Handoff based on Routing
final_response=""
if routing_decision.category == Category.WEATHER:

    weather_prompt = PromptTemplate(
        input_variables=["query"],
        template=(
            "/no_think Provide a brief weather forecast for the location mentioned in: '{query}'"
        )
    )
    weather_txt=(weather_prompt|llm_weather).invoke({"query":user_query})
    # weather_json = SuperJSON.loads(weather_txt)
    print(weather_txt)
elif routing_decision.category == Category.SCIENCE:

    science_prompt=PromptTemplate(
        input_variables=["query"],
        template=(
            "/no_think\n"
            "Provide the response is real and credible.requestion is {query}"
        )
    )
    science_txt = (science_prompt|llm_science).invoke({"query":user_query})
    print(science_txt)
else:
    unknow_promtp=PromptTemplate(
        input_variables=["query","reasoning"],
        template=(
            "/no_think"
            "The user query is: {query}, but could not be answered. \n"
            "Here is the reasoning: {reasoning}. \n"
            "Write a helpful response to the user for him to try again."
        )
    )
    unknow_txt = (unknow_promtp|llm).invoke({"query":user_query,"reasoning":routing_decision.reasoning})

如上代码中使用到了一个新superjson，代码如下。

import json
import re
from typing import Any, List, Union

class SuperJSON(json.JSONDecoder):
    """
    超级JSON处理类，继承自标准库json模块，扩展loads方法：
    1. 能自动提取并解析字符串中所有合法的json子串。
    2. 能自动补全缺失的右侧大括号。
    3. 其余方法与标准json模块一致。
    """
    @classmethod
    def loads(cls, s: str, *args, **kwargs) -> Union[Any, List[Any]]:
        """
        扩展的loads方法：
        - 自动提取并解析字符串中的所有json子串。
        - 自动补全缺失的右侧大括号。
        - 如果只找到一个json对象，直接返回；多个则返回列表。
        """
        json_objs = []
        # 用栈算法提取最大外层JSON对象
        start = s.find('{')
        while start != -1:
            stack = []
            for i in range(start, len(s)):
                if s[i] == '{':
                    stack.append('{')
                elif s[i] == '}':
                    if stack:
                        stack.pop()
                    if not stack:
                        # 找到配对的最大JSON对象
                        json_str = s[start:i+1]
                        try:
                            obj = json.loads(json_str, *args, **kwargs)
                            json_objs.append(obj)
                        except json.JSONDecodeError:
                            # 尝试补全右侧大括号
                            fixed = cls._fix_braces(json_str)
                            try:
                                obj = json.loads(fixed, *args, **kwargs)
                                json_objs.append(obj)
                            except Exception:
                                pass
                        # 继续查找下一个
                        start = s.find('{', i+1)
                        break
            else:
                # 没有找到配对的右括号
                break
        if not json_objs:
            # 如果没找到json子串，尝试整体修复后解析
            try:
                fixed = cls._fix_braces(s)
                return json.loads(fixed, *args, **kwargs)
            except Exception:
                raise json.JSONDecodeError("无法解析为JSON", s, 0)
        if len(json_objs) == 1:
            return json_objs[0]
        return json_objs

    @staticmethod
    def _fix_braces(s: str) -> str:
        """
        检查并补全右侧大括号
        """
        left = s.count('{')
        right = s.count('}')
        if left > right:
            s = s + ('}' * (left - right))
        return s

    # 其余方法直接继承json模块
    @classmethod
    def dumps(cls, obj, *args, **kwargs):
        return json.dumps(obj, *args, **kwargs)

    @classmethod
    def dump(cls, obj, fp, *args, **kwargs):
        return json.dump(obj, fp, *args, **kwargs)

    @classmethod
    def load(cls, fp, *args, **kwargs):
        return json.load(fp, *args, **kwargs)

3 Parallelization

这个是一个workflow的Agent 模式，一个任务被分解成多个独立的子任务，由多个 LLM 同时处理，并将其输出汇总。这种模式使用了任务并发功能。初始查询（或其部分内容）与单个提示/目标并行发送给多个 LLM。所有分支完成后，它们的单独结果会被收集起来并传递给最后的聚合 LLM，后者会将它们合成为最终响应。如果子任务之间不相互依赖，这就能改善延迟，或通过多数表决或生成不同选项等技术提高质量。

使用案例：

查询分解 RAG：将复杂查询分解为子查询，并行运行每个子查询的检索，然后合成结果。
分析大型文档：将文档分成若干部分，并行汇总每个部分，然后合并汇总结果。
生成多种观点：用不同的角色提示向多个 LLM 提出相同的问题，然后汇总他们的回答。
对数据进行 Map-reduce 式操作。

如下是一个基于Ollama、langchain、qwen3的代码实现例子。

import os
import asyncio
import time
# from google import genai

from langchain_ollama import OllamaLLM
from langchain.prompts import PromptTemplate
 

def get_ollama_response(prompt: str, model: str = "qwen3:8b") -> str:
    """
    同步调用 Ollama 本地模型生成内容。
    """
    llm = OllamaLLM(model=model)
    prompt_template = PromptTemplate(
        input_variables=["prompt"],
        template="{prompt}"
    )
    response = (prompt_template | llm).invoke({"prompt": prompt})
    return response.strip()

async def generate_content(prompt: str, model: str = "qwen3:8b") -> str:
    """
    用 asyncio.to_thread 将同步推理包装为异步。
    """
    return await asyncio.to_thread(get_ollama_response, prompt, model)

async def parallel_tasks(model: str = "qwen3:8b") -> str:
    """
    并发执行多个推理任务，并聚合结果。
    """
    llm = OllamaLLM(model=model)
    topic = "/no_think a friendly robot exploring a jungle"

    prompts = [
        f"Write a short, adventurous story idea about {topic}.",
        f"Write a short, funny story idea about {topic}.",
        f"Write a short, mysterious story idea about {topic}."
    ]
    # 并发执行所有推理任务
    start_time = time.time()
    tasks = [generate_content(prompt, model) for prompt in prompts]
    results = await asyncio.gather(*tasks)
    end_time = time.time()
    print(f"Time taken: {end_time - start_time} seconds")

    print("\n--- Individual Results ---")
    for i, result in enumerate(results):
        print(f"Result {i+1}: {result}\n")

    # 聚合结果
    story_ideas = '\n'.join([f"Idea {i+1}: {result}" for i, result in enumerate(results)])
    aggregation_prompt = PromptTemplate(
        input_variables=['story_ideas'],
        template="Combine the following three story ideas into a single, cohesive summary paragraph:{story_ideas}"
    )
    aggregation_response = (aggregation_prompt | llm).invoke({"story_ideas": story_ideas})
    return aggregation_response

if __name__ == "__main__":
    result = asyncio.run(parallel_tasks())
    print(f"\n--- Aggregated Summary ---\n{result}")

4 Reflection

这是一个agent的模式，agent会对自己的输出进行评估，并利用反馈不断改进自己的响应。这种模式也被称为 “Evaluator-Optimizer”，并使用自我修正循环。初始 LLM 生成一个响应或完成一项任务。然后，第二个 LLM 步骤（甚至是具有不同提示的同一 LLM）充当反思者或评估者，根据要求或期望质量对初始输出进行批判。这种批评（反馈）会被反馈回去，促使 LLM 产生改进后的输出。如此循环往复，直到评估者确认要求得到满足或实现了令人满意的输出。

使用案例：

代码生成：编写代码、执行代码、使用错误信息或测试结果作为反馈来修复错误。
编写和完善：生成草稿，反思其清晰度和语气，然后进行修改。
解决复杂问题：制定计划，评估其可行性，并根据评估结果加以完善。
信息检索：搜索信息，并在提交答案前使用评估工具 LLM 检查是否找到了所需的所有细节。

如下是一个基于Ollama、langchain、qwen3的代码实现例子。

import os
import json
from super_json import SuperJSON
from pydantic import BaseModel
import enum
from langchain_ollama import OllamaLLM
from langchain.prompts import PromptTemplate
 
# Configure the client (ensure GEMINI_API_KEY is set in your environment)
# client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
 
class EvaluationStatus(enum.Enum):
    PASS = "PASS"
    FAIL = "FAIL"
 
class Evaluation(BaseModel):
    evaluation: EvaluationStatus
    feedback: str
    reasoning: str
 
# --- Initial Generation Function ---
def generate_poem(topic, feedback=None, model="qwen3:4b"):
    # 构造 prompt
    prompt = f"/no_think Write a short, four-line poem about {topic}."
    if feedback:
        prompt += f"\nIncorporate this feedback: {feedback}"
    # 用 LangChain + Ollama 生成诗歌
    llm = OllamaLLM(model=model)
    prompt_template = PromptTemplate(
        input_variables=["prompt"],
        template="{prompt}"
    )
    poem = (prompt_template | llm).invoke({"prompt": prompt}).strip()
    print(f"Generated Poem:\n{poem}")
    return poem
 
# --- Evaluation Function ---
def evaluate(poem: str) -> Evaluation:
    print("\n--- Evaluating Poem ---")
    prompt_critique=PromptTemplate(
        input_variables=["poem"],
        template=(
            "/no_think\n"
            "Critique the following poem. Does it rhyme well? Is it exactly four lines? "
            "Is it creative? Respond with PASS or FAIL and provide feedback."
            "\n"
            "Poem:\n"
            "{poem}"
            "Please respond in the following JSON format:\n"
            "{{\"evaluation\": \"PASS|FAILE\", \"feedback\": \"...\",\"reasoning\": \"...\"}}"
        )
    )

    llm=OllamaLLM(model="qwen3:4b")
    response_critique_txt=(prompt_critique|llm).invoke({"poem":poem})


    response_critique_json=SuperJSON.loads(response_critique_txt)
    critique = Evaluation(**response_critique_json)
    print(f"Evaluation Status: {critique.evaluation}")
    print(f"Evaluation Feedback: {critique.feedback}")
    return critique
 
# Reflection Loop   
max_iterations = 3
current_iteration = 0
topic = "a robot learning to paint"
 
# simulated poem which will not pass the evaluation
current_poem = "With circuits humming, cold and bright,\nA metal hand now holds a brush"
 
while current_iteration < max_iterations:
    current_iteration += 1
    print(f"\n--- Iteration {current_iteration} ---")
    evaluation_result = evaluate(current_poem)
 
    if evaluation_result.evaluation == EvaluationStatus.PASS:
        print("\nFinal Poem:")
        print(current_poem)
        break
    else:
        current_poem = generate_poem(topic, feedback=evaluation_result.feedback)
        if current_iteration == max_iterations:
            print("\nMax iterations reached. Last attempt:")
            print(current_poem)

5 Planning Pattern 也叫Orchestrator-Workers

这个是一个Agent 模式，也叫Orchestrator-Workers。负责规划的 LLM 会将复杂的任务分解成一个动态的子任务列表，然后委托给专门的工作agents（通常使用工具使用）来执行。这种模式试图通过创建初始计划来解决需要多步骤推理的复杂问题。该计划根据用户输入动态生成。然后将子任务分配给 “工人 ”agents，由他们来执行，如果依赖关系允许，可以并行执行。一个 “协调器 ”或 “合成器 ”LLM 会收集来自 “工人 ”的结果，反思总体目标是否已经实现，然后合成最终输出，或在必要时启动重新规划步骤。这就减少了任何一次 LLM 调用的认知负荷，提高了推理质量，最大限度地减少了错误，并允许对工作流程进行动态调整。与路由的主要区别在于，规划器生成的是多步骤计划，而不是选择单一的下一步。

使用案例：

复杂的软件开发任务：将 “构建功能 ”分解为计划、编码、测试和文档子任务。
研究和报告生成：规划文献搜索、数据提取、分析和报告撰写等步骤。
多模式任务：涉及图像生成、文本分析和数据整合的规划步骤。
执行复杂的用户请求，如 “计划一次为期 3 天的巴黎之旅，在我的预算范围内预订机票和酒店”。
如下是一个基于Ollama、langchain、qwen3的代码实现例子。

import os
# from google import genai
from pydantic import BaseModel, Field
from typing import List
from langchain_ollama import OllamaLLM
from langchain.prompts import PromptTemplate
from super_json import SuperJSON

# Define the Plan Schema
class Task(BaseModel):
    task_id: int
    description: str
    assigned_to: str = Field(description="Which worker type should handle this? E.g., Researcher, Writer, Coder")

class Plan(BaseModel):
    goal: str
    steps: List[Task]

# Step 1: Generate the Plan (Planner LLM)
user_goal = "Write a short blog post about the benefits of AI agents."

prompt_planner = PromptTemplate(
    input_variables=["goal"],
    template=(
        "/no_think"
        "Create a step-by-step plan to achieve the following goal.\n"
        "Assign each step to a hypothetical worker type (Researcher, Writer).\n\n"
        "Goal: {goal}\n"
        "Please respond in the following JSON format:\n"
        "{{\"goal\": \"...\", \"steps\": [{{\"task_id\": 0, \"description\": \"...\", \"assigned_to\": \"...\"}}, ...]}}"
    )
)

print(f"Goal: {user_goal}")
print("Generating plan...")

llm = OllamaLLM(model="qwen3:8b")
response_plan_txt = (prompt_planner | llm).invoke({"goal": user_goal})
response_plan_json = SuperJSON.loads(response_plan_txt)

plan = Plan(**response_plan_json)

# Step 2: Execute the Plan (Orchestrator/Workers - Omitted for brevity)
for step in plan.steps:
    print(f"Step {step.task_id}: {step.description} (Assignee: {step.assigned_to})")

如何学习大模型 AI ？

由于新岗位的生产效率，要优于被取代岗位的生产效率，所以实际上整个社会的生产效率是提升的。

但是具体到个人，只能说是：

“最先掌握AI的人，将会比较晚掌握AI的人有竞争优势”。

这句话，放在计算机、互联网、移动互联网的开局时期，都是一样的道理。

我在一线互联网企业工作十余年里，指导过不少同行后辈。帮助很多人得到了学习和成长。

我意识到有很多经验和知识值得分享给大家，也可以通过我们的能力和经验解答大家在人工智能学习中的很多困惑，所以在工作繁忙的情况下还是坚持各种整理和分享。但苦于知识传播途径有限，很多互联网行业朋友无法获得正确的资料得到学习提升，故此将并将重要的AI大模型资料包括AI大模型入门学习思维导图、精品AI大模型学习书籍手册、视频教程、实战学习等录播视频免费分享出来。

在这里插入图片描述