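Structured output lets an agent return data in a specific, predictable format.
Instead of parsing natural-language responses, you get structured data (for example a JSON object, a Pydantic model, or a dataclass) that your application can use directly.
LangChain's create_agent handles structured output automatically.
You only specify the desired structured output schema. When the model generates structured data, it is captured, validated, and stored under the 'structured_response' key of the agent state.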
def create_agent(
    ...
    response_format: Union[
        ToolStrategy[StructuredResponseT],
        ProviderStrategy[StructuredResponseT],
        type[StructuredResponseT],
        None,
    ]
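Response Format
Use the response_format parameter to control how the agent returns structured data:
- ToolStrategy[StructuredResponseT]: produces structured output through tool calling
- ProviderStrategy[StructuredResponseT]: uses the structured output natively supported by the model provider
- type[StructuredResponseT]: pass the schema type directly; LangChain picks the best strategy based on the model's capabilities
- None: structured output is not explicitly requested
When you pass a schema type directly, LangChain chooses automatically:
- ProviderStrategy, if the selected model and provider support native structured output (for example OpenAI, Anthropic (Claude), or xAI (Grok));
- ToolStrategy for all other models.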
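If the model's profile does not declare this capability, you can state it manually with a custom profile:
from langchain.chat_models import init_chat_model

custom_profile = {
    "structured_output": True,
    # ...
}
model = init_chat_model("...", profile=custom_profile)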
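The selection rule described above can be sketched in a few lines. This is a hypothetical illustration, not LangChain's implementation: the function name `pick_strategy` and the plain-dict profile are assumptions, and only the `structured_output` key comes from the profile snippet above.

```python
from typing import Any, Mapping


def pick_strategy(profile: Mapping[str, Any]) -> str:
    """Return the name of the strategy a bare schema type would map to.

    Hypothetical helper: it mirrors the rule stated in the text, nothing more.
    """
    # Native structured output is preferred whenever the profile declares it.
    if profile.get("structured_output", False):
        return "ProviderStrategy"
    # Every other model falls back to tool calling.
    return "ToolStrategy"


print(pick_strategy({"structured_output": True}))  # ProviderStrategy
print(pick_strategy({}))                           # ToolStrategy
```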
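Note: if tools are also specified, the model must support using tool calling and structured output at the same time.
The structured response ends up under the structured_response key of the agent's final state.
Provider Strategy
Some model providers (such as OpenAI, xAI (Grok), Gemini, and Anthropic (Claude)) support structured output natively in their APIs. When available, this is the most reliable approach.
To use this strategy, configure a ProviderStrategy: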
class ProviderStrategy(Generic[SchemaT]):
    schema: type[SchemaT]
    strict: bool | None = None
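- schema (required)
  The schema that defines the structured output format. Supported forms:
  - Pydantic model: a class that inherits from BaseModel, with field validation. Returns a validated Pydantic instance.
  - Dataclass: a Python dataclass with type annotations. Returns a dict.
  - TypedDict: a typed dictionary class. Returns a dict.
  - JSON Schema: a dict that conforms to the JSON Schema specification. Returns a dict.
- strict
  Optional boolean that enables strict schema adherence. Some providers (such as OpenAI and xAI) support it. Defaults to None (disabled).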
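Because only the Pydantic form returns a typed instance, code that uses a dataclass schema receives a plain dict. The sketch below shows one way to turn such a dict back into the dataclass. It uses only the standard library; `to_dataclass` is a hypothetical helper, and the `structured_response` dict is a hand-written stand-in for `result["structured_response"]`, not real model output.

```python
from dataclasses import dataclass, fields


@dataclass
class ContactInfo:
    """Contact information for a person."""
    name: str
    email: str
    phone: str


def to_dataclass(cls, payload: dict):
    """Build a dataclass instance from a dict, ignoring unknown keys."""
    known = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in payload.items() if k in known})


# Stand-in for result["structured_response"] when the schema is a dataclass.
structured_response = {
    "name": "John Doe",
    "email": "john@example.com",
    "phone": "(555) 123-4567",
}

contact = to_dataclass(ContactInfo, structured_response)
print(contact.email)  # john@example.com
```

Dropping unknown keys is a design choice made for this sketch; raising on them instead would surface schema drift earlier.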
When you pass a schema type directly to create_agent.response_format and the model supports native structured output, LangChain uses ProviderStrategy automatically:
- Pydantic Model
from pydantic import BaseModel, Field
from langchain.agents import create_agent

class ContactInfo(BaseModel):
    """Contact information for a person."""
    name: str = Field(description="The name of the person")
    email: str = Field(description="The email address of the person")
    phone: str = Field(description="The phone number of the person")

agent = create_agent(
    model="gpt-5",
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

print(result["structured_response"])
# ContactInfo(name='John Doe', email='john@example.com', phone='(555) 123-4567')
- Dataclass
from dataclasses import dataclass
from langchain.agents import create_agent

@dataclass
class ContactInfo:
    """Contact information for a person."""
    name: str  # The name of the person
    email: str  # The email address of the person
    phone: str  # The phone number of the person

agent = create_agent(
    model="gpt-5",
    tools=tools,  # assumes a list of tools defined elsewhere
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result["structured_response"]
# {'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}
- TypedDict
from typing_extensions import TypedDict
from langchain.agents import create_agent

class ContactInfo(TypedDict):
    """Contact information for a person."""
    name: str  # The name of the person
    email: str  # The email address of the person
    phone: str  # The phone number of the person

agent = create_agent(
    model="gpt-5",
    tools=tools,  # assumes a list of tools defined elsewhere
    response_format=ContactInfo  # Auto-selects ProviderStrategy
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result["structured_response"]
# {'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}
- JSON Schema
from langchain.agents import create_agent
from langchain.agents.structured_output import ProviderStrategy

contact_info_schema = {
    "type": "object",
    "description": "Contact information for a person.",
    "properties": {
        "name": {"type": "string", "description": "The name of the person"},
        "email": {"type": "string", "description": "The email address of the person"},
        "phone": {"type": "string", "description": "The phone number of the person"}
    },
    "required": ["name", "email", "phone"]
}

agent = create_agent(
    model="gpt-5",
    tools=tools,  # assumes a list of tools defined elsewhere
    response_format=ProviderStrategy(contact_info_schema)
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract contact info from: John Doe, john@example.com, (555) 123-4567"}]
})

result["structured_response"]
# {'name': 'John Doe', 'email': 'john@example.com', 'phone': '(555) 123-4567'}
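Because the model provider enforces the schema at the API level, native structured output offers high reliability and strict validation. Prefer it whenever it is supported.
Whether you pass the schema type directly or wrap it in ProviderStrategy, the agent automatically falls back to the tool calling strategy if the current model does not support native structured output.
Tool Calling Strategy
For models that do not support native structured output, LangChain achieves the same result through tool calling. This strategy works with every model that supports tool calling, which covers the vast majority of modern models.
To use this strategy, configure a ToolStrategy: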
class ToolStrategy(Generic[SchemaT]):
    schema: type[SchemaT]
    tool_message_content: str | None
    handle_errors: Union[
        bool,
        str,
        type[Exception],
        tuple[type[Exception], ...],
        Callable[[Exception], str],
    ]
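To make the mechanism concrete, the sketch below shows the general idea behind structured output via tool calling: the schema is offered to the model as a tool definition, and the arguments of the resulting tool call become the structured response. It uses only the standard library. `schema_to_tool` and `parse_tool_call` are hypothetical helpers and the tool-call dict shape is an assumption, not LangChain internals.

```python
from dataclasses import asdict, dataclass, fields

# Assumption: only flat schemas with these primitive field types are handled.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class ContactInfo:
    """Contact information for a person."""
    name: str
    email: str


def schema_to_tool(cls) -> dict:
    """Describe a dataclass as a tool whose parameters are its fields."""
    props = {f.name: {"type": _JSON_TYPES[f.type]} for f in fields(cls)}
    return {
        "name": cls.__name__,
        "description": (cls.__doc__ or "").strip(),
        "parameters": {
            "type": "object",
            "properties": props,
            "required": list(props),
        },
    }


def parse_tool_call(cls, tool_call: dict) -> dict:
    """Treat the arguments of a matching tool call as the structured response."""
    if tool_call["name"] != cls.__name__:
        raise ValueError(f"unexpected tool: {tool_call['name']}")
    # Constructing the dataclass rejects missing or unexpected arguments.
    return asdict(cls(**tool_call["args"]))


tool = schema_to_tool(ContactInfo)
# Hand-written stand-in for the tool call a model would emit.
call = {"name": "ContactInfo", "args": {"name": "John Doe", "email": "john@example.com"}}
print(parse_tool_call(ContactInfo, call))
# {'name': 'John Doe', 'email': 'john@example.com'}
```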
- Pydantic Model
from pydantic import BaseModel, Field
from typing import Literal
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ProductReview(BaseModel):
    """Analysis of a product review."""
    rating: int | None = Field(description="The rating of the product", ge=1, le=5)
    sentiment: Literal["positive", "negative"] = Field(description="The sentiment of the review")
    key_points: list[str] = Field(description="The key points of the review. Lowercase, 1-3 words each.")

agent = create_agent(
    model="gpt-5",
    tools=tools,  # assumes a list of tools defined elsewhere
    response_format=ToolStrategy(ProductReview)
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"}]
})

result["structured_response"]
# ProductReview(rating=5, sentiment='positive', key_points=['fast shipping', 'expensive'])
- Dataclass
from dataclasses import dataclass
from typing import Literal
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

@dataclass
class ProductReview:
    """Analysis of a product review."""
    rating: int | None  # The rating of the product (1-5)
    sentiment: Literal["positive", "negative"]  # The sentiment of the review
    key_points: list[str]  # The key points of the review

agent = create_agent(
    model="gpt-5",
    tools=tools,  # assumes a list of tools defined elsewhere
    response_format=ToolStrategy(ProductReview)
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"}]
})

result["structured_response"]
# {'rating': 5, 'sentiment': 'positive', 'key_points': ['fast shipping', 'expensive']}
- TypedDict
from typing import Literal
from typing_extensions import TypedDict
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ProductReview(TypedDict):
    """Analysis of a product review."""
    rating: int | None  # The rating of the product (1-5)
    sentiment: Literal["positive", "negative"]  # The sentiment of the review
    key_points: list[str]  # The key points of the review

agent = create_agent(
    model="gpt-5",
    tools=tools,  # assumes a list of tools defined elsewhere
    response_format=ToolStrategy(ProductReview)
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"}]
})

result["structured_response"]
# {'rating': 5, 'sentiment': 'positive', 'key_points': ['fast shipping', 'expensive']}
- JSON Schema
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

product_review_schema = {
    "type": "object",
    "description": "Analysis of a product review.",
    "properties": {
        "rating": {
            "type": ["integer", "null"],
            "description": "The rating of the product (1-5)",
            "minimum": 1,
            "maximum": 5
        },
        "sentiment": {
            "type": "string",
            "enum": ["positive", "negative"],
            "description": "The sentiment of the review"
        },
        "key_points": {
            "type": "array",
            "items": {"type": "string"},
            "description": "The key points of the review"
        }
    },
    "required": ["sentiment", "key_points"]
}

agent = create_agent(
    model="gpt-5",
    tools=tools,  # assumes a list of tools defined elsewhere
    response_format=ToolStrategy(product_review_schema)
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"}]
})

result["structured_response"]
# {'rating': 5, 'sentiment': 'positive', 'key_points': ['fast shipping', 'expensive']}
- Union Types
from pydantic import BaseModel, Field
from typing import Literal, Union
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ProductReview(BaseModel):
    """Analysis of a product review."""
    rating: int | None = Field(description="The rating of the product", ge=1, le=5)
    sentiment: Literal["positive", "negative"] = Field(description="The sentiment of the review")
    key_points: list[str] = Field(description="The key points of the review. Lowercase, 1-3 words each.")

class CustomerComplaint(BaseModel):
    """A customer complaint about a product or service."""
    issue_type: Literal["product", "service", "shipping", "billing"] = Field(description="The type of issue")
    severity: Literal["low", "medium", "high"] = Field(description="The severity of the complaint")
    description: str = Field(description="Brief description of the complaint")

agent = create_agent(
    model="gpt-5",
    tools=tools,  # assumes a list of tools defined elsewhere
    response_format=ToolStrategy(Union[ProductReview, CustomerComplaint])
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Analyze this review: 'Great product: 5 out of 5 stars. Fast shipping, but expensive'"}]
})

result["structured_response"]
# ProductReview(rating=5, sentiment='positive', key_points=['fast shipping', 'expensive'])
Custom tool message content
The tool_message_content parameter lets you customize the message that appears in the conversation history when structured output is generated:
from pydantic import BaseModel, Field
from typing import Literal
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class MeetingAction(BaseModel):
    """Action item extracted from a meeting transcript."""
    task: str = Field(description="The specific task")
    assignee: str = Field(description="The person responsible")
    priority: Literal["low", "medium", "high"] = Field(description="The priority level")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(
        schema=MeetingAction,
        tool_message_content="Action item captured and added to meeting notes!"
    )
)

agent.invoke({
    "messages": [{"role": "user", "content": "From our meeting: Sarah needs to update the project timeline as soon as possible"}]
})
Example output:
================================ Human Message =================================
From our meeting: Sarah needs to update the project timeline as soon as possible
================================== Ai Message ==================================
Tool Calls:
  MeetingAction (call_1)
 Call ID: call_1
  Args:
    task: Update the project timeline
    assignee: Sarah
    priority: high
================================= Tool Message =================================
Name: MeetingAction
Action item captured and added to meeting notes!
If tool_message_content is not set, the default ToolMessage content is:
================================= Tool Message =================================
Name: MeetingAction
Returning structured response: {'task': 'update the project timeline', 'assignee': 'Sarah', 'priority': 'high'}
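The choice between the custom and the default message can be pictured as a simple fallback. The sketch below is a hypothetical stand-in (the helper name `tool_message_text` is an assumption); the default wording is copied from the example output above rather than taken from LangChain's source.

```python
from typing import Optional


def tool_message_text(structured: dict, tool_message_content: Optional[str] = None) -> str:
    """Pick the text recorded in the tool message for a structured response."""
    if tool_message_content is not None:
        # A custom message replaces the default entirely.
        return tool_message_content
    # Default wording, as shown in the example output.
    return f"Returning structured response: {structured}"


action = {"task": "Update the project timeline", "assignee": "Sarah", "priority": "high"}

print(tool_message_text(action, "Action item captured and added to meeting notes!"))
# Action item captured and added to meeting notes!
print(tool_message_text(action))
# Returning structured response: {'task': 'Update the project timeline', 'assignee': 'Sarah', 'priority': 'high'}
```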
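Error Handling
Models can make mistakes when generating structured output through tool calling. LangChain provides an intelligent retry mechanism that handles these errors automatically.
Multiple structured outputs error
When the model incorrectly calls more than one structured output tool, the agent returns error feedback in a ToolMessage and prompts the model to retry: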
from pydantic import BaseModel, Field
from typing import Union
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ContactInfo(BaseModel):
    name: str = Field(description="Person's name")
    email: str = Field(description="Email address")

class EventDetails(BaseModel):
    event_name: str = Field(description="Name of the event")
    date: str = Field(description="Event date")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(Union[ContactInfo, EventDetails])  # Default: handle_errors=True
)

agent.invoke({
    "messages": [{"role": "user", "content": "Extract info: John Doe (john@email.com) is organizing Tech Conference on March 15th"}]
})
================================ Human Message =================================
Extract info: John Doe (john@email.com) is organizing Tech Conference on March 15th
================================== Ai Message ==================================
Tool Calls:
  ContactInfo (call_1)
 Call ID: call_1
  Args:
    name: John Doe
    email: john@email.com
  EventDetails (call_2)
 Call ID: call_2
  Args:
    event_name: Tech Conference
    date: March 15th
================================= Tool Message =================================
Name: ContactInfo
Error: Model incorrectly returned multiple structured responses (ContactInfo, EventDetails) when only one is expected.
 Please fix your mistakes.
================================= Tool Message =================================
Name: EventDetails
Error: Model incorrectly returned multiple structured responses (ContactInfo, EventDetails) when only one is expected.
 Please fix your mistakes.
================================== Ai Message ==================================
Tool Calls:
  ContactInfo (call_3)
 Call ID: call_3
  Args:
    name: John Doe
    email: john@email.com
================================= Tool Message =================================
Name: ContactInfo
Returning structured response: {'name': 'John Doe', 'email': 'john@email.com'}
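The check behind this trace can be approximated as follows. This is a simplified sketch under stated assumptions, not LangChain's code: `check_single_output`, the `STRUCTURED_TOOLS` set, and the tool-call dict shape are hypothetical, and the error wording is copied from the trace above.

```python
from typing import List, Optional, Tuple

# Names of the tools generated from the schemas (assumed for this sketch).
STRUCTURED_TOOLS = {"ContactInfo", "EventDetails"}


def check_single_output(tool_calls: List[dict]) -> Tuple[Optional[dict], List[str]]:
    """Return (response, feedback); feedback is non-empty when a retry is needed."""
    structured = [c for c in tool_calls if c["name"] in STRUCTURED_TOOLS]
    if len(structured) == 1:
        return structured[0]["args"], []
    if not structured:
        return None, []
    names = ", ".join(c["name"] for c in structured)
    error = (
        "Error: Model incorrectly returned multiple structured responses "
        f"({names}) when only one is expected.\n Please fix your mistakes."
    )
    # One error message per offending call, so every tool call gets a reply.
    return None, [error for _ in structured]


calls = [
    {"name": "ContactInfo", "args": {"name": "John Doe", "email": "john@email.com"}},
    {"name": "EventDetails", "args": {"event_name": "Tech Conference", "date": "March 15th"}},
]
response, feedback = check_single_output(calls)
print(response)       # None
print(len(feedback))  # 2
```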
Schema validation error
When the structured output does not match the expected schema, the agent provides specific error feedback:
from pydantic import BaseModel, Field
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy

class ProductRating(BaseModel):
    rating: int | None = Field(description="Rating from 1-5", ge=1, le=5)
    comment: str = Field(description="Review comment")

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(ProductRating),  # Default: handle_errors=True
    system_prompt="You are a helpful assistant that parses product reviews. Do not make any field or value up."
)

agent.invoke({
    "messages": [{"role": "user", "content": "Parse this: Amazing product, 10/10!"}]
})
================================ Human Message =================================
Parse this: Amazing product, 10/10!
================================== Ai Message ==================================
Tool Calls:
  ProductRating (call_1)
 Call ID: call_1
  Args:
    rating: 10
    comment: Amazing product
================================= Tool Message =================================
Name: ProductRating
Error: Failed to parse structured output for tool 'ProductRating': 1 validation error for ProductRating.rating
  Input should be less than or equal to 5 [type=less_than_equal, input_value=10, input_type=int].
 Please fix your mistakes.
================================== Ai Message ==================================
Tool Calls:
  ProductRating (call_2)
 Call ID: call_2
  Args:
    rating: 5
    comment: Amazing product
================================= Tool Message =================================
Name: ProductRating
Returning structured response: {'rating': 5, 'comment': 'Amazing product'}
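The validate-then-retry loop visible in this trace can be sketched with the standard library alone. Everything here is illustrative: `validate_rating`, `run_with_retries`, and `fake_model` are hypothetical names, the `max_attempts` limit is an assumption of the sketch, and the hand-rolled checks stand in for the Pydantic validation used in the real example.

```python
from typing import Callable, Optional


def validate_rating(args: dict) -> dict:
    """Apply the same constraints as ProductRating: rating is None or 1..5."""
    rating = args.get("rating")
    if rating is not None and not (isinstance(rating, int) and 1 <= rating <= 5):
        raise ValueError(f"rating must be between 1 and 5, got {rating!r}")
    if not isinstance(args.get("comment"), str):
        raise ValueError("comment must be a string")
    return args


def run_with_retries(model: Callable[[Optional[str]], dict], max_attempts: int = 3) -> dict:
    """Call `model`, feeding validation errors back until the output is valid."""
    feedback = None
    for _ in range(max_attempts):
        candidate = model(feedback)
        try:
            return validate_rating(candidate)
        except ValueError as exc:
            # The error text becomes the feedback for the next attempt.
            feedback = f"Error: {exc}\n Please fix your mistakes."
    raise RuntimeError("no valid structured output after retries")


def fake_model(feedback: Optional[str]) -> dict:
    """Stand-in model: answers 10 first, then corrects itself after feedback."""
    rating = 10 if feedback is None else 5
    return {"rating": rating, "comment": "Amazing product"}


print(run_with_retries(fake_model))
# {'rating': 5, 'comment': 'Amazing product'}
```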
Error handling strategies
You can customize how errors are handled with the handle_errors parameter:
ToolStrategy(
    schema=ProductRating,
    handle_errors="Please provide a valid rating between 1-5 and include a comment."
)
If handle_errors is a string, the agent always prompts the model to retry with that fixed message:
================================= Tool Message =================================
Name: ProductRating
Please provide a valid rating between 1-5 and include a comment.
Handle specific exceptions only:
ToolStrategy(
    schema=ProductRating,
    handle_errors=ValueError  # Only retry on ValueError, raise others
)
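If handle_errors is an exception type, the agent retries (with the default error message) only when the raised exception is of that type. In all other cases the exception is raised.
Handle multiple exception types:
ToolStrategy(
    schema=ProductRating,
    handle_errors=(ValueError, TypeError)  # Retry on ValueError and TypeError
)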
If handle_errors is a tuple of exception types, the agent retries (with the default error message) only when the raised exception is one of those types. In all other cases the exception is raised.
Custom error handler function:
from typing import Union
from langchain.agents import create_agent
from langchain.agents.structured_output import ToolStrategy
from langchain.agents.structured_output import StructuredOutputValidationError
from langchain.agents.structured_output import MultipleStructuredOutputsError

# ContactInfo and EventDetails are the models defined in the earlier example.

def custom_error_handler(error: Exception) -> str:
    if isinstance(error, StructuredOutputValidationError):
        return "There was an issue with the format. Try again."
    elif isinstance(error, MultipleStructuredOutputsError):
        return "Multiple structured outputs were returned. Pick the most relevant one."
    else:
        return f"Error: {str(error)}"

agent = create_agent(
    model="gpt-5",
    tools=[],
    response_format=ToolStrategy(
        schema=Union[ContactInfo, EventDetails],
        handle_errors=custom_error_handler  # Replaces the default handle_errors=True
    )
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "Extract info: John Doe (john@email.com) is organizing Tech Conference on March 15th"}]
})

for msg in result['messages']:
    # If message is actually a ToolMessage object (not a dict), check its class name
    if type(msg).__name__ == "ToolMessage":
        print(msg.content)
    # If message is a dictionary or you want a fallback
    elif isinstance(msg, dict) and msg.get('tool_call_id'):
        print(msg['content'])
On StructuredOutputValidationError:
================================= Tool Message =================================
Name: ToolStrategy
There was an issue with the format. Try again.
On MultipleStructuredOutputsError:
================================= Tool Message =================================
Name: ToolStrategy
Multiple structured outputs were returned. Pick the most relevant one.
On other errors:
================================= Tool Message =================================
Name: ToolStrategy
Error: <error message>
No error handling:
response_format = ToolStrategy(
    schema=ProductRating,
    handle_errors=False  # All errors raised
)
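The five accepted forms of handle_errors can be summarized as one dispatch function. The sketch below restates the behavior described in this section and is not LangChain's implementation: `feedback_for` and `DEFAULT_TEMPLATE` are hypothetical names, and the default wording is inferred from the traces shown earlier.

```python
from typing import Callable, Tuple, Type, Union

HandleErrors = Union[
    bool, str, Type[Exception], Tuple[Type[Exception], ...], Callable[[Exception], str]
]

# Assumed default wording, modelled on the traces shown earlier.
DEFAULT_TEMPLATE = "Error: {error}\n Please fix your mistakes."


def feedback_for(error: Exception, handle_errors: HandleErrors = True) -> str:
    """Return the retry message for `error`, or re-raise it if it is not handled."""
    if handle_errors is False:
        raise error  # no handling: every error propagates
    if handle_errors is True:
        return DEFAULT_TEMPLATE.format(error=error)
    if isinstance(handle_errors, str):
        return handle_errors  # fixed message, whatever the error
    is_exc_type = isinstance(handle_errors, type) and issubclass(handle_errors, Exception)
    if is_exc_type or isinstance(handle_errors, tuple):
        # Retry only for the listed exception type(s); raise everything else.
        if isinstance(error, handle_errors):
            return DEFAULT_TEMPLATE.format(error=error)
        raise error
    # Anything else is treated as a callable that builds the message.
    return handle_errors(error)


print(feedback_for(ValueError("bad"), "Please try again."))       # Please try again.
print(feedback_for(ValueError("bad"), lambda e: f"custom: {e}"))  # custom: bad
```

Checking `False` and `True` by identity before the other branches keeps the boolean cases from being confused with the callable or string forms.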