Efficient LLM Structured Output with LangChain: Code, Tips, and a Comparison

Introduction

Following the official LangChain documentation, this post implements structured output from large language models.
Beyond the official docs, it additionally covers prompt-template filling and few-shot prompting.

Open Source

The GitHub repo contains only the code, without the data; the cloud-drive link includes sample data.

Structured output from LLMs

from langchain_openai import ChatOpenAI

# a locally deployed model is also supported
# llm = ChatOpenAI(base_url="http://127.0.0.1:8000/v1/", model="gpt-3.5-turbo") 
llm_mini = ChatOpenAI(model="gpt-4o-mini")
llm_turbo = ChatOpenAI(model="gpt-3.5-turbo")

The OPENAI_API_BASE and OPENAI_API_KEY parameters have already been added to the system environment variables, so they do not need to be passed explicitly.
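As a minimal sketch (the endpoint and key below are placeholders, not real values), the same configuration can be set from code before constructing the clients:

import os

# Placeholder values; substitute your real endpoint and key.
os.environ["OPENAI_API_BASE"] = "https://api.openai.com/v1"
os.environ["OPENAI_API_KEY"] = "sk-..."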

from typing import List
from typing import Optional

from pydantic import BaseModel, Field


# Pydantic
class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: Optional[int] = Field(
        default=None, description="How funny the joke is, from 1 to 10"
    )

Joke is the object the model is asked to emit as structured output.

# An alternative way to write Joke
from typing import Optional

from typing_extensions import Annotated, TypedDict


# TypedDict
class Joke(TypedDict):
    """Joke to tell user."""

    setup: Annotated[str, ..., "The setup of the joke"]

    # Alternatively, we could have specified setup as:

    # setup: str                    # no default, no description
    # setup: Annotated[str, ...]    # no default, no description
    # setup: Annotated[str, "foo"]  # default, no description

    punchline: Annotated[str, ..., "The punchline of the joke"]
    rating: Annotated[Optional[int], None, "How funny the joke is, from 1 to 10"]

If llm is gpt-3.5-turbo the call succeeds, but gpt-4o-mini raises an error. gpt-4o-mini does support function calling and tool use, just less reliably.

llm_turbo.with_structured_output(Joke).invoke("Tell me a joke about cats")

Output:

Joke(setup='Why was the cat sitting on the computer?', punchline='Because it wanted to keep an eye on the mouse!', rating=7)

## raises an error
# llm_mini.with_structured_output(Joke).invoke("Tell me a joke about cats")
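A hedged sketch of guarding that call, so a model with unreliable function calling fails visibly instead of crashing the script:

try:
    joke = llm_mini.with_structured_output(Joke).invoke("Tell me a joke about cats")
except Exception as e:
    # gpt-4o-mini may fail to produce the expected structure here
    print(f"structured output failed: {e}")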

Joke as a JSON schema:

json_schema = {
    "title": "joke",
    "description": "Joke to tell user.",
    "type": "object",
    "properties": {
        "setup": {
            "type": "string",
            "description": "The setup of the joke",
        },
        "punchline": {
            "type": "string",
            "description": "The punchline to the joke",
        },
        "rating": {
            "type": "integer",
            "description": "How funny the joke is, from 1 to 10",
            "default": None,
        },
    },
    "required": ["setup", "punchline"],
}
llm_turbo.with_structured_output(json_schema).invoke("Tell me a joke about cats")

Output:

{'setup': 'Why was the cat sitting on the computer?',
 'punchline': 'Because it wanted to keep an eye on the mouse!'}

Calling the model with the JSON schema above likewise parses into a structured object successfully.
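When debugging parse failures, with_structured_output also accepts an include_raw flag that returns the raw model message alongside the parsed object; a short sketch:

structured_llm = llm_turbo.with_structured_output(json_schema, include_raw=True)
out = structured_llm.invoke("Tell me a joke about cats")
# out is a dict with "raw" (the model message), "parsed", and "parsing_error" keys
print(out["parsed"])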

Binding multiple structured objects

from typing import Union

class Man(BaseModel):
    """
    男人的信息
    """

    name: str = Field(description="姓名")
    age: str = Field(description="年龄")
    interest: str = Field(description="兴趣爱好")
    colthing: str = Field(description="上身衣服与下身衣服")
    height: str = Field(description="身高")


class Woman(BaseModel):
    """
    女人的信息
    """

    name: str = Field(description="姓名")
    age: str = Field(description="年龄")
    interest: str = Field(description="兴趣爱好")
    colthing: str = Field(description="上身衣服与下身衣服")
    height: str = Field(description="身高")



class Person(BaseModel):
    final_output: Union[Man, Woman]

llm_turbo.with_structured_output(Person).invoke("帮我生成一个男人的信息")

Output:

Person(final_output=Man(name='张伟', age='30', interest='运动,旅行,读书', colthing='白色衬衫,深色牛仔裤', height='175cm'))

llm_turbo.with_structured_output(Person).invoke("帮我生成一个女人的信息")

Output:

Person(final_output=Man(name='李华', age='28', interest='阅读,旅行,烹饪', colthing='白色衬衫和黑色裙子', height='165cm'))

If with_structured_output cannot parse the expected structured object, it returns None.
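So callers should guard against None; a minimal sketch:

person = llm_turbo.with_structured_output(Person).invoke("帮我生成一个男人的信息")
if person is None:
    # fall back, retry, or log when parsing fails
    print("no structured object parsed")
else:
    print(person.final_output)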

Few-shot prompting

In the example above, gpt-4o-mini failed to parse out the structured object, so we use few-shot prompting to strengthen its structured-output ability.

Approach 1: write the examples directly into the prompt:

from langchain_core.prompts import ChatPromptTemplate, PromptTemplate

system_msg = """You are a hilarious comedian. Your specialty is knock-knock jokes. 
Return a joke which has the setup (the response to "Who's there?") and the final punchline (the response to "<setup> who?")."""

examples = """
example_user: Tell me a joke about planes
example_assistant: {{"setup": "Why don't planes ever get tired?", "punchline": "Because they have rest wings!", "rating": 2}}

example_user: Tell me another joke about planes
example_assistant: {{"setup": "Cargo", "punchline": "Cargo 'vroom vroom', but planes go 'zoom zoom'!", "rating": 10}}

example_user: Now about caterpillars
example_assistant: {{"setup": "Caterpillar", "punchline": "Caterpillar really slow, but watch me turn into a butterfly and steal the show!", "rating": 5}}
""".strip()

prompt = PromptTemplate.from_template(
"""
{system_msg}

Here are some examples of jokes:
{examples}

example_user: {input}
""".strip()
)

# prompt = ChatPromptTemplate.from_messages([("system", system), ("human", "{input}")])

Note the doubled braces {{ and }} around the example JSON: LangChain's prompt templates treat single braces as variables, so literal braces must be escaped. (Since examples is substituted here as a variable value rather than parsed as part of the template, the doubled braces appear verbatim in the filled prompt below.)

LangChain prompt templates can be filled with either invoke or format.
Inspect the filled-in prompt:

print(prompt.invoke({
    "system_msg": system_msg,
    "examples": examples,
    "input": "what's something funny about woodpeckers",
}).text)
print(
    prompt.format(
        system_msg=system_msg,
        examples=examples,
        input="what's something funny about woodpeckers",
    )
)

Prompt output: the invoke and format calls above both produce the following:

You are a hilarious comedian. Your specialty is knock-knock jokes. 
Return a joke which has the setup (the response to "Who's there?") and the final punchline (the response to "<setup> who?").

Here are some examples of jokes:
example_user: Tell me a joke about planes
example_assistant: {{"setup": "Why don't planes ever get tired?", "punchline": "Because they have rest wings!", "rating": 2}}

example_user: Tell me another joke about planes
example_assistant: {{"setup": "Cargo", "punchline": "Cargo 'vroom vroom', but planes go 'zoom zoom'!", "rating": 10}}

example_user: Now about caterpillars
example_assistant: {{"setup": "Caterpillar", "punchline": "Caterpillar really slow, but watch me turn into a butterfly and steal the show!", "rating": 5}}
example_user: what's something funny about woodpeckers

Call the model:

few_shot_chain1 = prompt | llm_mini.with_structured_output(Joke)
few_shot_chain1.invoke(
    {
        "system_msg": system_msg,
        "examples": examples,
        "input": "what's something funny about woodpeckers",
    }
)

Output:

Joke(setup='Woodpecker', punchline="Woodpecker knocking at your door? It's just trying to show you its new peck-formance!", rating=7)

With few-shot prompting, gpt-4o-mini now parses the structured object successfully.


Approach 2: FewShotPromptTemplate

import json
from langchain_core.prompts import PromptTemplate, FewShotPromptTemplate

system_msg = """You are a hilarious comedian. Your specialty is knock-knock jokes. \
Return a joke which has the setup (the response to "Who's there?") and the final punchline (the response to "<setup> who?").

Here are some examples of jokes:""".strip()

# PromptTemplate that formats a single example
example_prompt = PromptTemplate(
    template="Q: {query}\nA: {{{answer}}}",
)

# Example data
examples = [
    {
        "query": "Tell me a joke about planes",
        "answer": {"setup": "Why don\'t planes ever get tired?", "punchline": "Because they have rest wings!", "rating": 2},
    },
    {
        "query": "Tell me another joke about planes",
        "answer": {"setup": "Cargo", "punchline": "Cargo 'vroom vroom', but planes go 'zoom zoom'!", "rating": 10},
    }
]

# serialize each answer dict to a JSON string for the template
for example in examples:
    example["answer"] = json.dumps(example["answer"])

# Build the FewShotPromptTemplate
few_shot_prompt2 = FewShotPromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
    prefix=system_msg,
    suffix="Q: {input}\nA:",
)
print(few_shot_prompt2.invoke({"input": "Now about caterpillars"}).text)

Prompt output:

You are a hilarious comedian. Your specialty is knock-knock jokes. Return a joke which has the setup (the response to "Who's there?") and the final punchline (the response to "<setup> who?").

Here are some examples of jokes:

Q: Tell me a joke about planes
A: {{"setup": "Why don't planes ever get tired?", "punchline": "Because they have rest wings!", "rating": 2}}

Q: Tell me another joke about planes
A: {{"setup": "Cargo", "punchline": "Cargo 'vroom vroom', but planes go 'zoom zoom'!", "rating": 10}}

Q: Now about caterpillars
A:

Note that each example's answer is originally a dict (serialized to a JSON string with json.dumps above). To keep LangChain from raising an error, I wrote the PromptTemplate as:

"Q: {query}\nA: {{{answer}}}"

Three braces wrap answer. Once you remember that {{ escapes a literal brace, the three braces read naturally: an escaped literal {, the {answer} variable, and an escaped literal }.
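A quick sketch that makes the escaping concrete, outside of FewShotPromptTemplate:

from langchain_core.prompts import PromptTemplate

t = PromptTemplate.from_template("A: {{{answer}}}")
# {{ -> literal {, {answer} -> the variable, }} -> literal }
print(t.format(answer='{"setup": "Cargo"}'))
# A: {{"setup": "Cargo"}}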

few_shot_chain2 = few_shot_prompt2 | llm_mini.with_structured_output(Joke)
few_shot_chain2.invoke(
    {
        "input": "what's something funny about woodpeckers",
    }
)

Output:

Joke(setup='Woodpecker', punchline="Woodpecker who's always knocking on wood for good luck!", rating=8)

PydanticOutputParser

Not every model supports with_structured_output, and some models have much weaker function-calling ability. In that case you can use PydanticOutputParser to parse out the structured object (the return format must be spelled out in the prompt).

from langchain.output_parsers import PydanticOutputParser

from langchain_core.prompts import PromptTemplate

prompt_template = PromptTemplate.from_template("""
    生成一个男人的信息。以json格式返回,包含姓名、年龄、兴趣爱好、上身衣服与下身衣服、身高。格式如下:
    ```json
    {{
        "name": "张三",
        "age": "20",
        "interest": "打篮球",
        "colthing": "白色T恤与黑色裤子",
        "height": "180cm"
    }}
    ```
    """.strip()
)
parser = PydanticOutputParser(pydantic_object=Man)
chain = prompt_template | llm_mini | parser
man_info = chain.invoke({})
# check for an empty result
if man_info:
    print(man_info)
else:
    print("没有返回结果")

Output:

Man(name='李四', age='28', interest='旅游与摄影', colthing='蓝色衬衫与卡其色长裤', height='175cm')
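An alternative sketch: instead of hand-writing the JSON example, PydanticOutputParser can generate format instructions from the Man model and inject them through partial_variables (the pattern shown in the LangChain docs):

parser = PydanticOutputParser(pydantic_object=Man)
prompt_template = PromptTemplate(
    template="生成一个男人的信息。\n{format_instructions}",
    input_variables=[],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)
chain = prompt_template | llm_mini | parser
man_info = chain.invoke({})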

Text classification evaluation example

with_structured_output obtains structured output through function calling. This section evaluates whether getting structured output via function calling degrades the model's task performance.

A text-classification example with an LLM:

# A text-classification example
from langchain_core.prompts import PromptTemplate, FewShotPromptTemplate

class TextCLS(BaseModel):
    """
    文本分类的结构化输出
    """

    keyword: List[str] = Field(description="问题中出现的与分类相关的关键词")
    reason: str = Field(description="分类的原因")
    label: str = Field(description="文本分类label")
    
schema = ["经济", "民生", "产业", "绿色发展", "军事", "其他"]

system_msg = "请你完成文本分类任务,按照要求完成关键词提取,输出分类原因与最终的类别。文本的类别是:{schema}"

# PromptTemplate that formats a single example
example_prompt = PromptTemplate(
    template="Q: {query}\nA: {answer}",
)

# Example data
examples = [
    {
        "query": "武汉市今年GDP上涨2%",
        "answer": '{{"keyword": ["GDP"], "reason": "GDP与经济相关", "label": "经济"}}',
    },
    {
        "query": "氢能产业园区的相关配套措施完善,园区内有很多氢能领域龙头企业",
        "answer": """{{
                "keyword": ["氢能产业园区", "氢能领域龙头企业"],
                "reason": "问题中的氢能产业园区与氢能领域龙头企业都与产业相关",
                "label": "产业",
            }}""".strip(),
    },
]

# Build the FewShotPromptTemplate
few_shot_prompt = FewShotPromptTemplate(
    example_prompt=example_prompt,
    examples=examples,
    prefix=system_msg,
    suffix="Q: {input}\nA:",
)

Passing all parameters on every call means schema must be supplied each time, which is tedious:

prompt = few_shot_prompt.invoke(
    {
        "input": "武汉市今年GDP上涨2%",
        "schema": schema,
    }
)
print(prompt.text)

Partial prompt: fix the classification labels once, so later calls only need the input question rather than passing schema every time:

partial_prompt = few_shot_prompt.partial(schema=schema)
partial_prompt.invoke({"input": "武汉市今年GDP上涨2%"})  # inspect the filled prompt
chain = partial_prompt | llm_mini.with_structured_output(TextCLS)
chain.invoke("北京市今年的生产总值提高了5个百分点")

Output:

TextCLS(keyword=['生产总值'], reason='生产总值与经济相关,反映经济增长情况', label='经济')

Text classification evaluation project

Calling an online LLM API is slow, so I use a locally deployed model instead, which is faster.

Deploy the local model with llamafactory's API server; deployment script:

qwen2.5_7B.yaml:

model_name_or_path: Qwen/Qwen2.5-7B-Instruct
template: qwen
infer_backend: vllm
vllm_enforce_eager: true

llamafactory-cli api qwen2.5_7B.yaml
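The llamafactory API server is OpenAI-compatible, so the ChatOpenAI client from the beginning of the post can point at it directly; a sketch, assuming the default local port:

# Hypothetical local endpoint; adjust host/port to your deployment.
local_llm = ChatOpenAI(
    base_url="http://127.0.0.1:8000/v1/",
    api_key="none",  # assumption: llamafactory does not check the key by default
    model="Qwen2.5-7B-Instruct",
)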

Three methods:

  • LLM + re regex matching: llm_infer.py (a sketch of the regex step follows this list)
  • LLM structured output via with_structured_output: llm_structured_output_infer.py
  • vllm_infer inference: vllm_infer/
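A minimal sketch of the regex step used by the first method (extract_json is a hypothetical helper, not the actual code in llm_infer.py):

import json
import re

def extract_json(text: str):
    """Pull the first {...} block out of a model reply and parse it."""
    m = re.search(r"\{.*\}", text, re.DOTALL)
    if m is None:
        return None
    try:
        return json.loads(m.group(0))
    except json.JSONDecodeError:
        return None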

vllm_infer inference

Inference this way is very fast, but the steps are somewhat more involved.

  • vllm_infer/1.build_vllm_dataset.ipynb builds an alpaca-style dataset, ag_news_test, so that llamafactory can load it directly

  • vllm_infer/infer.sh, the script that runs vllm_infer inference:

    python vllm_infer.py \
                --model_name_or_path Qwen/Qwen2.5-7B-Instruct \
                --template qwen \
                --dataset "ag_news_test" \
                --dataset_dir data \
                --save_name output/vllm_ag_news_test.json \
                > logs/vllm_ag_news_test.log 2>&1
    

    vllm_infer.py comes from https://github.com/hiyouga/LLaMA-Factory/blob/main/scripts/vllm_infer.py

Evaluation

llm_result_eval.ipynb evaluates all three methods:

  method       precision   recall     f1         support   processing_time/s
1 llm          0.841183    0.807035   0.794224   995       1600
2 vllm_infer   0.841379    0.809810   0.797484   999       41
3 llm_struct   0.838213    0.808809   0.803743   999       1339
  • llm: LLM API calls + re regex matching of the output
  • vllm_infer: vllm_infer inference + re regex matching of the output
  • llm_struct: with_structured_output structured output

Judging by the results, all three methods achieve roughly the same precision, recall, and f1 on text classification.

  • support is the number of outputs that parsed into the structure correctly. 1,000 randomly sampled examples form the evaluation set, and all three methods parse successfully on over 99% of them.
  • processing_time is the inference time in seconds. The llm and llm_struct API calls are both synchronous, which is why they are so slow; asynchronous calls would be roughly three times faster. vllm_infer is by far the fastest at only 41 seconds, while the two synchronous API methods each take close to half an hour.
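A minimal sketch of how such metrics can be computed with scikit-learn, assuming the gold labels and a method's predictions have been collected into lists (y_true and y_pred below are hypothetical):

from sklearn.metrics import precision_recall_fscore_support

y_true = ["经济", "产业", "民生"]  # hypothetical gold labels
y_pred = ["经济", "产业", "经济"]  # hypothetical predictions

precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro"
)
print(precision, recall, f1)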

Summary

Classification quality with the function-calling-based with_structured_output matches the other methods. Once the prompt is written carefully, with_structured_output can be used with confidence: it simplifies the code, because you no longer need to write regular expressions to extract results from the model's reply.

In terms of speed, vllm_infer is the fastest, and classification quality does not drop.

Although with_structured_output is slow over synchronous API calls, it is far more convenient than vllm_infer for langgraph agents and workflow orchestration.
