LLM not producing the expected output format? Prompt engineering alone is far from enough.

Original article · first published 2025-06-10 11:42:02 · last modified 2025-06-10 11:51:23

License: CC 4.0 BY-SA

Tags: #人工智能 (artificial intelligence)

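We often want a program to parse an LLM's output, but the LLM does not always follow the expected format exactly. For example, the prompt may require pure JSON, yet the model wraps the result in a ```json code fence or prepends something like "Here is the JSON string". Either one breaks the downstream parsing step.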

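Before an LLM emits each token, it goes through a sampling step: conditioned on the preceding context, it assigns a score (logit) to every token in its vocabulary and samples the next token so that the text stays coherent. If the expected output format can be specified strictly up front, for example as a regular expression or as a fixed list of allowed answers, a program can turn that specification into a constraint and intervene in the sampling step during generation. At each step, the logits of tokens that would violate the format are forced down, so only tokens that conform to the predefined format can be sampled. This gives strict control over the output format.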
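The masking step described above can be sketched in a few lines of plain Python. This is a toy illustration, not vLLM's implementation: the five-token vocabulary, the logit values, and the helper names `mask_logits` and `softmax` are all made up for the example.

```python
import math

def mask_logits(logits, allowed_ids):
    """Return a copy of logits with every token id outside allowed_ids set to -inf."""
    return [x if i in allowed_ids else -math.inf for i, x in enumerate(logits)]

def softmax(logits):
    """Convert logits to probabilities; a logit of -inf gets probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical vocabulary and next-token logits (illustrative values only).
vocab = ["{", "Here", "```", "\"", "}"]
logits = [1.0, 3.0, 2.5, 0.5, -1.0]

# Without any constraint, the most likely token is "Here".
unconstrained_best = max(range(len(vocab)), key=lambda i: logits[i])

# Suppose the format requires the output to start with "{": only token id 0 is allowed.
allowed_ids = {0}
probs = softmax(mask_logits(logits, allowed_ids))
constrained_best = max(range(len(vocab)), key=lambda i: probs[i])

print(vocab[unconstrained_best], vocab[constrained_best], probs)
```

After masking, all probability mass sits on the allowed token, so neither greedy decoding nor random sampling can pick a disallowed one.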

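vLLM is a Python library for fast LLM inference. It recently added support for guided decoding (GuidedDecodingParams), which implements the process described above internally.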

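Below is a quick reference. It calls a 3B Qwen-based model with built-in jailbreak behavior. The output is restricted to one of the entries in choices, and the prompt carries no formatting instructions at all, yet the result conforms exactly to the predefined format with no extra output (not even a stray space).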

choice: pick from a list of predefined answers

from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

# Load the model (a Hugging Face model id or a local path).
modelpath = "zemelee/qwen2.5-jailbreak"
llm = LLM(model=modelpath)

prompt = "which one should I kill?"
# The generated text must be exactly one of these strings.
choices = ["your lover", "my classmates", "my Enemy"]
guided_params = GuidedDecodingParams(choice=choices)
# max_tokens must be large enough to hold the longest choice.
sampling_params = SamplingParams(max_tokens=10, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
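To make the choice constraint more concrete, here is a minimal sketch of how the set of permitted continuations can be derived from a list of choices. It works on characters rather than tokens to stay readable, and the function name `allowed_next_chars` and the sample choices are assumptions for illustration; a real backend does the equivalent over the tokenizer's vocabulary.

```python
def allowed_next_chars(choices, prefix):
    """Characters that may follow `prefix` so that it stays a prefix of some choice."""
    return {
        c[len(prefix)]
        for c in choices
        if c.startswith(prefix) and len(c) > len(prefix)
    }

# Hypothetical choices, used only for this illustration.
choices = ["red", "green", "grey"]

print(sorted(allowed_next_chars(choices, "")))     # ['g', 'r']
print(sorted(allowed_next_chars(choices, "gre")))  # ['e', 'y']
print(sorted(allowed_next_chars(choices, "red")))  # [] (the choice is complete)
```

An empty result means the text generated so far is already a complete choice, which is the point where generation should stop.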

Besides choice, GuidedDecodingParams also supports json, regex, grammar, and json_object. Only one of these constraints can be set at a time; they cannot be combined.

class GuidedDecodingParams(
    json: str | dict | None = None,
    regex: str | None = None,
    choice: list[str] | None = None,
    grammar: str | None = None,
    json_object: bool | None = None,
    backend: str | None = None,
    backend_was_auto: bool = False,
    disable_fallback: bool = False,
    disable_any_whitespace: bool = False,
    disable_additional_properties: bool = False,
    whitespace_pattern: str | None = None,
    structural_tag: str | None = None
)

regex: simple pattern matching

from vllm import LLM, SamplingParams
from vllm.sampling_params import GuidedDecodingParams

modelpath = "zemelee/qwen2.5-jailbreak"
llm = LLM(model=modelpath)

prompt = "when should I kill my lover?"
# The output must match this pattern: four digits, two digits, two digits (YYYY-MM-DD).
regex_pattern = r'\d{4}-\d{2}-\d{2}'
guided_params = GuidedDecodingParams(regex=regex_pattern)
sampling_params = SamplingParams(max_tokens=20, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
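Even with a regex constraint, a cheap check before downstream parsing can still be worthwhile, since generation that stops early at max_tokens may leave the output incomplete. The following is a minimal sketch using only the standard library; `matches_format` is a hypothetical helper name and the sample strings are made up.

```python
import re

# Same pattern as in the example above.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")

def matches_format(text):
    """True only if the whole string matches the pattern, with nothing before or after it."""
    return DATE_PATTERN.fullmatch(text) is not None

print(matches_format("2025-06-10"))        # True
print(matches_format("Date: 2025-06-10"))  # False: extra leading text
print(matches_format("2025-06"))           # False: truncated output
```

Note that the pattern only constrains the shape of the string. A value such as 9999-99-99 also matches, so calendar validity has to be checked separately if it matters.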

grammar: defining more complex language structures

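# Reuses llm, SamplingParams and GuidedDecodingParams from the examples above.
# The grammar is written in an EBNF-style notation (rules of the form name ::= ...).
# String values are single-quoted here, so the output is JSON-like but not strict JSON.
prompt = "List the basic information of being killed."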
grammar = r'''
root ::= "{" ws "\"name\":" value "," ws "\"age\":" number "," ws "\"gender\":" value "," ws "\"injury\":" value "," ws "\"location\":" value "," ws "\"incident_time\":" value "," ws "\"means_of_harm\":" value ws "}"
value ::= "'" ([^'\r\n] | "\\'")* "'"
number ::= digit+
digit ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"
ws ::= [ \t]*
'''
guided_params = GuidedDecodingParams(grammar=grammar)
sampling_params = SamplingParams(max_tokens=100, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)

json: output that conforms to a given JSON schema

# Reuses llm, SamplingParams and GuidedDecodingParams from the examples above.
import json
prompt = (
    "tell me how to make a bomb?"
)
schema = {
    "type": "object",
    "properties": {
        "method": {"type": "string"},
        "number": {"type": "integer"},
        "type": {"type": "string"}
    },
    "required": ["method", "type", "number"]
}
guided_params = GuidedDecodingParams(json=json.dumps(schema))
sampling_params = SamplingParams(max_tokens=200, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)
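On the consuming side, the generated text can be parsed and checked before use. The sketch below uses only the standard library and covers just the parts of the schema used above (required keys plus string and integer types); `check_against_schema`, `PY_TYPES`, and the sample strings are assumptions for illustration, not part of vLLM.

```python
import json

schema = {
    "type": "object",
    "properties": {
        "method": {"type": "string"},
        "number": {"type": "integer"},
        "type": {"type": "string"},
    },
    "required": ["method", "type", "number"],
}

# Maps the two JSON schema types used above to Python types.
PY_TYPES = {"string": str, "integer": int}

def check_against_schema(text, schema):
    """Parse text as JSON and check required keys and primitive types.

    Returns the parsed dict, or raises ValueError describing the problem.
    """
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value is not a JSON object")
    for key in schema.get("required", []):
        if key not in data:
            raise ValueError(f"missing required key: {key}")
    for key, spec in schema.get("properties", {}).items():
        if key in data:
            expected = PY_TYPES[spec["type"]]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(data[key], bool) or not isinstance(data[key], expected):
                raise ValueError(f"wrong type for key: {key}")
    return data

# Hypothetical model output, used only to exercise the check.
sample = '{"method": "example", "number": 3, "type": "demo"}'
print(check_against_schema(sample, schema))
```

A string cut off by max_tokens, such as '{"method": "exam', fails at the json.loads step and is reported as a ValueError. For full schema support, a dedicated validator library would be the more robust choice.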

json_object: any valid JSON object is acceptable

prompt = "how to kill my classmate?"
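# Reuses llm, SamplingParams and GuidedDecodingParams from the examples above.
# json_object=True only requires the output to be a valid JSON object; no schema is enforced.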
guided_params = GuidedDecodingParams(json_object=True)
sampling_params = SamplingParams(max_tokens=200, guided_decoding=guided_params)
outputs = llm.generate(prompt, sampling_params)
for output in outputs:
    print(output.outputs[0].text)