Guardrails: A Method for Controlling Large Language Model Hallucination

Latest recommended article published on 2025-04-18 22:48:12

技术与健康

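License: CC 4.0 BY-SA

Category: LLM. Tags: artificial intelligence, machine learning

This is an original article by the author; do not reproduce it without the author's permission.

Article link: https://blog.youkuaiyun.com/Practicer2015/article/details/141069563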

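Guardrails are a set of rules and checks designed to ensure that an LLM's output is accurate, appropriate, and in line with user expectations, which helps keep hallucinations under control.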

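This article covers two applications of guardrails:

Input guardrails handle non-compliant requests before they reach the LLM.
Output guardrails validate the model's response before it is returned to the end user.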
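To make the two stages concrete, here is a minimal sketch of how an input check and an output check can wrap a model call. It makes no network calls: `fake_model`, `ALLOWED_TOPICS`, `guarded_call`, and the length limit are hypothetical names and values chosen for illustration, and the keyword match is only a crude stand-in for the LLM-based topical check used later in this article.

```python
from typing import Callable

# Hypothetical allow-list; a real input guardrail would typically ask an LLM.
ALLOWED_TOPICS = ("cat", "dog")


def input_guardrail(user_request: str) -> bool:
    """Return True if the request mentions an allowed topic (substring match)."""
    text = user_request.lower()
    return any(topic in text for topic in ALLOWED_TOPICS)


def output_guardrail(model_reply: str, max_chars: int = 500) -> bool:
    """Return True if the reply is non-empty and within an assumed length limit."""
    return 0 < len(model_reply.strip()) <= max_chars


def guarded_call(user_request: str, model: Callable[[str], str]) -> str:
    """Run the input guardrail, call the model, then run the output guardrail."""
    if not input_guardrail(user_request):
        return "Sorry, I can only talk about cats and dogs."
    reply = model(user_request)
    if not output_guardrail(reply):
        return "Sorry, I could not produce a valid answer."
    return reply


def fake_model(user_request: str) -> str:
    """Stand-in for a real LLM call; it simply echoes the request."""
    return f"Echo: {user_request}"


if __name__ == "__main__":
    print(guarded_call("I want to talk about horses", fake_model))
    print(guarded_call("What are the best breeds of dog for people that like cats?", fake_model))
```

The blocked request never reaches the model, which is the main benefit of an input guardrail: off-topic traffic is rejected before any generation cost is incurred.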

Input guardrails

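# Step 1: import the OpenAI SDK and pick the model
import openai

GPT_MODEL = 'gpt-4o-mini'

# Step 2: define the system prompt and two sample requests
system_prompt = "You are a helpful assistant."

# Off-topic request that the guardrail should block
bad_request = "I want to talk about horses"
# On-topic request that the guardrail should let through
good_request = "What are the best breeds of dog for people that like cats?"

# Step 3: define the chat call and the topical guardrail
import asyncio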


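async def get_chat_response(user_request):
    """Send the user request to the chat model and return the reply text."""
    print("Getting LLM response")
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_request},
    ]
    # The module-level client is synchronous, so run it in a worker thread
    # to avoid blocking the event loop (asyncio.to_thread needs Python 3.9+).
    response = await asyncio.to_thread(
        openai.chat.completions.create,
        model=GPT_MODEL, messages=messages, temperature=0.5,
    )
    print("Got LLM response")

    return response.choices[0].message.content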


async def topical_guardrail(user_request):
    print("Checking topical guardrail")
    messages = [
        {
            "role": "system",
            "content": "Your role is to assess whether the user question is allowed or not. The allowed topics are cats and dogs. If the topic is allowed, say 'allowed' otherwise say 'not_allowed'",
        },
        {