Guardrails are a set of rules and checks designed to ensure that an LLM's outputs are accurate, appropriate, and in line with user expectations, keeping hallucinations under control.
This section walks through two kinds of guardrails (see the sketch after this list):
- Input guardrails screen non-compliant requests before they are sent to the LLM.
- Output guardrails validate the model's response before it is returned to the end user.
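Before diving into the code, here is a minimal, self-contained sketch of how the two guardrails wrap an LLM call. The three helper functions are hypothetical stand-ins for the concrete versions built step by step below:

```python
def check_input(user_request: str) -> bool:
    # Hypothetical stand-in for the topical input guardrail built below.
    return "horse" not in user_request.lower()

def call_llm(user_request: str) -> str:
    # Hypothetical stand-in for the real chat-completion call.
    return f"(model answer to: {user_request})"

def check_output(answer: str) -> bool:
    # Hypothetical stand-in for an output guardrail, e.g. a moderation check.
    return bool(answer.strip())

def respond_with_guardrails(user_request: str) -> str:
    if not check_input(user_request):   # input guardrail: screen the request
        return "Sorry, I can only talk about cats and dogs."
    answer = call_llm(user_request)     # the actual model call
    if not check_output(answer):        # output guardrail: validate the reply
        return "Sorry, I can't share that response."
    return answer

print(respond_with_guardrails("I want to talk about horses"))
```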
Input guardrails
```python
#step1: import the OpenAI SDK and pick the model used throughout
import openai

GPT_MODEL = 'gpt-4o-mini'
```
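The module-level client pulls its credentials from the environment, so the only setup needed (assuming the SDK's standard `OPENAI_API_KEY` variable) is a quick sanity check:

```python
import os

# The module-level openai client reads OPENAI_API_KEY from the environment;
# fail fast with a clear error if it has not been set.
assert os.getenv("OPENAI_API_KEY"), "Set the OPENAI_API_KEY environment variable first."
```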
```python
#step2: the assistant's system prompt plus two test requests, one off-topic
# (horses) and one on-topic (cats and dogs), to exercise the guardrail
system_prompt = "You are a helpful assistant."

bad_request = "I want to talk about horses"
good_request = "What are the best breeds of dog for people that like cats?"
```
```python
#step3: an async helper that sends the user request to the model
import asyncio

async def get_chat_response(user_request):
    print("Getting LLM response")
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_request},
    ]
    # A blocking call inside an async function: fine for a demo, and it keeps
    # the example on the synchronous client API.
    response = openai.chat.completions.create(
        model=GPT_MODEL, messages=messages, temperature=0.5
    )
    print("Got LLM response")
    return response.choices[0].message.content
```
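`get_chat_response` is a coroutine, so it has to be awaited. In a plain script you can drive it with `asyncio.run` (in a notebook you would `await` it directly); a quick usage check with the `good_request` defined above:

```python
# Drive the coroutine from a regular script; in Jupyter, use
# `await get_chat_response(good_request)` instead.
print(asyncio.run(get_chat_response(good_request)))
```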
```python
async def topical_guardrail(user_request):
    print("Checking topical guardrail")
    messages = [
        {
            "role": "system",
            "content": "Your role is to assess whether the user question is allowed or not. The allowed topics are cats and dogs. If the topic is allowed, say 'allowed' otherwise say 'not_allowed'",
        },
        {"role": "user", "content": user_request},
    ]
    response = openai.chat.completions.create(
        model=GPT_MODEL, messages=messages, temperature=0.5
    )
    print("Got guardrail response")
    return response.choices[0].message.content
```
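With both coroutines in place, one way to combine them is to run the guardrail first and only query the model when the topic is allowed. This is a minimal sequential sketch (the helper name `execute_with_guardrail` is ours, not from the source); a production version might run both calls concurrently and cancel the chat task as soon as the guardrail trips:

```python
async def execute_with_guardrail(user_request):
    # Run the topical guardrail first; only call the model if the topic passes.
    verdict = await topical_guardrail(user_request)
    if "not_allowed" in verdict:
        return "Sorry, I can only talk about cats and dogs."
    return await get_chat_response(user_request)

# The off-topic request is blocked; the on-topic one reaches the model.
print(asyncio.run(execute_with_guardrail(bad_request)))
print(asyncio.run(execute_with_guardrail(good_request)))
```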