Exploring Conversational RAG: A Guide to Building Intelligent Question-Answering Systems

License: CC 4.0 BY-SA

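Introduction

When building an intelligent question-answering application, we usually want a system that supports back-and-forth conversation. This requires the application to have some form of "memory": it must remember past questions and answers and take them into account when handling the current one. This article looks at how to add this conversational memory to a question-answering system, introducing retrieval-augmented generation (RAG) and ways to implement it.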

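Main Content

1. Understanding Retrieval-Augmented Generation (RAG)

RAG combines retrieval with a generative model. The system first fetches relevant information from an external knowledge base, then uses a generative model (such as GPT) to produce the final answer. This is especially useful in conversation, because grounding generation in retrieved context lets the system give more accurate answers.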
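
To make the retrieve-then-generate flow concrete, here is a minimal sketch that uses only the Python standard library. The knowledge base, the bag-of-words `embed` function, and the prompt wording are all illustrative assumptions; a real system would use a learned embedding model and send the assembled prompt to an LLM.

```python
import math
import re
from collections import Counter

# Hypothetical mini knowledge base; a real system would index many document chunks.
KNOWLEDGE_BASE = [
    "Task decomposition breaks a complex task into smaller, simpler steps.",
    "Chroma is a vector store that holds document embeddings.",
    "Self-reflection lets an agent learn from mistakes in past actions.",
]


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, k=1):
    """Return the k knowledge-base entries most similar to the question."""
    q = embed(question)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda doc: cosine(q, embed(doc)), reverse=True)
    return ranked[:k]


def build_prompt(question, k=1):
    """Combine retrieved context and the question into the text an LLM would receive."""
    context = "\n".join(retrieve(question, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


prompt = build_prompt("What is task decomposition?")
print(prompt)
```

The generation step is deliberately left out: the point is only that the model sees the retrieved text alongside the question.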

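2. Building a Chain-Based RAG

A chain-based RAG is the simple, direct approach. On every turn, the system retrieves from the knowledge base and passes the retrieved context, together with the user's question, to the LLM to answer.

Steps:
1. Embed the documents into a vector space with OpenAI embeddings.
2. Use Chroma as the vector store.
3. Build a prompt template for the LLM so that it answers based on the retrieved context.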

import bs4  # BeautifulSoup; WebBaseLoader depends on it for HTML parsing
from langchain.chains import create_retrieval_chain
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Initialize the LLM (expects OPENAI_API_KEY in the environment)
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Load, split, and index the document
loader = WebBaseLoader(web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",))
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

# Build the question-answering prompt; {context} receives the retrieved documents
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer "
    "the question. If you don't know the answer, say that you "
    "don't know. Use three sentences maximum and keep the "
    "answer concise."
    "\n\n"
    "{context}"
)
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", "{input}"),
    ]
)

# Assemble the RAG chain
question_answer_chain = create_stuff_documents_chain(llm, prompt)
rag_chain = create_retrieval_chain(retriever, question_answer_chain)
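
The `chunk_size` and `chunk_overlap` arguments in the splitter call above control how long each chunk is and how much text adjacent chunks share. The sketch below illustrates the idea with a plain sliding window over characters. It is a simplification for illustration, not the algorithm `RecursiveCharacterTextSplitter` actually uses (which also tries to break on separators such as paragraphs and sentences), and `split_with_overlap` is a hypothetical helper name.

```python
def split_with_overlap(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size character windows that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Small sizes make the overlap easy to see
chunks = split_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=2)
print(chunks)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, which tends to help retrieval.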

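3. Incorporating Chat History

In a conversation, a user's question may only make sense in context. For example, when the user asks "What are common ways of doing it?", the system needs to know that "it" refers to the previously discussed "task decomposition".

Implementation steps:
- Update the prompt to include the message history.
- Create a history-aware retriever.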
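
The idea behind a history-aware retriever can be sketched in plain Python: when there is prior conversation, the question is first rewritten into a standalone query, and only then sent to retrieval. The function names and the two stand-in callables below are hypothetical; in a real application the rewrite would be an LLM call and the retrieval a vector-store search.

```python
def history_aware_retrieve(question, chat_history, rewrite, retrieve):
    """Rewrite the question only when there is history, then retrieve with the result."""
    if chat_history:
        query = rewrite(chat_history, question)
    else:
        query = question  # nothing to resolve, so the question is used as is
    return query, retrieve(query)


def fake_rewrite(chat_history, question):
    """Hypothetical stand-in for an LLM call; returns a hard-coded standalone question."""
    return "What are common ways of doing task decomposition?"


def fake_retrieve(query):
    """Hypothetical stand-in for a vector-store search."""
    return [f"<documents matching: {query}>"]


history = [
    ("human", "What is task decomposition?"),
    ("ai", "It splits a complex task into smaller steps."),
]
query, docs = history_aware_retrieve(
    "What are common ways of doing it?", history, fake_rewrite, fake_retrieve
)
print(query)
print(docs)
```

Skipping the rewrite when the history is empty avoids an unnecessary model call on the first turn.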

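4. Building an Agent-Based RAG

An agent-based RAG lets the LLM decide whether and how to run the retrieval step, which makes the system more adaptable. The agent can generate the retrieval query itself and, when necessary, perform multiple retrieval steps.

Implementation steps:
- Convert the retriever into a LangChain tool.
- Build the agent with LangGraph, which manages memory and retrieval steps automatically.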
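
As a conceptual illustration only, the loop below shows what "letting the model decide whether to retrieve" can look like. It does not use the LangGraph or LangChain APIs: `run_agent`, `fake_decide`, and `fake_retriever_tool` are hypothetical names, and the decision function is a hard-coded rule standing in for an LLM that chooses between calling a tool and answering.

```python
def run_agent(question, decide, tools, max_steps=3):
    """Call tools chosen by `decide` until it returns a final answer or steps run out."""
    observations = []
    for _ in range(max_steps):
        action = decide(question, observations)
        if action["type"] == "final":
            return action["content"], observations
        tool = tools[action["tool"]]
        observations.append(tool(action["input"]))
    return "Stopped: step limit reached.", observations


def fake_retriever_tool(query):
    """Hypothetical stand-in for a retriever exposed as a tool."""
    return f"<context for: {query}>"


def fake_decide(question, observations):
    """Hypothetical stand-in for the LLM's decision step."""
    # Greetings need no retrieval; other questions trigger one retrieval step
    if question.lower().startswith("hi"):
        return {"type": "final", "content": "Hello! How can I help?"}
    if not observations:
        return {"type": "tool", "tool": "retrieve", "input": question}
    return {"type": "final", "content": f"Answer based on {observations[-1]}"}


tools = {"retrieve": fake_retriever_tool}
greeting, greeting_obs = run_agent("Hi there", fake_decide, tools)
answer, answer_obs = run_agent("What is task decomposition?", fake_decide, tools)
print(greeting, greeting_obs)
print(answer, answer_obs)
```

Compared with the chain-based design, retrieval here is optional and repeatable, at the cost of less predictable behavior; the `max_steps` cap is one simple guard against endless tool calls.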

Code Example

The following is a complete code example of a basic conversational question-answering application:

import bs4  # BeautifulSoup; WebBaseLoader depends on it for HTML parsing
from langchain.chains import create_retrieval_chain, create_history_aware_retriever
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_chroma import Chroma
from langchain_community.document_loaders import WebBaseLoader
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Initialize the LLM
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Set up the retriever
loader = WebBaseLoader(web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",))
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)
vectorstore = Chroma.from_documents(documents=splits, embedding=OpenAIEmbeddings())
retriever = vectorstore.as_retriever()

# Build the history-aware retriever
contextualize_q_system_prompt = (
    "Given a chat history and the latest user question which might reference context in the chat history, "
    "formulate a standalone question which can be understood without the chat history. Do NOT answer the question, "
    "just reformulate it if needed and otherwise return it as is."
)
contextualize_q_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", contextualize_q_system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
history_aware_retriever = create_history_aware_retriever(llm, retriever, contextualize_q_prompt)

# Build the answering chain; {context} receives the retrieved documents
system_prompt = (
    "You are an assistant for question-answering tasks. "
    "Use the following pieces of retrieved context to answer the question. "
    "If you don't know the answer, say that you don't know. Use three sentences maximum and keep the answer concise."
    "\n\n"
    "{context}"
)

qa_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        MessagesPlaceholder("chat_history"),
        ("human", "{input}"),
    ]
)
question_answer_chain = create_stuff_documents_chain(llm, qa_prompt)

rag_chain = create_retrieval_chain(history_aware_retriever, question_answer_chain)
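
One way to drive the chain across several turns is to keep a running list of messages and pass it in on every call. The sketch below assumes the chain accepts `input` and `chat_history` keys and returns a dict with an `answer` key, as `rag_chain` above is expected to. `ask` and `EchoChain` are hypothetical helpers, and `EchoChain` exists only so the sketch runs without an API key. History is stored as `(role, content)` tuples here on the assumption that the prompt's message placeholder accepts them; `HumanMessage` and `AIMessage` objects from `langchain_core.messages` are the more explicit choice.

```python
def ask(chain, chat_history, question):
    """Invoke the chain with the running history, then record the new turn."""
    result = chain.invoke({"input": question, "chat_history": chat_history})
    chat_history.extend([("human", question), ("ai", result["answer"])])
    return result["answer"]


class EchoChain:
    """Hypothetical stand-in for rag_chain so the sketch runs offline."""

    def invoke(self, inputs):
        turns = len(inputs["chat_history"]) // 2
        return {"answer": f"(answer to {inputs['input']!r} after {turns} earlier turns)"}


chat_history = []
chain = EchoChain()  # replace with rag_chain in a real application
first = ask(chain, chat_history, "What is task decomposition?")
second = ask(chain, chat_history, "What are common ways of doing it?")
print(first)
print(second)
```

Because the second call carries the first exchange in `chat_history`, the history-aware retriever has what it needs to resolve "it" before searching.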