How to Add Citations to Generated Content in RAG Applications

Latest recommended article published on 2025-12-08 11:10:05

License: CC 4.0 BY-SA

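In AI applications, especially those built on retrieval-augmented generation (RAG), adding citations to generated answers is an important task. Citations make answers more credible and let users trace a claim back to its source. This guide covers five methods for adding citations to generated content in a RAG application:

1. Use tool calling to cite document IDs
2. Use tool calling to cite document IDs and text snippets
3. Prompt the model directly to generate citations
4. Retrieval post-processing (for example, compressing the retrieved context to make it more relevant)
5. Generation post-processing (a second LLM call that annotates the answer with citations)

We recommend working down the list and choosing the first method your model supports. If the model supports tool calling, prefer method 1 or 2; otherwise use one of the later methods.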
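
The recommendation above can be written down as a small decision helper. This is only a sketch: the function name and the `need_quotes` flag are illustrative assumptions, not part of any library.

```python
def choose_citation_method(supports_tool_calling: bool, need_quotes: bool = False) -> str:
    """Pick the first suitable method, working down the list.

    `need_quotes` is a hypothetical flag: set it when the UI should show
    the exact supporting passage and not only a document ID.
    """
    if supports_tool_calling:
        # Methods 1 and 2 both rely on structured output via tool calling.
        if need_quotes:
            return "method 2: document IDs and quotes"
        return "method 1: document IDs"
    # Without tool calling, fall back to asking for a parseable format.
    return "method 3: direct prompting"


print(choose_citation_method(supports_tool_calling=True))
print(choose_citation_method(supports_tool_calling=False))
```

Methods 4 and 5 are post-processing steps, so they can be layered on top of whichever method this helper returns.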

The sections below describe each method in detail, with code examples.

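Technical background and core principles

RAG combines the strengths of information retrieval and generative models: a query retrieves documents from an external data source, and those documents are passed to a large language model (LLM), which generates an answer grounded in that context. By default, however, the generated text may not say where its information came from. We therefore need an extra mechanism to make sure the answer includes citations that point to specific passages.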
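
To make the problem concrete, here is a minimal, dependency-free sketch of a retrieve-then-prompt flow. The `Doc` class, the word-overlap retriever, and the two-document corpus are toy stand-ins invented for illustration; a real application would use a proper retriever and an LLM call.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Doc:
    title: str
    text: str


def tokenize(text: str) -> set:
    # Lowercase alphanumeric tokens; enough for a toy similarity score.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, corpus: List[Doc], k: int = 1) -> List[Doc]:
    # Rank documents by how many query tokens they share (toy retriever).
    query_tokens = tokenize(query)
    ranked = sorted(
        corpus,
        key=lambda d: len(query_tokens & tokenize(d.title + " " + d.text)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(query: str, docs: List[Doc]) -> str:
    # Plain concatenation: the text reaches the model, the source identity does not.
    context = "\n\n".join(d.text for d in docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


corpus = [
    Doc("Snail", "Snails are slow-moving gastropods."),
    Doc("Cheetah", "The cheetah is capable of running at 93 to 104 km/h."),
]
query = "How fast can a cheetah run?"
docs = retrieve(query, corpus)
prompt = build_prompt(query, docs)
print(prompt)
```

Because the prompt carries no source identifiers, the model has nothing to cite. Each method below adds that missing link in a different way.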

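Method 1: Cite document IDs with tool calling

With tool calling, the model can return the identifiers of the specific sources it used. This requires a model that supports structured output, such as OpenAI GPT-4.

Implementation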

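from typing import List

from langchain_core.documents import Document
from langchain_core.runnables import RunnablePassthrough
# On LangChain releases older than 0.3, import these from langchain_core.pydantic_v1 instead.
from pydantic import BaseModel, Field

# Assumes `llm` (a chat model that supports tool calling), `retriever`, and
# `prompt` (a ChatPromptTemplate with {context} and {input} variables) are
# already defined.

class CitedAnswer(BaseModel):
    """Answer the user question based only on the given sources, and cite the sources used."""

    answer: str = Field(..., description="The answer to the user question.")
    citations: List[int] = Field(..., description="The integer IDs of the sources that justify the answer.")

structured_llm = llm.with_structured_output(CitedAnswer)

def format_docs_with_id(docs: List[Document]) -> str:
    # Prefix each document with a numeric ID the model can cite.
    # Assumes each document's metadata has a "title" key.
    formatted = [
        f"Source ID: {i}\nArticle Title: {doc.metadata['title']}\nArticle Snippet: {doc.page_content}"
        for i, doc in enumerate(docs)
    ]
    return "\n\n" + "\n\n".join(formatted)

rag_chain_from_docs = (
    RunnablePassthrough.assign(context=(lambda x: format_docs_with_id(x["context"])))
    | prompt
    | structured_llm
)

# Retrieve the documents first; rag_chain_from_docs expects them under "context".
retrieve_docs = (lambda x: x["input"]) | retriever
chain = RunnablePassthrough.assign(context=retrieve_docs).assign(answer=rag_chain_from_docs)

result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["answer"])  # a CitedAnswer instance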

Advantages

Pinpoints the exact source of each piece of information.
The model output is easy to parse and display to users.
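
A model can return an ID that does not match any retrieved document, so it may be worth checking the `citations` list before displaying it. The helper below is a sketch under that assumption; `resolve_citations` is a hypothetical name, and it works on plain lists so it runs without LangChain.

```python
from typing import Dict, List


def resolve_citations(citations: List[int], titles: List[str]) -> Dict[str, list]:
    """Map cited IDs to document titles and report IDs that are out of range."""
    valid, invalid = [], []
    seen = set()
    for source_id in citations:
        if source_id in seen:
            continue  # drop duplicate IDs, keep first occurrence
        seen.add(source_id)
        if 0 <= source_id < len(titles):
            valid.append((source_id, titles[source_id]))
        else:
            invalid.append(source_id)
    return {"valid": valid, "invalid": invalid}


# Titles in the same order the documents were numbered in the prompt.
titles = ["Cheetah", "Fastest animals"]
print(resolve_citations([0, 0, 5], titles))
```

Invalid IDs can be logged or silently dropped, depending on how strict the application needs to be.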

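Method 2: Cite document IDs and text snippets

Returning the specific text snippet along with the document ID makes the answer even easier to trace.

Implementation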

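# Reuses the imports, `llm`, `prompt`, `retriever`, and format_docs_with_id from Method 1.

class Citation(BaseModel):
    source_id: int = Field(..., description="The integer ID of a specific source that justifies the answer.")
    quote: str = Field(..., description="The verbatim quote from that source that justifies the answer.")

class QuotedAnswer(BaseModel):
    """Answer the user question based only on the given sources, and cite the sources used."""

    answer: str = Field(...)
    citations: List[Citation] = Field(...)

rag_chain_from_docs = (
    RunnablePassthrough.assign(context=(lambda x: format_docs_with_id(x["context"])))
    | prompt
    | llm.with_structured_output(QuotedAnswer)
)

# Retrieve first, then generate; the structured answer lands under "answer".
chain = RunnablePassthrough.assign(
    context=(lambda x: x["input"]) | retriever
).assign(answer=rag_chain_from_docs)

result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["answer"])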

Example output

QuotedAnswer(
    answer='Cheetahs can run at speeds of 93 to 104 km/h.',
    citations=[Citation(source_id=0, quote='The cheetah is capable of running at 93 to 104 km/h...')]
)

Advantages

Provides a complete citation chain that users can verify quickly.
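
Since each citation now carries a quote, the quote itself can be checked against the source text. The sketch below assumes citations arrive as `(source_id, quote)` pairs and sources as plain strings; the function names and the placeholder source text are illustrative only.

```python
import re
from typing import List, Tuple


def normalize(text: str) -> str:
    # Collapse whitespace and lowercase so formatting differences do not matter.
    return re.sub(r"\s+", " ", text).strip().lower()


def quote_is_supported(quote: str, source_text: str) -> bool:
    # Models often end a quote with an ellipsis; strip it before matching.
    cleaned = re.sub(r"(\.{3}|…)$", "", normalize(quote)).strip()
    return bool(cleaned) and cleaned in normalize(source_text)


def unsupported_citations(
    citations: List[Tuple[int, str]], sources: List[str]
) -> List[Tuple[int, str]]:
    """Return citations whose ID is out of range or whose quote is not in the source."""
    bad = []
    for source_id, quote in citations:
        in_range = 0 <= source_id < len(sources)
        if not in_range or not quote_is_supported(quote, sources[source_id]):
            bad.append((source_id, quote))
    return bad


sources = ["The cheetah is capable of running at 93 to 104 km/h, according to this placeholder snippet."]
citations = [
    (0, "The cheetah is capable of running at 93 to 104 km/h..."),
    (0, "Cheetahs can fly."),
    (3, "The cheetah is capable of running"),
]
print(unsupported_citations(citations, sources))
```

Exact substring matching is strict by design. If the model tends to paraphrase, a fuzzy match may be a better fit, at the cost of weaker guarantees.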

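Method 3: Prompt the model directly to generate citations

If the model does not support tool calling, you can prompt it to produce an answer with citations in a fixed format.

Implementation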

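from typing import List
from langchain_core.documents import Document
from langchain_core.output_parsers import XMLOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Assumes `llm` and `retriever` are already defined.
xml_system = """
You're an assistant providing answers and citations for user questions. Use only the given sources and reply in the following format:
<cited_answer>
    <answer></answer>
    <citations>
        <citation><source_id></source_id><quote></quote></citation>
    </citations>
</cited_answer>
Here are the sources:{context}
"""
xml_prompt = ChatPromptTemplate.from_messages(
    [("system", xml_system), ("human", "{input}")]
)

def format_docs_xml(docs: List[Document]) -> str:
    # Wrap each document in a <source> tag whose id the model can cite (assumes a "title" metadata key).
    formatted = [
        f'<source id="{i}"><title>{doc.metadata["title"]}</title>'
        f"<article_snippet>{doc.page_content}</article_snippet></source>"
        for i, doc in enumerate(docs)
    ]
    return "\n<sources>\n" + "\n".join(formatted) + "\n</sources>"

rag_chain_from_docs = (
    RunnablePassthrough.assign(context=(lambda x: format_docs_xml(x["context"])))
    | xml_prompt
    | llm
    | XMLOutputParser()
)

# Retrieve the documents first; rag_chain_from_docs expects them under "context".
retrieve_docs = (lambda x: x["input"]) | retriever
chain = RunnablePassthrough.assign(context=retrieve_docs).assign(answer=rag_chain_from_docs)

result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["answer"])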

Advantages

Requires no special model support.
Prompt design is flexible and adapts to many scenarios.
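
Prompted output is not guaranteed to be well-formed, so the parsing step benefits from a fallback. As a sketch of what that could look like without LangChain, the function below parses the `<cited_answer>` format with the standard library; `parse_cited_answer` is a hypothetical helper, and the fallback policy (return the raw text with no citations) is one possible choice.

```python
import re
import xml.etree.ElementTree as ET


def parse_cited_answer(raw: str) -> dict:
    """Extract the answer and citations from a <cited_answer> block.

    Falls back to the raw text with no citations when the block is
    missing or malformed. For untrusted input, a hardened XML parser
    may be preferable to xml.etree.
    """
    fallback = {"answer": raw.strip(), "citations": []}
    # Models sometimes add chatter around the XML, so locate the block first.
    match = re.search(r"<cited_answer>.*?</cited_answer>", raw, flags=re.DOTALL)
    if match is None:
        return fallback
    try:
        root = ET.fromstring(match.group(0))
    except ET.ParseError:
        return fallback
    citations = []
    for node in root.iter("citation"):
        source_id = (node.findtext("source_id") or "").strip()
        quote = (node.findtext("quote") or "").strip()
        if re.fullmatch(r"[0-9]+", source_id):  # skip citations without a numeric ID
            citations.append({"source_id": int(source_id), "quote": quote})
    return {"answer": (root.findtext("answer") or "").strip(), "citations": citations}


raw = (
    "Sure! <cited_answer><answer>Cheetahs can run at 93 to 104 km/h.</answer>"
    "<citations><citation><source_id>0</source_id><quote>93 to 104 km/h</quote></citation>"
    "</citations></cited_answer>"
)
print(parse_cited_answer(raw))
```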

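Method 4: Retrieval post-processing

Compressing and splitting the retrieved content keeps the context passed to the model concise and relevant. This approach is especially suited to long documents.

Implementation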

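from typing import List

from langchain.retrievers.document_compressors import EmbeddingsFilter
from langchain_core.documents import Document
from langchain_core.runnables import RunnableParallel, RunnablePassthrough
from langchain_openai import OpenAIEmbeddings
from langchain_text_splitters import RecursiveCharacterTextSplitter

# Assumes `retriever` and `rag_chain_from_docs` (from one of the earlier methods) are already defined.
splitter = RecursiveCharacterTextSplitter(chunk_size=400)
compressor = EmbeddingsFilter(embeddings=OpenAIEmbeddings(), k=10)

def split_and_filter(inputs: dict) -> List[Document]:
    # Split the retrieved documents into small chunks, then keep the chunks most similar to the question.
    split_docs = splitter.split_documents(inputs["docs"])
    return list(compressor.compress_documents(split_docs, inputs["question"]))

new_retriever = (
    RunnableParallel(question=RunnablePassthrough(), docs=retriever) | split_and_filter
)

chain = RunnablePassthrough.assign(
    context=(lambda x: x["input"]) | new_retriever
).assign(answer=rag_chain_from_docs)

result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["answer"])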

Advantages

Removes much of the irrelevant content, which can improve the accuracy of the generated answer.
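
The split-then-filter idea can be illustrated without any external service. In the sketch below, a sentence-packing splitter and a word-overlap score stand in for `RecursiveCharacterTextSplitter` and the embedding-based filter; both are simplified substitutes chosen so the example runs offline, not the behavior of those classes.

```python
import re
from typing import List


def split_into_chunks(text: str, chunk_size: int = 400) -> List[str]:
    """Pack whole sentences into chunks of at most chunk_size characters.

    A single sentence longer than chunk_size becomes its own oversized chunk.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        if current and len(current) + 1 + len(sentence) > chunk_size:
            chunks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        chunks.append(current)
    return chunks


def tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def filter_chunks(question: str, chunks: List[str], k: int = 10) -> List[str]:
    """Keep up to k chunks that share the most words with the question."""
    question_tokens = tokens(question)
    scored = [(len(question_tokens & tokens(c)), i, c) for i, c in enumerate(chunks)]
    # Highest overlap first; ties keep their original order.
    top = sorted(scored, key=lambda item: (-item[0], item[1]))[:k]
    return [chunk for score, _, chunk in top if score > 0]


text = "The cheetah is a large cat. It can run at 93 to 104 km/h. Snails are gastropods."
chunks = split_into_chunks(text, chunk_size=40)
kept = filter_chunks("How fast can a cheetah run?", chunks, k=2)
print(kept)
```

Smaller chunks also make citations more precise, because each source ID then points at a short passage and not a whole article.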

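Method 5: Generation post-processing

This method uses two model calls: the first generates the answer, and the second asks the model to annotate that answer with citations. It costs more, but it can produce high-quality citations.

Implementation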

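from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Reuses `llm`, `prompt`, `retriever`, `Citation`, and format_docs_with_id from the earlier methods.
class AnnotatedAnswer(BaseModel):
    """Annotate the answer to the user question with quote citations that justify the answer."""

    citations: List[Citation] = Field(...)

# Illustrative second prompt: shows the model the sources and the finished answer.
annotation_prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "Annotate the given answer with quotes from these sources:\n\n{context}"),
        ("human", "Question: {input}\n\nAnswer: {answer}"),
    ]
)

answer_chain = prompt | llm | StrOutputParser()
annotation_chain = annotation_prompt | llm.with_structured_output(AnnotatedAnswer)

chain = (
    RunnablePassthrough.assign(docs=(lambda x: x["input"]) | retriever)
    .assign(context=lambda x: format_docs_with_id(x["docs"]))
    .assign(answer=answer_chain)  # first call: write the answer
    .assign(annotations=annotation_chain)  # second call: cite it
)

result = chain.invoke({"input": "How fast are cheetahs?"})
print(result["answer"])
print(result["annotations"])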

Advantages

The generation step and the annotation step can be optimized independently.
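
The two-pass data flow can be sketched independently of any LLM library. In the example below, `generate` and `annotate` are placeholder callables standing in for the two model calls, and the fake implementations return canned values purely so the sketch runs; none of these names come from LangChain.

```python
from typing import Callable, Dict, List


def answer_then_annotate(
    question: str,
    docs: List[str],
    generate: Callable[[str, str], str],
    annotate: Callable[[str, str, str], List[Dict]],
) -> Dict:
    """Run the answer step, then the annotation step, and drop out-of-range citations."""
    context = "\n\n".join(f"Source ID: {i}\n{text}" for i, text in enumerate(docs))
    answer = generate(question, context)  # first model call
    citations = annotate(question, context, answer)  # second model call
    valid = [c for c in citations if 0 <= c["source_id"] < len(docs)]
    return {"answer": answer, "annotations": valid}


def fake_generate(question: str, context: str) -> str:
    # Stand-in for the first LLM call.
    return "Cheetahs can run at 93 to 104 km/h."


def fake_annotate(question: str, context: str, answer: str) -> List[Dict]:
    # Stand-in for the second LLM call; the second citation is deliberately invalid.
    return [
        {"source_id": 0, "quote": "93 to 104 km/h"},
        {"source_id": 7, "quote": "made up"},
    ]


result = answer_then_annotate(
    "How fast are cheetahs?",
    ["The cheetah is capable of running at 93 to 104 km/h."],
    fake_generate,
    fake_annotate,
)
print(result)
```

Keeping the two steps as separate callables is what makes independent tuning possible: either one can be swapped for a different prompt or model without touching the other.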

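Practical advice

Simplify the retrieved content first: for long documents in particular, compressing and splitting can improve generation quality.
Choose a suitable citation style: decide based on your needs whether to cite document IDs or specific snippets.
Adjust the prompt design as needed: the prompt has a large effect on generation quality and should not be overlooked.

If you run into problems, feel free to discuss them in the comments.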
