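Hello everyone. When retrieval-augmented generation (RAG) first emerged, the industry had high hopes that it would push AI to a new level of intelligence.
In practice, however, RAG has exposed a number of weaknesses that limit how well it works. Against this backdrop, reasoning-augmented generation (ReAG) has appeared. With a different architecture and operating logic, ReAG offers a new and workable approach to RAG's problems and shows considerable promise.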
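1. Problems with traditional RAG
A traditional RAG system is like a librarian with a poor memory: it looks busy hunting for material, but things keep going wrong.
Semantic search only matches on the surface. A search for "air pollution", for example, returns content such as "vehicle exhaust emissions", while a related study like "Research on Urban Livability" is ignored.
The infrastructure is fiddly. Chunking, embedding, and a vector database complicate the pipeline and are error-prone, so stale indexes and bad splits are common. Knowledge updates are slow as well: in fields where data changes quickly, RAG is slow to rebuild its index, so new knowledge is not available when it is needed.
Ask "why are polar bears declining" and RAG answers only "sea ice is melting", without mentioning the key issue of foraging. That is the gap RAG leaves.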
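2. Enter ReAG: moving past traditional retrieval
RAG has its problems, and ReAG takes a different route. It skips RAG's preprocessing pipeline and hands the raw material (text files, spreadsheets, URLs, and so on) directly to the language model.
The model does three things. It reads each document in full, with no chunking or embedding, so the document's context stays intact. It filters precisely, first judging whether the document is useful at all (relevance check) and then deciding which parts matter (content extraction). Finally it synthesizes an answer, combining the information the way a specialist would and finding connections even when the keywords do not match.
For example, asked "why are polar bears declining", ReAG can analyze a report titled "Thermodynamics of Sea Ice" and, even though the words "polar bear" never appear, pick out the key point that shrinking sea ice affects the bears' foraging, and answer from it.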
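3. How ReAG works
These advantages come from ReAG's processing flow, which breaks down into three steps.
- Ingest raw documents directly: whether the source is Markdown, a PDF, or a URL, ReAG uses it as-is with no preprocessing.
- Analyze documents in parallel: the LLM runs the relevance check and content extraction on every document at the same time.
- Synthesize the answer dynamically: irrelevant documents are dropped, and the answer is generated from the content that survived filtering.
The flow is short: there is no index to build or maintain, only documents, a model, and a query.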
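The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions: `stub_judge` is a hypothetical stand-in for the LLM call and uses plain word overlap so the example runs offline (a real judge would reason over the full text instead), and `Verdict` and `reag_context` are names invented here.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Verdict:
    is_irrelevant: bool
    content: str


def stub_judge(question: str, document: str) -> Verdict:
    """Hypothetical stand-in for an LLM: keeps sentences sharing a word with the question."""
    terms = {w.lower().strip("?.,") for w in question.split() if len(w) > 3}
    hits = [
        s.strip()
        for s in document.split(".")
        if terms & {w.lower().strip(",") for w in s.split()}
    ]
    return Verdict(is_irrelevant=not hits, content=". ".join(hits))


def reag_context(question, documents, judge=stub_judge, max_workers=4):
    # Step 1: documents arrive as raw text; nothing is chunked or embedded.
    # Step 2: every document is judged concurrently (pool.map keeps input order).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        verdicts = list(pool.map(lambda d: judge(question, d), documents))
    # Step 3: irrelevant documents are dropped; the rest feeds answer synthesis.
    return [v.content for v in verdicts if not v.is_irrelevant]


docs = [
    "Sea ice is shrinking each decade. Bears hunt seals from the ice",
    "The stock market closed higher today",
]
context = reag_context("Why are polar bears declining?", docs)
print(context)  # ['Bears hunt seals from the ice']
```

Swapping `stub_judge` for a function that calls a real model is the only change needed to turn this into an actual relevance check; the thread pool suits that case because the calls are I/O-bound.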
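4. ReAG's strengths and trade-offs
4.1 Strengths
- Fast on dynamic data: for constantly changing data such as live news or market feeds, ReAG can process the latest version immediately, with no re-embedding step.
- Good at complex queries: for hard questions such as how regulatory policy affects community banks, ReAG is better than RAG at uncovering indirect connections.
- Convenient multimodal analysis: charts, tables, and text can be analyzed together without extra preprocessing.
4.2 Weaknesses
- Higher cost: processing 100 documents means 100 LLM calls with ReAG, whereas a RAG vector search is far cheaper.
- Slow at scale: ReAG struggles with very large document collections, where a hybrid of RAG and ReAG works better.
ReAG has clear strengths and clear limits, so choose according to your needs.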
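To make the cost point concrete, here is a rough back-of-the-envelope estimate. The figures passed in (5,000 tokens per document and a price of 0.50 per million input tokens) are hypothetical placeholders, not real pricing, and `estimate_reag_cost` is a name made up for this sketch.

```python
def estimate_reag_cost(num_docs, tokens_per_doc, price_per_million_tokens):
    """Input-token cost of reading every document once (output tokens ignored)."""
    llm_calls = num_docs  # one relevance/extraction call per document
    total_tokens = num_docs * tokens_per_doc
    cost = total_tokens / 1_000_000 * price_per_million_tokens
    return llm_calls, total_tokens, cost


# Hypothetical inputs: 100 documents, 5,000 tokens each, 0.50 per million tokens.
calls, tokens, cost = estimate_reag_cost(100, 5_000, 0.50)
print(calls, tokens, cost)  # 100 500000 0.25
```

The number of calls and tokens grows linearly with the collection size and is paid again on every query, which is why the cost gap with a one-off embedding index widens as the corpus grows.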
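5. The ReAG tech stack
The stack used in the implementation below is as follows.
5.1 Components
(1) Groq + Llama-3.3-70B-Versatile
- Role: relevance assessment, the first-pass filter over documents.
- Strengths: fast inference at more than 500 tokens per second; 70 billion parameters for accurate scoring; a large 128,000-token context window.
- Example: it can recognize that "Thermodynamics of Sea Ice" is relevant to "polar bear decline" even though the two share no keywords.
(2) Ollama + DeepSeek-R1:14B
- Role: response synthesis, reasoning its way to the answer.
- Strengths: lightweight and cheap, suited to extraction and summarization; runs locally, which protects privacy and lowers cost; a 128,000-token window.
- Use: extracting key information from documents, such as data on how the foraging window changes during the ice-free season.
(3) LangChain
- Role: orchestrating the workflow and automating it.
- Features: runs the Groq and Ollama tasks in parallel; manages documents, handles errors, and aggregates outputs.
5.2 Advantages of this stack
- Reasonable cost: Groq handles the heavy tasks while Ollama handles the lightweight ones locally, which saves money.
- Scales well: Groq's LPUs can handle a large number of concurrent assessments.
- Flexible: models can be swapped without rewriting the pipeline.
For large documents of more than 50 pages, ReAG works best with an LLM that has a large context window.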
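Whether a document fits in the window can be checked roughly before sending it. The sketch below is an approximation under stated assumptions: the 128,000-token window is the figure quoted above, the 3,000 tokens reserved for output mirrors the generation cap used in section 6.2, and the ratio of 4 characters per token and 3,000 characters per page are crude heuristics, not properties of any particular tokenizer. `fits_context` is a hypothetical helper.

```python
def fits_context(text: str, context_window: int = 128_000,
                 reserved_for_output: int = 3_000,
                 chars_per_token: float = 4.0) -> bool:
    """Rough check: does the text fit the window, leaving room for the reply?"""
    estimated_tokens = len(text) / chars_per_token  # heuristic, not a real tokenizer
    return estimated_tokens <= context_window - reserved_for_output


fifty_pages = "x" * (50 * 3_000)         # assumes about 3,000 characters per page
two_hundred_pages = "x" * (200 * 3_000)
print(fits_context(fifty_pages))         # True
print(fits_context(two_hundred_pages))   # False
```

For an exact answer, count tokens with the tokenizer of the model actually being used; the heuristic is only good enough for deciding whether a document is anywhere near the limit.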
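6. Implementing ReAG in code
6.1 Install dependencies and download the data
# Run these in a Jupyter/Colab notebook cell (the leading "!" runs a shell command).
!pip install langchain langchain_groq langchain_ollama langchain_community pymupdf pypdf
!mkdir -p ./data
!mkdir -p ./chunk_caches
!wget "https://www.binasss.sa.cr/int23/8.pdf" -O "./data/fibromyalgia.pdf"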
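6.2 Set up the LLMs
import os

from langchain_groq import ChatGroq
from langchain_ollama import ChatOllama

# Never hard-code a real key in shared code; replace the placeholder or export
# GROQ_API_KEY in your shell before running.
os.environ.setdefault("GROQ_API_KEY", "<your-groq-api-key>")

# Hosted model used for the per-document relevance check.
llm_relevancy = ChatGroq(
    model="llama-3.3-70b-versatile",
    temperature=0,
)

# Local model used to synthesize the final answer.
llm = ChatOllama(
    model="deepseek-r1:14b",
    temperature=0.6,
    num_predict=3000,  # ChatOllama's name for the max-tokens limit
)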
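6.3 Define the system prompt
REAG_SYSTEM_PROMPT = """
# Role and objective
You are an intelligent knowledge retrieval assistant. Your task is to analyze the provided documents or URLs and extract the information most relevant to the user's query.
# Instructions
1. Analyze the user's query carefully to identify the key concepts and requirements.
2. Search the provided sources for relevant information and output the relevant parts in the "content" field.
3. If you cannot find the necessary information in the documents, return "is_irrelevant": true; otherwise return "is_irrelevant": false.
# Constraints
- Do not make assumptions beyond the available data
- State clearly if no relevant information was found
- Stay objective when selecting sources
"""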
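6.4 Define the RAG prompt
rag_prompt = """You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""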
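6.5 Define the response schema
from typing import List

from langchain_core.output_parsers import JsonOutputParser
from pydantic import BaseModel, Field


class ResponseSchema(BaseModel):
    content: str = Field(..., description="The page content of the document that is relevant or sufficient to answer the question asked")
    reasoning: str = Field(..., description="The reasoning for selecting the page content with respect to the question asked")
    is_irrelevant: bool = Field(..., description="Specify 'True' if the content in the document is not sufficient or relevant to answer the question asked; specify 'False' if the context or page content is relevant to answering the question")


class RelevancySchemaMessage(BaseModel):
    source: ResponseSchema


# JsonOutputParser returns a plain dict parsed from the model's JSON reply.
relevancy_parser = JsonOutputParser(pydantic_object=RelevancySchemaMessage)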
6.6 Load and process the input document
from langchain_community.document_loaders import PyMuPDFLoader

file_path = "./data/fibromyalgia.pdf"
loader = PyMuPDFLoader(file_path)
docs = loader.load()  # one Document per PDF page
print(len(docs))
print(docs[0].metadata)
Output:
8
{'producer': 'Acrobat Distiller 6.0 for Windows',
 'creator': 'Elsevier',
 'creationdate': '2023-01-20T09:25:19-06:00',
 'source': './data/fibromyalgia.pdf',
 'file_path': './data/fibromyalgia.pdf',
 'total_pages': 8,
 'format': 'PDF 1.7',
 'title': 'Fibromyalgia: Diagnosis and Management',
 'author': 'Bradford T. Winslow MD',
 'subject': 'American Family Physician, 107 (2023) 137-144',
 'keywords': '',
 'moddate': '2023-02-27T15:02:12+05:30',
 'trapped': '',
 'modDate': "D:20230227150212+05'30'",
 'creationDate': "D:20230120092519-06'00'",
 'page': 0}
6.7 Helper to format a document
from langchain_core.documents import Document


def format_doc(doc: Document) -> str:
    """Render one page as a titled text block for the relevance prompt."""
    return f"Document_Title: {doc.metadata['title']}\nPage: {doc.metadata['page']}\nContent: {doc.page_content}"
6.8 Helper to extract relevant context
from langchain_core.prompts import PromptTemplate  # used by generate_response in 6.9


def extract_relevant_context(question, documents):
    """Ask the relevancy model about each page and keep the relevant excerpts."""
    result = []
    for doc in documents:
        formatted_documents = format_doc(doc)
        system = f"{REAG_SYSTEM_PROMPT}\n\n# Available source\n\n{formatted_documents}"
        prompt = f"""Determine if the 'Available source' content supplied is sufficient and relevant to ANSWER the QUESTION asked.
QUESTION: {question}
#INSTRUCTIONS TO FOLLOW
1. Analyze the context provided thoroughly to check its relevancy to help formulate a response for the QUESTION asked.
2. STRICTLY PROVIDE THE RESPONSE IN A JSON STRUCTURE AS DESCRIBED BELOW:
```json
{{"content":<<The page content of the document that is relevant or sufficient to answer the question asked>>,
"reasoning":<<The reasoning for selecting The page content with respect to the question asked>>,
"is_irrelevant":<<Specify 'True' if the content in the document is not sufficient or relevant.Specify 'False' if the page content is sufficient to answer the QUESTION>>
}}
```
"""
        messages = [
            {"role": "system", "content": system},
            {"role": "user", "content": prompt},
        ]
        response = llm_relevancy.invoke(messages)
        print(response.content)
        formatted_response = relevancy_parser.parse(response.content)
        result.append(formatted_response)
    final_context = []
    for item in result:
        # The model may return the flag as a JSON boolean or as a string.
        if str(item["is_irrelevant"]).strip().lower() == "false":
            final_context.append(item["content"])
    return final_context


question = "What is Fibromyalgia?"
final_context = extract_relevant_context(question, docs)
print(len(final_context))
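The helper above depends on the model replying with parseable JSON and on the relevance flag arriving in a predictable form. As a hedged illustration of what that parsing involves, here is a standard-library sketch; `parse_relevancy_reply` is a hypothetical function written for this example, not part of LangChain, and treating a missing flag as "irrelevant" is a design choice made here.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built indirectly to keep this listing readable


def parse_relevancy_reply(raw: str) -> dict:
    """Pull the JSON object out of a model reply and coerce is_irrelevant to bool."""
    match = re.search(FENCE + r"(?:json)?\s*(.*?)" + FENCE, raw, re.DOTALL)
    payload = match.group(1) if match else raw  # fall back to the bare reply
    data = json.loads(payload)
    flag = data.get("is_irrelevant", True)  # assumption: no flag means "skip it"
    if isinstance(flag, str):
        flag = flag.strip().lower() == "true"
    data["is_irrelevant"] = bool(flag)
    return data


reply = (
    FENCE + "json\n"
    '{"content": "Fibromyalgia is a chronic pain condition.", '
    '"reasoning": "Defines the term.", "is_irrelevant": "False"}\n' + FENCE
)
parsed = parse_relevancy_reply(reply)
print(parsed["is_irrelevant"])  # False
```

Normalizing the flag once, at parse time, means the filtering loop can test a real boolean instead of comparing against several spellings.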
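6.9 Helper to generate the response
def generate_response(question, final_context):
    """Answer the question from the filtered context using the local model."""
    prompt = PromptTemplate(
        template=rag_prompt,
        input_variables=["question", "context"],
    )
    chain = prompt | llm
    response = chain.invoke({"question": question, "context": final_context})
    # Keep only the text after the last blank line, which drops the model's
    # preceding reasoning output.
    answer = response.content.split("\n\n")[-1]
    print(answer)
    return answer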
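Taking only the text after the last blank line is a blunt way to discard the model's reasoning, because it also discards every paragraph of the answer except the last. A possible alternative, assuming the model wraps its reasoning in `<think>...</think>` tags (as DeepSeek-R1 models commonly do when the reasoning is returned inline), is to remove just that block. `strip_reasoning` is a hypothetical helper for illustration.

```python
import re

THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def strip_reasoning(text: str) -> str:
    """Remove <think>...</think> blocks and trim the surrounding whitespace."""
    return THINK_BLOCK.sub("", text).strip()


raw = (
    "<think>\nThe user asks for a definition.\n</think>\n\n"
    "Fibromyalgia is a chronic pain condition.\n\n"
    "It often involves fatigue."
)
print(strip_reasoning(raw))      # keeps both answer paragraphs
print(raw.split("\n\n")[-1])     # keeps only: It often involves fatigue.
```

If the reply contains no `<think>` block, the text passes through unchanged apart from trimming, so the helper is safe to apply to models that do not emit reasoning.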
6.10 Generate the response
final_response = generate_response(question, final_context)
final_response
'Fibromyalgia is a chronic condition characterized by widespread musculoskeletal pain, fatigue, disrupted sleep, and cognitive difficulties like "fibrofog." It is often associated with heightened sensitivity to pain due to altered nervous system processing. Diagnosis considers symptoms such as long-term pain, fatigue, and sleep issues without underlying inflammation or injury.'
Output (extracted context):
['Duloxetine, milnacipran, pregabalin, and amitriptyline are potentially effective medications for fibromyalgia. Nonsteroidal anti-inflammatory drugs and opioids have not demonstrated benefits for fibromyalgia and have significant limitations.',
'Amitriptyline, cyclobenzaprine, duloxetine (Cymbalta), milnacipran (Savella), and pregabalin (Lyrica) are effective for pain in fibromyalgia.43,46-48,50,52,54',
'Amitriptyline (tricyclic antidepressant) - 5 to 10 mg at night, 20 to 30 mg at night. Cyclobenzaprine (muscle relaxant; tricyclic derivative) - 5 to 10 mg at night, 10 to 40 mg daily in 1 to 3 divided doses. Duloxetine (Cymbalta; serotonin-norepinephrine reuptake inhibitor) - 20 to 30 mg every morning, 60 mg every morning. Milnacipran (Savella; serotonin-norepinephrine reuptake inhibitor) - 12.5 mg every morning, 50 mg twice daily. Pregabalin (Lyrica; gabapentinoid) - 25 to 50 mg at bedtime, 150 to 450 mg at bedtime.',
'Fibromyalgia is often treated with medications such as pregabalin (Lyrica) and duloxetine (Cymbalta). Pregabalin can be started at a dose of 75 mg twice daily, with a maximum dose of 450 mg/day. Duloxetine can be initiated at a dose of 30 mg once daily, with a target dose of 60 mg/day.',
'Fibromyalgia is often treated with medications such as pregabalin (Lyrica) and duloxetine (Cymbalta). Pregabalin can be started at a dose of 75 mg twice daily, with a maximum dose of 450 mg/day. Duloxetine can be initiated at a dose of 30 mg once daily, with a target dose of 60 mg/day.']
Final response:
The medications commonly used to treat fibromyalgia include:
1. **Amitriptyline**: A tricyclic antidepressant typically taken at night in doses ranging from 5 to 30 mg.
2. **Cyclobenzaprine**: A muscle relaxant and tricyclic derivative, usually administered in doses up to 40 mg daily in divided doses.
3. **Duloxetine (Cymbalta)**: A serotonin-norepinephrine reuptake inhibitor taken in the morning, starting at 20-30 mg and increasing to 60 mg if needed.
4. **Milnacipran (Savella)**: Another serotonin-norepinephrine reuptake inhibitor, starting at 12.5 mg in the morning and potentially increased to 50 mg twice daily.
5. **Pregabalin (Lyrica)**: A gabapentinoid taken at bedtime, beginning with 75 mg twice daily and up to a maximum of 450 mg/day.
These medications are effective for managing pain associated with fibromyalgia. It's important to note that dosages should be adjusted under medical supervision, starting low and increasing as necessary. Additionally, NSAIDs and opioids are not recommended for treating fibromyalgia due to limited effectiveness and potential side effects.
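7. Where ReAG goes next
- Hybrid systems: use retrieval-augmented generation (RAG) for a first-pass filter, then reasoning-augmented generation (ReAG) for in-depth analysis.
- Cheaper models: open-source LLMs (such as DeepSeek) and quantization techniques will bring costs down.
- Larger context windows: future models may be able to handle documents containing a billion tokens, which would make ReAG more powerful still.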
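The hybrid idea in the first bullet can be sketched as a two-stage function. Everything here is illustrative: `cheap_score` is a hypothetical word-overlap stand-in for vector similarity, `counting_judge` stands in for the LLM relevance check, and `top_k` is an assumed tuning knob.

```python
def cheap_score(question: str, document: str) -> float:
    """Stand-in for vector similarity: fraction of question words found in the document."""
    q = set(question.lower().split())
    d = set(document.lower().split())
    return len(q & d) / len(q) if q else 0.0


def hybrid_retrieve(question, documents, judge, top_k=2):
    # Stage 1 (RAG-style): keep only the top_k documents by the cheap score.
    shortlist = sorted(
        documents, key=lambda doc: cheap_score(question, doc), reverse=True
    )[:top_k]
    # Stage 2 (ReAG-style): spend one expensive judge call per shortlisted document.
    return [doc for doc in shortlist if judge(question, doc)]


calls = []


def counting_judge(question, document):
    """Hypothetical judge that records each call, standing in for an LLM."""
    calls.append(document)
    return "ice" in document


docs = [
    "sea ice loss shortens the polar bear hunting season",
    "polar bear cubs stay with their mother for two years",
    "quarterly earnings beat analyst forecasts",
    "community banks face new reporting rules",
]
result = hybrid_retrieve("why are polar bear numbers falling", docs, counting_judge, top_k=2)
print(result)      # ['sea ice loss shortens the polar bear hunting season']
print(len(calls))  # 2
```

Only two of the four documents reach the expensive judge, which is the point of the hybrid: the prefilter bounds the number of LLM calls. The cost is that a relevant document with little surface overlap can be cut before the reasoning stage ever sees it.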